Posos data challenge

Main goal

Given a set of questions about diverse pharmacology problems, predict the sub-domain (category) of each question in order to clarify the intent of the query. Formally, this is a text classification problem in which the texts are short phrases and the categories are 52 mutually exclusive intents defined by professional pharmacologists.
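To make the single-label formulation concrete, here is a minimal baseline sketch: a bag-of-words multinomial Naive Bayes classifier written with the standard library only. It is not the model used in this repository. The toy questions, the integer labels, and the function names (`tokenize`, `train_naive_bayes`, `predict`) are all illustrative assumptions; the real task has 52 categories rather than 2.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Naive whitespace tokenizer; real preprocessing would be richer.
    return text.lower().split()


def train_naive_bayes(questions, labels):
    """Count words per category (one label per question)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(questions, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the single most likely category for a question."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                # Laplace-smoothed log likelihood
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy data: label 0 = dosage questions, label 1 = side effects.
questions = [
    "what is the dosage of paracetamol",
    "which dosage for a child",
    "what are the side effects of the vaccine",
    "side effects of ibuprofen",
]
labels = [0, 0, 1, 1]
model = train_naive_bayes(questions, labels)
print(predict(model, "dosage for a child"))
```

Because the categories are mutually exclusive, `predict` returns exactly one label per question rather than a set of labels.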

## Useful links:

TODO

  • extract the most representative words of each category
  • revisit the preprocessing (in particular the spelling correction)
  • refine the model using the extracted words (cosine matrix, then convolutions)
  • clean up the workflow
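
As a rough illustration of two of the items above (representative words per category, then a cosine matrix against those words), here is a sketch using only the standard library. The scoring rule (term frequency weighted by inverse category frequency), the toy questions, the category names, the 2-dimensional word vectors, and the function names are all assumptions made for this example, not the approach planned in this repository. The convolution step is left out.

```python
import math
from collections import Counter


def top_words_per_category(questions, labels, k=2):
    """Rank words by in-category frequency times log inverse category frequency."""
    tf = {}
    for text, label in zip(questions, labels):
        tf.setdefault(label, Counter()).update(text.lower().split())
    n_cat = len(tf)
    # Number of categories in which each word appears at least once.
    cat_freq = Counter(w for counts in tf.values() for w in counts)
    top = {}
    for label, counts in tf.items():
        scores = {w: c * math.log(n_cat / cat_freq[w]) for w, c in counts.items()}
        # Highest score first; ties broken alphabetically for determinism.
        top[label] = sorted(scores, key=lambda w: (-scores[w], w))[:k]
    return top


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cosine_matrix(tokens, keywords, embeddings):
    """Rows are question tokens, columns are keywords; unknown words get a zero vector."""
    dim = len(next(iter(embeddings.values())))
    zero = [0.0] * dim
    return [
        [cosine(embeddings.get(t, zero), embeddings.get(kw, zero)) for kw in keywords]
        for t in tokens
    ]


# Invented toy data; category names are hypothetical.
questions = [
    "daily dosage of paracetamol",
    "daily dosage of ibuprofen",
    "side effects of paracetamol",
    "side effects of ibuprofen",
]
labels = ["dosage", "dosage", "side_effects", "side_effects"]
top = top_words_per_category(questions, labels)
print(top)

# Hypothetical 2-d word vectors; real ones would come from a trained embedding.
embeddings = {"dosage": [1.0, 0.0], "dose": [0.9, 0.1], "effects": [0.0, 1.0]}
matrix = cosine_matrix(["dose", "effects"], ["dosage", "effects"], embeddings)
print(matrix)
```

Words that occur in every category (here "of", "paracetamol", "ibuprofen") score zero and drop out of the ranking, which is the point of the inverse category frequency weight. The resulting matrix has one row per question token and one column per keyword, a shape that a convolutional layer could consume.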