“What is 802.11?” #113

Closed
progval opened this issue Feb 12, 2015 · 5 comments
@progval

progval commented Feb 12, 2015

parsed as (?,definition,802.11)

http://askplatyp.us/?lang=en&q=What+is+802.11%3F

@Ezibenroc

The problem comes from the Stanford parser. It is even worse for “What is P=NP?”.
We could try to train it on such cases.
Another solution would be to do some preprocessing: if part of the sentence contains numbers or symbols such as =, handle it as if it were within quotation marks.

@yhamoudi

(Partially) fixed in https://github.com/ProjetPP/PPP-QuestionParsing-Grammatical/tree/reverse_predicates (but the underlying problem comes from the Stanford parser).

Possible solution for such cases: apply the algorithm already used for quotations:

  • identify "strange" entities (an entity = a sequence of characters without spaces; "strange" = it contains an unusual symbol such as ., =, ...). Examples: 802.11, P=NP, ...
  • replace each of them with a random string
  • parse the sentence with the Stanford parser
  • replace each random string with the original word

If someone wants to implement this, please do it in the reverse_predicates branch, in the file preprocessingMerge.py.
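
The four steps above can be sketched roughly as follows. This is only an illustration under assumptions, not the code in preprocessingMerge.py: the names `STRANGE`, `mask_entities`, `unmask` and `parse_with_masking` are hypothetical, the regular expression only covers the symbols . and =, and `parse` stands for any callable that wraps the Stanford parser and returns its output as a string.

```python
import random
import re
import string

# Assumption: a "strange" entity is a run of word characters joined by
# "." or "=" (e.g. 802.11, P=NP). Trailing punctuation such as "?" is excluded.
STRANGE = re.compile(r"\w+(?:[.=]\w+)+")


def mask_entities(sentence):
    """Replace each strange entity with a random alphabetic placeholder.

    Returns the masked sentence and a placeholder -> original mapping.
    """
    mapping = {}

    def substitute(match):
        # Draw placeholders until one is unused and absent from the input.
        while True:
            placeholder = "".join(random.choices(string.ascii_lowercase, k=12))
            if placeholder not in sentence and placeholder not in mapping:
                break
        mapping[placeholder] = match.group(0)
        return placeholder

    return STRANGE.sub(substitute, sentence), mapping


def unmask(text, mapping):
    """Put the original entities back in place of their placeholders."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


def parse_with_masking(sentence, parse):
    """Mask, parse, then restore. `parse` is a hypothetical str -> str callable."""
    masked, mapping = mask_entities(sentence)
    return unmask(parse(masked), mapping)
```

With this sketch, the parser would only ever see an ordinary-looking word in place of 802.11 or P=NP. If the parser output is a tree rather than a string, the restore step would have to walk the nodes instead of calling `str.replace`.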

@yhamoudi

See #118

@Ezibenroc

Ezibenroc commented Feb 15, 2015 via email

@yhamoudi

But this does not concern this branch. You should do one pull request per

I will remove the advice once the branch is merged. It is only there to prevent someone from implementing the solution in master first, which would cause a big merge conflict.
