Develop 9. Next Steps 2: Named Entity Recognition #13
Comments
Based on #34 (comment), revise project priority from 'Would like to have' to 'Dev task'.
Tested Stanford NER, forked @brandontlocke's batch script, and added a version that produces line-separated text files for person markup and location markup respectively: https://github.com/CatalogueLegacies/batchner. This should be sufficient for the NER module: perhaps showing how the data is made (seeing as the shell is beyond the scope of the lesson), then putting the data back into AntConc to examine word use around marked-up entities.
(NB: I ran out of talent when writing |
rough outline
On 3., @rossi-uk try:
The point of the episode is using AntConc to work with marked-up text and/or third-party tools.
Info on better NER: https://github.com/impresso/named-entity-tutorial-dh2019/tree/master/slides
@rossi-uk I had a run through this afternoon and we are more or less there with this episode! It just needs a second task (which I think you said you'd make a start on, but do correct me if I'm wrong!) and a run-through (because you know my ability to introduce stupid errors!)
@rossi-uk We have an action to add a second task. Do we still want to do that, or shall we close without it?
I was thinking of a task around identifying mentions of women: for example, searching for "Lady xxx", working out the percentage of total references to women that the NER recognises, and considering the omissions. AntConc is struggling to process queries on my machine.
Omissions by NER. Note the NER patterns: it recognises names when a title is followed by an initial, e.g. Miss C./PERSON Frere/PERSON; Mrs H.C./PERSON Noyce/PERSON
Scarcity of adjectives with names?
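A rough sketch of how the "percentage recognised" figure for this task could be worked out outside AntConc. It assumes the tagged text is in the token/TAG form quoted above, and it counts a title as "recognised" only when the very next token is tagged PERSON. The title list, the function names, and the sample line (including "Lady Grey") are made up for illustration and are not taken from the dataset or from batchner.

```python
# Hypothetical title list for the task; extend as needed.
TITLES = ("Lady", "Miss", "Mrs")


def split_token(tagged):
    """Split 'Frere/PERSON' into ('Frere', 'PERSON'); untagged tokens get 'O'."""
    token, sep, tag = tagged.rpartition("/")
    return (token, tag) if sep else (tagged, "O")


def title_coverage(tagged_text, titles=TITLES):
    """Return (mentions, recognised) for title words in slash-tagged text."""
    pairs = [split_token(t) for t in tagged_text.split()]
    mentions = 0
    recognised = 0
    for i, (token, _tag) in enumerate(pairs):
        if token.rstrip(".") in titles:
            mentions += 1
            # Recognised here means the next token carries the PERSON tag.
            if i + 1 < len(pairs) and pairs[i + 1][1] == "PERSON":
                recognised += 1
    return mentions, recognised


# Invented sample line, not from the catalogue data.
sample = (
    "Miss/O C./PERSON Frere/PERSON and/O Lady/O Grey/O "
    "met/O Mrs/O H.C./PERSON Noyce/PERSON"
)
mentions, recognised = title_coverage(sample)
share = 100 * recognised / mentions if mentions else 0.0
print(f"{recognised} of {mentions} title mentions tagged ({share:.1f}%)")
```

The untagged cases (here the invented "Lady Grey") are the omissions the task would ask learners to look at.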
https://github.com/CatalogueLegacies/antconc.github.io/blob/gh-pages/_episodes/10-named-entity-recognition.md
Based on the rough outline at #13 (comment), tasks are:
sh SCRIPT. Output 'DATASET_people.txt' and 'DATASET_places.txt'
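For anyone who wants to see roughly what the script step amounts to without reading shell, here is a minimal Python sketch. It is not the batchner script: it assumes the NER output is slash-tagged text (token/TAG, as in the Miss C./PERSON Frere/PERSON examples in this thread), joins consecutive tokens with the same tag into one entity, and writes one entity per line. The function names and the sample sentence are hypothetical.

```python
from itertools import groupby
from pathlib import Path


def split_token(tagged):
    """Split 'Frere/PERSON' into ('Frere', 'PERSON'); untagged tokens get 'O'."""
    token, sep, tag = tagged.rpartition("/")
    return (token, tag) if sep else (tagged, "O")


def extract_entities(tagged_text, wanted_tag):
    """Join each run of consecutive tokens tagged wanted_tag into one entity."""
    pairs = [split_token(t) for t in tagged_text.split()]
    entities = []
    for tag, run in groupby(pairs, key=lambda pair: pair[1]):
        if tag == wanted_tag:
            entities.append(" ".join(token for token, _ in run))
    return entities


def write_lines(path, entities):
    """Write one entity per line, ready to load into AntConc."""
    Path(path).write_text("\n".join(entities) + "\n", encoding="utf-8")


# Invented sample sentence, not from the catalogue data.
sample = "Miss/O C./PERSON Frere/PERSON wrote/O from/O Cape/LOCATION Town/LOCATION ./O"
people = extract_entities(sample, "PERSON")
places = extract_entities(sample, "LOCATION")
print(people, places)
```

Calling `write_lines("DATASET_people.txt", people)` and `write_lines("DATASET_places.txt", places)` would then give the two line-separated files named in the task.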