This repository has been archived by the owner on Oct 5, 2021. It is now read-only.

can't use multiple tools with same tokenization #44

Open
fmof opened this issue Jun 4, 2014 · 0 comments

Comments


fmof commented Jun 4, 2014

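Right now it's not possible to run several tools of the same kind (e.g., two POS taggers) on the same tokenization, because each of the fields below holds at most one annotation. Specifically, the following fields need to be repeatable: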

  6: optional TokenTagging posTagList
  7: optional TokenTagging nerTagList
  8: optional TokenTagging lemmaList
  9: optional TokenTagging langIdList

  10: optional Parse parse
  11: optional list<DependencyParse> dependencyParseList

One option is to make fields 6-10 optional lists. The disadvantage is a number of breaking changes, since existing code expects a single value in each of these fields.
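
To make the breaking change concrete, here is a rough Python sketch of what happens to a caller if field 6 becomes a list. The classes are hypothetical stand-ins, not the generated Thrift code: `TokenTagging` is reduced to a `tool` name and a `tags` list (both assumed for illustration), and `TokenizationToday` / `TokenizationOption1` are made-up names for the struct before and after the change.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TokenTagging:
    # Hypothetical stand-in: which tool produced the tags, plus the tags.
    tool: str
    tags: List[str] = field(default_factory=list)


@dataclass
class TokenizationToday:
    # Field 6 as it is now: at most one TokenTagging.
    posTagList: Optional[TokenTagging] = None


@dataclass
class TokenizationOption1:
    # Field 6 under option 1: a list of TokenTaggings.
    posTagList: Optional[List[TokenTagging]] = None


def pos_tags_today(tok) -> List[str]:
    # Existing caller: assumes posTagList is a single TokenTagging.
    return tok.posTagList.tags if tok.posTagList else []


def pos_tags_option1(tok: TokenizationOption1, tool: str) -> List[str]:
    # Caller after the change: has to choose a tagging, here by tool name.
    for tagging in tok.posTagList or []:
        if tagging.tool == tool:
            return tagging.tags
    return []
```

Calling `pos_tags_today` on a `TokenizationOption1` with a non-empty list raises `AttributeError`, which is the kind of breakage every existing reader of fields 6-10 would hit.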

Alternatively, we could add fields 13-19* to represent "secondary" annotations. E.g.,

  13: optional list<TokenTagging> alternativePosTagList
  14: optional list<TokenTagging> alternativeNerTagList
  ....

(*Note that because dependencyParseList is a list, we don't need alternatives, though for consistency, we might want it.)
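
For comparison, a rough Python sketch of how a caller might read the "secondary" annotations under this second option. Again these are hypothetical stand-ins rather than the generated Thrift classes: `TokenTagging` is reduced to an assumed `tool` name and `tags` list, and `all_pos_taggings` is a made-up helper. Only `posTagList` (field 6) and `alternativePosTagList` (proposed field 13) come from the lists above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TokenTagging:
    # Hypothetical stand-in: which tool produced the tags, plus the tags.
    tool: str
    tags: List[str] = field(default_factory=list)


@dataclass
class Tokenization:
    # Field 6 stays a single TokenTagging, so existing callers keep working.
    posTagList: Optional[TokenTagging] = None
    # Proposed field 13: extra taggings from other tools.
    alternativePosTagList: Optional[List[TokenTagging]] = None


def all_pos_taggings(tok: Tokenization) -> List[TokenTagging]:
    # Primary tagging first (if set), then any alternatives, in order.
    primary = [tok.posTagList] if tok.posTagList is not None else []
    return primary + list(tok.alternativePosTagList or [])
```

Old code that only looks at `posTagList` is unaffected; the cost is that new code has to check two fields per annotation type, which is what a helper like the one above would paper over.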

@vandurme @maxthomas @charman
