Given the attached file issue77.xml.txt, ucto creates invalid FoLiA: UIT.xml.txt
The command was: ucto --passthru issue77.xml UIT.xml
>foliavalidator UIT.xml
VALIDATION ERROR on full parse by library (stage 2/3), in UIT.xml
ParseError: FoLiA exception in handling of <s> @ line 47 (in parent <p> @ parent line 44) : [DeclarationError] Processor ucto.1 is used for annotationtype SENTENCE, set None, but has no corresponding <annotator> referring to it from the annotations declaration block!
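To make the error above concrete, here is a minimal sketch of the kind of consistency check that fails: every `processor` referenced on an `<s>` element should also appear in an `<annotator>` inside a `<sentence-annotation>` declaration. The function name `undeclared_sentence_processors`, the processor id `input.1`, and the stripped-down sample document are assumptions for illustration only. This is not how foliavalidator is implemented, and the check ignores sets and all other annotation types.

```python
import xml.etree.ElementTree as ET

# FoLiA XML namespace, in ElementTree's "{uri}tag" notation.
NS = "{http://ilk.uvt.nl/folia}"


def undeclared_sentence_processors(folia_xml):
    """Return processor ids used on <s> elements but not declared
    by any <annotator> inside a <sentence-annotation> declaration."""
    root = ET.fromstring(folia_xml)
    declared = {
        annotator.get("processor")
        for declaration in root.iter(NS + "sentence-annotation")
        for annotator in declaration.iter(NS + "annotator")
    }
    used = {
        s.get("processor")
        for s in root.iter(NS + "s")
        if s.get("processor") is not None
    }
    return used - declared


# Hypothetical, heavily reduced fragment (not a complete, valid FoLiA document):
# the declaration only knows "input.1", but the sentence refers to "ucto.1".
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0">
  <metadata>
    <annotations>
      <sentence-annotation>
        <annotator processor="input.1"/>
      </sentence-annotation>
    </annotations>
  </metadata>
  <text>
    <p>
      <s processor="ucto.1"/>
    </p>
  </text>
</FoLiA>"""

print(undeclared_sentence_processors(SAMPLE))
```

On the sample, the check reports `ucto.1`, which mirrors the DeclarationError: the sentence refers to a processor that the declaration block does not list for sentence annotation.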
When using --passthru, it may not be correct for ucto to assign a Sentence and Words to the second paragraph. @proycon what should --passthru do here? The documentation states: "Don't tokenize, but perform input decoding and simple token role detection"
But a similar problem arises when we use ucto -Lnld issue77.xml UIT.xml
In that case ucto creates a new sentence with processor ucto.1 but uses the old sentence-annotation from the input. It should add an extra sentence-annotation referring to ucto.1.
If the answer to 1. is 'OK, just add a sentence and a word', then the same would hold when using the "passthru" set.
Point 2 is (for now) resolved by 'adopting' the already present annotations. This produces correct FoLiA, but the question remains whether this is the best solution.
Maybe we should reject such input. But there are use cases where annotations are defined (and sometimes NOT used at all).
We could also make ucto assign its own segmentation set in such cases. But this also has some troublesome consequences.
For now I suggest sticking with this half-baked solution, although I feel a bit worried about it.
SIDENOTE: folialint doesn't complain; added as LanguageMachines/libfolia#42
issue77.xml.txt
UIT.xml.txt