Data ES-CT #269

Closed
wants to merge 13 commits into from

marilinapisani

Added samples to folder

@matyaskopp matyaskopp changed the base branch from main to data June 30, 2022 07:51
@matyaskopp matyaskopp changed the title Data Data ES-CT Jun 30, 2022
@matyaskopp
Collaborator

@marilinapisani, thanks for the pull request.
I have changed the target branch to 'data'. You are supposed to make pull requests into the data branch.

If you need any explanation of the errors in the output (https://github.com/clarin-eric/ParlaMint/runs/7127336080?check_suite_focus=true#step:4:27), don't hesitate to ask. Your questions can help me improve the output messages.

@marilinapisani
Author

@matyaskopp, thanks for your comment. For the next pull, where am I supposed to change the target branch? I followed the instructions in the Contributing file. Sorry if this is a basic question, this is my first commit.

Regarding the error logs, I am reviewing them and so far they seem clear to me.

@matyaskopp
Collaborator

> For the next pull, where am I supposed to change the target branch?

Yes, the target branch has to be data.
For this pull request, though, you can just push new commits to the data branch of your repository; they will be synchronized and validated here automatically.

@marilinapisani
Author

perfect!

@marilinapisani
Author

marilinapisani commented Jul 4, 2022

@matyaskopp I do have doubts regarding this error message: "New document or paragraph starts when the last token of the previous sentence says SpaceAfter=No." Could you kindly clarify it for me? Thanks!

@matyaskopp
Collaborator

> @matyaskopp I do have doubts regarding this error message: "New document or paragraph starts when the last token of the previous sentence says SpaceAfter=No." Could you kindly clarify it for me? Thanks!

This error is reported by the Universal Dependencies validator, https://github.com/UniversalDependencies/tools/blob/master/validate.py. The report cites the wrong sentence id:

https://github.com/clarin-eric/ParlaMint/runs/7127336080?check_suite_focus=true#step:4:189

[Line 217 Sent ParlaMint-ES-CT_2015-11-09.u.1.4.1]: [L2 Metadata spaceafter-newdocpar] New document or paragraph starts when the last token of the previous sentence says SpaceAfter=No.

The issue is in the following sentence (ParlaMint-ES-CT_2015-11-09.u.1.4.2):

    <pc ana="mte:PUNCT" join="right" msd="UPosTag=PUNCT" xml:id="ParlaMint-ES-CT_2015-11-09.u.1.4.2.92">.</pc>
    <linkGrp targFunc="head argument" type="UD-SYN"> ... </linkGrp>
    </s>
</seg>

The last token in the sentence (ParlaMint-ES-CT_2015-11-09.u.1.4.2.92) contains join="right", which is not allowed there because paragraphs must be separated.

All last paragraph tokens in your data seem to contain join="right".
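
To locate every affected token before pushing again, a check along these lines might help. It is only a minimal sketch, not part of the ParlaMint validation scripts: the function name `find_glued_paragraph_ends` and the sample ids are made up for illustration, and it assumes that each paragraph is a `<seg>` whose tokens are `<w>` or `<pc>` elements, as in the fragment above. Nested tokens are not handled specially.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under the XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def local_name(tag):
    """Strip a '{namespace}' prefix so the check works with or without the TEI namespace."""
    return tag.rsplit("}", 1)[-1]


def find_glued_paragraph_ends(xml_text):
    """Return the xml:id of every <seg>-final token that carries join="right"."""
    root = ET.fromstring(xml_text)
    offenders = []
    for seg in root.iter():
        if local_name(seg.tag) != "seg":
            continue
        # Tokens in document order; the last one closes the paragraph.
        tokens = [el for el in seg.iter() if local_name(el.tag) in ("w", "pc")]
        if tokens and tokens[-1].get("join") == "right":
            offenders.append(tokens[-1].get(XML_ID))
    return offenders


# Hypothetical sample: the first <seg> ends with a glued token, the second does not.
SAMPLE = """<div xmlns="http://www.tei-c.org/ns/1.0">
  <seg xml:id="demo.seg1">
    <s xml:id="demo.seg1.1">
      <w xml:id="demo.seg1.1.1">Hola</w>
      <pc xml:id="demo.seg1.1.2" join="right">.</pc>
    </s>
  </seg>
  <seg xml:id="demo.seg2">
    <s xml:id="demo.seg2.1">
      <w xml:id="demo.seg2.1.1" join="right">Si</w>
      <pc xml:id="demo.seg2.1.2">.</pc>
    </s>
  </seg>
</div>"""

print(find_glued_paragraph_ends(SAMPLE))  # ['demo.seg1.1.2']
```

A join="right" in the middle of a paragraph (as on the first token of the second `<seg>`) is left alone; only the paragraph-final token is flagged.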

@marilinapisani
Author

That's very helpful, thanks!

@TomazErjavec
Collaborator

Can this pull request be closed (without merging)?

@matyaskopp matyaskopp closed this Aug 27, 2023