Crossref export missing author information if givenName or familyName missing #6863

Closed
NateWr opened this issue Mar 15, 2021 · 17 comments
Labels: Bug:1:Low (A bug that does not have a severe consequence or affects a small number of users.); Try Me (This issue might be good for a new contributor. Can you help us?)

@NateWr
Contributor

NateWr commented Mar 15, 2021

Describe the bug
The Crossref export filter will not include information about an author if the givenName or familyName is not present (see this line). However, only the familyName is required at the moment and not everyone has more than one name. If no givenName exists, the author's details will not be included.

To Reproduce
Steps to reproduce the behavior:

  1. Create and publish an article with an author that has only a familyName.
  2. Check the crossref export XML.

What application are you using?
OJS 3.3.0.x

Additional information
See the discussion about the Crossref schema.

@NateWr NateWr added the Bug:1:Low A bug that does not have a severe consequence or affects a small number of users. label Mar 15, 2021
@NateWr NateWr added this to the OJS/OMP/OPS 3.3.0-5 milestone Mar 15, 2021
@AhemNason

AhemNason commented Mar 23, 2021

Just a note that this fails specifically because of the empty <given_name> element. If you omit the element instead of writing it as an empty, self-closed tag, the XML will validate.

Invalid against Crossref 4.4.2 Schema:

          <person_name contributor_role="author" sequence="additional" language="en">
            <given_name/>
            <surname>ZHANG</surname>
          </person_name>

Valid against Crossref 4.4.2 Schema:

          <person_name contributor_role="author" sequence="additional" language="en">
            <surname>ZHANG</surname>
          </person_name>
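
The difference between the two snippets above can be checked mechanically. The following is a minimal sketch, not part of OJS or the Crossref tooling: the helper names `local_name` and `find_empty_name_parts` are hypothetical, and the check only looks for empty children of `person_name`, it does not validate against the Crossref schema.

```python
import xml.etree.ElementTree as ET
from typing import List


def local_name(tag: str) -> str:
    """Drop a '{namespace}' prefix so the check works with or without xmlns."""
    return tag.rsplit("}", 1)[-1]


def find_empty_name_parts(xml_text: str) -> List[str]:
    """Return the local names of empty children found inside person_name elements."""
    root = ET.fromstring(xml_text)
    empty = []
    for element in root.iter():
        if local_name(element.tag) != "person_name":
            continue
        for child in element:
            has_text = bool((child.text or "").strip())
            # An element with no text and no children is what trips validation here.
            if not has_text and len(child) == 0:
                empty.append(local_name(child.tag))
    return empty


invalid = """<contributors>
  <person_name contributor_role="author" sequence="additional" language="en">
    <given_name/>
    <surname>ZHANG</surname>
  </person_name>
</contributors>"""

valid = """<contributors>
  <person_name contributor_role="author" sequence="additional" language="en">
    <surname>ZHANG</surname>
  </person_name>
</contributors>"""

print(find_empty_name_parts(invalid))  # ['given_name']
print(find_empty_name_parts(valid))    # []
```

A check like this could be run on an export file before depositing it, to spot the authors whose names need attention.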

The justification in their schema documentation is as follows:

The surname of an author or editor. The surname, combined with <given_name>, forms the name of an author or editor. Whenever possible, the given name should not be included in the surname element. In cases where the given name is not clear, as may happen with non-Western names or some societies in which surnames are not distinguished, you may place the entire name in surname, e.g.: <surname>Leonardo da Vinci</surname>

If an author is an organization, you should use organization, not surname. Suffixes should be tagged with suffix. Author degrees (e.g. M.D., Ph.D.) should not be included in Crossref submissions.

Additionally, if the user somehow filled out only <given_name>, the export will fail for similar reasons: in the Crossref schema you cannot have a given name with no surname. Crossref does have an <alt-name> element, but I don't know whether I would consider it a 1:1 match for our "preferred name" field.

If I had to make a recommendation, it would be the following:

  1. closed tags shouldn't be written to the Crossref xml export.
  2. if only the preferred name is available, write to <surname> element
  3. if only the "given name" is available, write to <surname> element
  4. if we have preferred name, and full name metadata, write preferred name to <surname> element and full name, concatenated first last in the <alt-name> element.
  5. create toggle for corporate authorship on the preferred name field that replaces the <person_name> parent and child elements with <organization>. The attributes used in <person_name> would be identical.
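
Recommendations 1 to 3 could look roughly like the sketch below. This is an illustration in Python, not the actual OJS export filter (which is PHP); the function name `build_person_name` and its parameters are hypothetical, and points 4 and 5 (<alt-name> and <organization>) are deliberately left out.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def build_person_name(given: Optional[str], family: Optional[str],
                      sequence: str = "first") -> Optional[ET.Element]:
    """Build a person_name element without ever emitting an empty child."""
    given = (given or "").strip()
    family = (family or "").strip()
    if not given and not family:
        # No usable name at all: the caller should skip this author.
        return None

    person = ET.Element(
        "person_name",
        {"contributor_role": "author", "sequence": sequence},
    )
    if given and family:
        ET.SubElement(person, "given_name").text = given
        ET.SubElement(person, "surname").text = family
    else:
        # Only one name part exists: it goes into <surname>,
        # because a given_name without a surname is not allowed.
        ET.SubElement(person, "surname").text = family or given
    return person


only_family = build_person_name(None, "ZHANG", sequence="additional")
print(ET.tostring(only_family, encoding="unicode"))
```

Returning `None` for an author with no name at all leaves the decision about the surrounding element to the caller.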

@asmecher asmecher modified the milestones: 3.3.0-8, 3.3.0-9 Aug 26, 2021
@NateWr NateWr added the Try Me This issue might be good for a new contributor. Can you help us? label Jan 20, 2022
@NateWr
Contributor Author

NateWr commented Jan 20, 2022

See also #7528.

@mpbraendle
Contributor

The Crossref schema defines an anonymous element which can be used in such case.

@AhemNason

> The Crossref schema defines an anonymous element which can be used in such case.

Please see #5955

@asmecher asmecher modified the milestones: 3.3.0-9, 3.3.0-10 Mar 1, 2022
@NateWr NateWr moved this to Backlog in Metadata and Distribution May 9, 2022
@NateWr NateWr moved this from Backlog to Todo in Metadata and Distribution May 9, 2022
@asmecher asmecher modified the milestones: 3.3.0-11, 3.3.0-12 Jun 7, 2022
@bozana
Collaborator

bozana commented Aug 24, 2022

I think this issue is not relevant any more:
The given name (at least) in the submission locale must exist. If there is no family name, the given name is exported in the surname element (there is no empty given_name element any more).
Thus, closing...

@bozana bozana closed this as completed Aug 24, 2022
Repository owner moved this from Todo to Done in Metadata and Distribution Aug 24, 2022
@bozana
Collaborator

bozana commented Oct 19, 2022

@AhemNason, because your comment is not shown here, I will post it:

Hey there, I was just revisiting this. I wanted to flag that even though a given name is required by OJS, this isn't the case if someone has enabled the plugin that allows you to toggle off name/email metadata. It also is an issue if someone imports their content to OJS via article import. You can force metadata into OJS via CLI that wouldn't be allowed in the UI.

If there's no author, OJS should always just remove the piece as stated. I understand it's not likely to occur but it definitely still occurs enough that I'm writing a guide for how to fix the issue.

@bozana bozana reopened this Oct 19, 2022
Repository owner moved this from Done to Todo in Metadata and Distribution Oct 19, 2022
@bozana
Collaborator

bozana commented Oct 19, 2022

Actually, those plugins are the places where the problem should be fixed, i.e.:

  1. Which plugin allows you to toggle off name/email metadata? That plugin should make sure that the system logic is not broken (i.e. that Crossref and all other exports function properly in those cases). Could you please file an issue there?
  2. Also, importing submissions without the author's given name in the submission locale should not be possible. I think this is a bug. Could you double-check that, or find out which plugin is involved, and file an issue for it? (I will also test the native import plugin, but according to the schema this should not be possible...)

> If there's no author, OJS should always just remove the piece as stated.

How is it possible that there is no author? I think the feature is still not implemented, and once it is implemented all places should/will be considered...

But if there are so many support cases, we can eventually hack the Crossref plugin, although I do not like it (also, what about the other parts of the system that distribute the data?) :-)

Can you please tell me what this corrupt data looks like? What exactly is there and what is missing?

@bozana
Collaborator

bozana commented Oct 19, 2022

Regarding the 5 points from above:

> 1. closed tags shouldn't be written to the Crossref xml export.

It is easier said than done. For example, we assume that certain metadata are present because they are required, e.g. the given name in the submission locale (or the title, etc.), and we do not check whether each of them exists. So we would always need to check that the value is there and have a fallback if it is not. We will now do this for the authors' names, because there seems to be a lot of corrupt data there...

> 2. if only the preferred name is available, write to <surname> element

Can there be only the preferred name? How is this possible?

> 3. if only the "given name" is available, write to <surname> element

Yes, this is already done.

> 4. if we have preferred name, and full name metadata, write preferred name to <surname> element and full name, concatenated first last in the <alt-name> element.

I would like to leave this for later and consider it in a new issue... Here I would only like to try to fix the problems coming from the corrupt data...

> 5. create toggle for corporate authorship on the preferred name field that replaces the <person_name> parent and child elements with <organization>. The attributes used in <person_name> would be identical.

This should be done in a separate issue, here probably: #5955

@NateWr
Contributor Author

NateWr commented Oct 19, 2022

The plugin that @AhemNason mentions is authorRequirements, written by @ewhanson, and I believe it only makes the email optional at this time.

@AhemNason

AhemNason commented Oct 19, 2022

Sorry for all this, folks. I was working on documentation about this error for hosting support, remembered this ticket, and wrote this reply. I deleted it shortly thereafter because I didn't think it made sense in this ticket given all the stuff that's been sorted out since. I probably should have just written a "never mind" comment afterwards. I sort of forgot about email (the dream).

I do, though, think it would be a mistake to "fix" the ability for people to put empty author tags up via CLI import. It would certainly make the lives of the folks on the hosting team accommodating migrations much worse. The fact remains that although author is a required field for us, that is not the case in other places where the author is not so expected/tethered to the publication. I made that case pretty well in #5955 , I think.

Michael & Co. pretty regularly get XML (or CSV) from journals migrating to OJS, and having to ask them to fill out metadata that doesn't exist for issues that are long since published is problematic.

In any case, you can see an example of the issue in the docs above. Basically, empty author fields result in:

<contributors />

An empty tag isn't facet valid in Crossref's schema. I tell clients to manually delete these empty fields and deposit the XML via the Crossref dashboard.
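
The manual cleanup described above could also be scripted. The following is a rough sketch under several assumptions: the function name `strip_empty_contributors` is hypothetical, the sample XML is a simplified fragment rather than a full Crossref deposit, and a real export carries a default namespace, so the serialized output would need namespace handling (for example `ET.register_namespace`) before depositing.

```python
import xml.etree.ElementTree as ET


def strip_empty_contributors(xml_text: str) -> str:
    """Remove contributors elements that have no children and no text."""
    root = ET.fromstring(xml_text)
    # Materialize the element list first so removals do not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            tag = child.tag.rsplit("}", 1)[-1]  # ignore any namespace prefix
            is_empty = len(child) == 0 and not (child.text or "").strip()
            if tag == "contributors" and is_empty:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


sample = """<journal_article publication_type="full_article">
  <titles><title>Example</title></titles>
  <contributors />
  <publication_date><year>2022</year></publication_date>
</journal_article>"""

cleaned = strip_empty_contributors(sample)
print(cleaned)
```

Populated `contributors` elements are left untouched, so the function can be applied to a whole export file rather than to single articles.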

@bozana
Collaborator

bozana commented Oct 19, 2022

@AhemNason, I can adapt the Crossref export so that no 'contributors' element is created if there are no authors. Would that solve the problem for now?

@AhemNason

AhemNason commented Oct 19, 2022 via email

@ewhanson
Collaborator

> The plugin that @AhemNason mentions is authorRequirements, written by @ewhanson, and I believe it only makes the email optional at this time.

Yes, this is the case.

@bozana
Collaborator

bozana commented Oct 19, 2022

Thanks a lot everyone!
I will then only make the change that checks whether there is any author and, if not, skips creating the contributors element...
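
The change described here amounts to a guard around the element creation. A minimal sketch of that idea follows, written in Python for illustration only: the real export filter is PHP, and the function name `add_contributors` and its list-of-surnames input are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from typing import List


def add_contributors(article: ET.Element, surnames: List[str]) -> bool:
    """Append a contributors element only if at least one usable name exists.

    Returns True when the element was created, False when it was skipped.
    """
    names = [s.strip() for s in surnames if s and s.strip()]
    if not names:
        # No authors: create nothing, so no empty <contributors /> is exported.
        return False

    contributors = ET.SubElement(article, "contributors")
    for index, name in enumerate(names):
        person = ET.SubElement(
            contributors,
            "person_name",
            {
                "contributor_role": "author",
                "sequence": "first" if index == 0 else "additional",
            },
        )
        ET.SubElement(person, "surname").text = name
    return True


article = ET.Element("journal_article")
print(add_contributors(article, []))         # False
print(add_contributors(article, ["ZHANG"]))  # True
```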

@bozana
Collaborator

bozana commented Oct 26, 2022

I opened a new issue, see #8372, so that the title better fits the actual problem...
and will close this issue...

@bozana bozana changed the title from "Crossref export missing author data if no given name is specified" to "https://github.com/pkp/pkp-lib/issues/8372" Oct 26, 2022
@asmecher
Member

@bozana, should this issue be closed?

@bozana
Collaborator

bozana commented Nov 11, 2022

Oh, yes, I apparently forgot... :-) Thanks!!!

@bozana bozana closed this as completed Nov 11, 2022
Repository owner moved this from Todo to Done in Metadata and Distribution Nov 11, 2022
@NateWr NateWr changed the title from "https://github.com/pkp/pkp-lib/issues/8372" to "Crossref export missing author information if givenName or familyName missing" Nov 14, 2022