Feature/oai #30

Open
wants to merge 10 commits into
base: main

@J4bbi (Collaborator) commented Feb 7, 2025

Fix custom fields

Some custom fields were incorrectly validated as MEx identifiers, for example license, which expects a URI. These fields no longer apply any validation.

A unit test has been added to specifically test custom fields.

Import data

The configuration variable IMPORT_LOG_FILE designates the path for the logs generated by imports. The configuration variable IMPORT_LOG_FORMAT configures the log format.
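
As a minimal sketch of how these two settings might be consumed, assuming the import routine relies on Python's standard logging module: the helper build_import_logger, the logger name, and the example values below are hypothetical and not taken from this PR.

```python
import logging

# Hypothetical example values; the real defaults are not stated in this PR.
IMPORT_LOG_FILE = "import.log"
IMPORT_LOG_FORMAT = "%(asctime)s %(levelname)s %(message)s"


def build_import_logger(log_file: str, log_format: str) -> logging.Logger:
    """Return a logger that writes import messages to log_file using log_format."""
    logger = logging.getLogger("mex_import_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    # Drop handlers from earlier calls so messages are not written twice.
    logger.handlers.clear()
    handler = logging.FileHandler(log_file, encoding="utf-8")
    handler.setFormatter(logging.Formatter(log_format))
    logger.addHandler(handler)
    return logger
```

An import command could then call build_import_logger(IMPORT_LOG_FILE, IMPORT_LOG_FORMAT) once and log through the returned logger.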

While working on the OAI crosswalk, it became apparent that the original placeholder title, built from the resource type and identifier ({resource_type}_{identifier}), would be inconsistent for OAI. Instead, the title is taken from a MEx property of the entity:

| MEx entity | MEx property | mandatory |
| --- | --- | --- |
| access-platform | title | |
| activity | title | ✔️ |
| bibliographic-resource | title | ✔️ |
| contact-point | email | ✔️ |
| distribution | title | ✔️ |
| organization | officialName | ✔️ |
| organizational-unit | name | ✔️ |
| person | fullName, familyName | |
| resource | title | ✔️ |
| variable | label | ✔️ |
| variable-group | label | ✔️ |

The configuration variable RECORD_METADATA_TITLE_PROPERTIES is a list of MEx properties that are used for the Datacite title if they exist.

The configuration variable RECORD_METADATA_DEFAULT_TITLE is used for the Datacite title property if a title value cannot be determined, that is, if the record is a person with neither fullName nor familyName, or an access-platform without a title.
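
The fallback described above could look roughly like the following sketch. The function resolve_title, the order of the property list, and the assumed value shapes (plain strings, lists, or objects with a "value" key) are illustrative assumptions, not the code in this PR.

```python
from typing import Any, Mapping, Sequence

# Assumed ordering, derived from the table in this PR; the real setting may differ.
RECORD_METADATA_TITLE_PROPERTIES = [
    "title", "officialName", "name", "fullName", "familyName", "label", "email",
]
RECORD_METADATA_DEFAULT_TITLE = "[Untitled]"


def resolve_title(
    mex_record: Mapping[str, Any],
    title_properties: Sequence[str] = RECORD_METADATA_TITLE_PROPERTIES,
    default_title: str = RECORD_METADATA_DEFAULT_TITLE,
) -> str:
    """Return the first non-empty title property, or the default title."""
    for prop in title_properties:
        value = mex_record.get(prop)
        # Assumption: multi-valued properties arrive as lists; take the first non-empty entry.
        if isinstance(value, (list, tuple)):
            value = next((v for v in value if v), None)
        # Assumption: text objects look like {"value": "...", "language": "en"}.
        if isinstance(value, Mapping):
            value = value.get("value")
        if value:
            return str(value)
    return default_title
```

With this sketch, a person record that has neither fullName nor familyName falls through every property and receives the default title.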

@fdiehr and @cutoffthetop, this is of course up for discussion.

OAI

A custom OAI crosswalk has been created to convert the Invenio record metadata to OAI DC 2.0.

  • The custom field mex:description is mapped to dc:description.
  • The custom field mex:identifier is mapped to dc:identifier.
  • The configuration variable OAISERVER_SOURCES contains a list of MEx properties that are mapped to dc:source.
  • The configuration variable OAISERVER_RELATIONS contains a list of MEx properties that are mapped to dc:relation.
  • Because the default mapping sets dc:rights to info:eu-repo/semantics/closedAccess unless the Invenio record has an open license selected, the crosswalk sets dc:rights to info:eu-repo/semantics/openAccess.
  • For consistency, dc:type is set to info:eu-repo/semantics/other.
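
The mapping rules in this list can be sketched as a plain function. This is only an illustration: the name mex_to_dc, the assumption that custom fields are available as a dictionary keyed by names such as mex:description, and the example values of OAISERVER_SOURCES and OAISERVER_RELATIONS are hypothetical and not the crosswalk added in this PR.

```python
from typing import Any, Dict, List, Mapping, Sequence

# Hypothetical values; the PR does not list which MEx properties are configured.
OAISERVER_SOURCES = ["mex:wasGeneratedBy"]
OAISERVER_RELATIONS = ["mex:isPartOf"]


def _as_list(value: Any) -> List[str]:
    """Normalize a missing, single, or multi-valued field to a list of strings."""
    if not value:
        return []
    if isinstance(value, (list, tuple)):
        return [str(v) for v in value if v]
    return [str(value)]


def mex_to_dc(
    custom_fields: Mapping[str, Any],
    sources: Sequence[str] = OAISERVER_SOURCES,
    relations: Sequence[str] = OAISERVER_RELATIONS,
) -> Dict[str, List[str]]:
    """Map MEx custom fields to Dublin Core element names and values."""
    dc = {
        "description": _as_list(custom_fields.get("mex:description")),
        "identifier": _as_list(custom_fields.get("mex:identifier")),
        "source": [v for p in sources for v in _as_list(custom_fields.get(p))],
        "relation": [v for p in relations for v in _as_list(custom_fields.get(p))],
        # Fixed values, as described in the list above.
        "rights": ["info:eu-repo/semantics/openAccess"],
        "type": ["info:eu-repo/semantics/other"],
    }
    # Drop elements without values so they are not serialized as empty tags.
    return {name: values for name, values in dc.items() if values}
```

Serializing the result to oai_dc XML and adding dc:title, dc:creator and dc:date are left out of this sketch.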

The following is an example of a test record, via OAI.

<record>
  <header>
    <identifier>oai:mex.rki.de:5z0hz-aec24</identifier>
    <datestamp>2025-02-06T09:53:39Z</datestamp>
  </header>
  <metadata>
    <oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
      <dc:creator>The Robert Koch Institute</dc:creator>
      <dc:date>2025-02-06</dc:date>
      <dc:description>Good explain grow water plant perform resource. Security stock ball organization recognize civil.</dc:description>
      <dc:description>Report purpose themselves all the nothing. Close ask reduce. Month maintain no sense this manager fine. Anyone state wind indeed nature white without.</dc:description>
      <dc:identifier>oai:mex.rki.de:5z0hz-aec24</dc:identifier>
      <dc:identifier>mex:hLFTLJ48LLgZqM11esym70</dc:identifier>
      <dc:relation>mex:yAflS4smRTINBczZqqkrU</dc:relation>
      <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
      <dc:title>[Untitled]</dc:title>
      <dc:type>info:eu-repo/semantics/other</dc:type>
    </oai_dc:dc>
  </metadata>
</record>

It is possible to add further fields, such as modified date to dc:date. The crosswalk can easily be modified. This is a conversation we can have. This depends on how you expect the OAI endpoint to be used.

Unit tests

A Markdown documentation file has been added at tests/TESTS.MD.

@fdiehr commented Feb 10, 2025

Hi @J4bbi!

The configuration variable RECORD_METADATA_DEFAULT_TITLE is used for the Datacite title property if a title value cannot be determined, that is, if the record is a person with neither fullName nor familyName, or an access-platform without a title.
RECORD_METADATA_DEFAULT_TITLE = "[Untitled]"

Looks good to me.

It is possible to add further fields, such as modified date to dc:date. The crosswalk can easily be modified. This is a conversation we can have. This depends on how you expect the OAI endpoint to be used.

Agreed, we have to evaluate how important an OAI endpoint would be. For health data, it currently does not seem to be in common use. But since we also have publications in our catalog, it could still be important. We will look into this and get back to you.
