Harmonize metric and tests according to published suggestions - F #547

Open
huberrob opened this issue Dec 13, 2024 · 0 comments

This concerns in particular the comments found in the in-depth, comparative analysis of FAIR metrics published by:

Candela et al. in particular make very detailed suggestions.

According to these authors, the following improvements should be made:

F1:

  • In FsF-F1-01D as well as FsF-F1-02D it is confusing that the metric name seems to relate to data only, while the metric actually deals with the data set as a whole.
  • Test FsF-F1-01D-1 (Identifier is resolvable and follows a defined unique identifier syntax) is confusing since resolution rather relates to A1; furthermore, resolution is only tested here via syntax, e.g. by checking whether a URL is used.
  • FsF-F1-02D-2 (Persistent identifier is resolvable) is related to A1 (resolution) and misleading, since we aim to verify the existence of the PID, not its resolution.
  • F1 PID tests additionally verify whether these PIDs resolve to a domain/URL which is owned by the provider (to avoid fraud). Since this is a quality check, it should not be scored.
  • FsF-F2-01M-1 (Metadata has been made available via common web methods) rather relates to A1 and is not scored in the current version anyway, so the test should be skipped.
  • FsF-F3-01M-1 (Metadata contains data content related information (file name, size, type)) is a duplicate, since data identifiers are tested in FsF-F3-01M-2 and file size and type are tested in FsF-R1-01M-2.
  • FsF-F4-01M-1 (Metadata is given in a way major search engines can ingest it for their catalogues (JSON-LD, Dublin Core, RDFa)) is confusing since it seems to focus on the serialization format; it is, however, intended to test the 'search engine friendly' way of exposing metadata.
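
To illustrate what a purely syntactic identifier test (without any resolution) could look like, here is a minimal sketch. It is not the current F-UJI implementation; the function name `classify_identifier`, the regular expressions and the set of accepted hash lengths are assumptions made for this example, and general IRI validation is left out for brevity.

```python
import re
from urllib.parse import urlparse

# Assumption: canonical hyphenated UUIDs, optionally prefixed with "urn:uuid:".
UUID_RE = re.compile(
    r"^(?:urn:uuid:)?[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)
# Assumption: hex digests with the length of MD5, SHA-1, SHA-256 or SHA-512.
HASH_RE = re.compile(
    r"^(?:[0-9a-f]{32}|[0-9a-f]{40}|[0-9a-f]{64}|[0-9a-f]{128})$",
    re.IGNORECASE,
)


def classify_identifier(identifier):
    """Return 'uuid', 'hash' or 'url' if the syntax matches, else None.

    No network request is made: this only inspects the string.
    """
    value = identifier.strip()
    if UUID_RE.match(value):
        return "uuid"
    if HASH_RE.match(value):
        return "hash"
    try:
        parsed = urlparse(value)
    except ValueError:
        # urlparse raises ValueError for some malformed inputs (e.g. bad IPv6 hosts).
        return None
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    return None
```

Because the check never contacts a server, it stays within the scope of F1 and leaves resolution to the A1 tests.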

The following changes should be made:

  • Rename the two F1 metrics to FsF-F1-01MD and FsF-F1-02MD and provide separate tests for data as well as metadata
  • Rename FsF-F1-01D-1 to FsF-F1-01MD-1: Metadata identifier follows a defined unique identifier syntax or scheme (IRI, URL, UUID or HASH)
  • Add a test which does the same for data
  • Rename FsF-F1-02D-2 to FsF-F1-02MD-2: Persistent identifier for data is registered and maintained by a PID authority
  • FsF-F1-02MD-2: instead of checking whether the identifier resolves, check whether it is redirected (HTTP 30x); this verifies that it is maintained by a PID authority
  • The F1 PID domain checks should still be offered and performed, but run in the background without being scored, and an appropriate message should be raised.
  • Skip/delete FsF-F2-01M-1
  • Skip/delete FsF-F3-01M-1
  • Rename FsF-F4-01M-1 to: Metadata is given in a way major search engines can ingest it for their catalogues (Google Dataset Search, Google, Bing etc. webmaster guidelines), which clarifies that this test covers a combination of serialization format and metadata format

Additionally:

  • FsF-F4-01M-2 (Metadata is registered in major research data registries) is strongly dependent on APIs and interfaces provided by catalogues. Since these are either not offered at all (Google) or notoriously unavailable, this test is often unfair and should be skipped.
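
The redirect-based check proposed for FsF-F1-02MD-2 could be sketched as follows. This is only an illustration under stated assumptions, not existing F-UJI code: the helper names `is_redirect` and `pid_is_redirected`, the chosen set of status codes and the use of a `HEAD` request are all hypothetical choices. The standard library's `http.client` is used because it does not follow redirects automatically, so the 30x status of the PID resolver itself can be inspected.

```python
import http.client
from urllib.parse import urlparse

# Assumption: these 30x codes count as "redirected by a PID authority".
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def is_redirect(status):
    """True if the HTTP status code is one of the accepted redirect codes."""
    return status in REDIRECT_STATUSES


def pid_is_redirected(pid_url, timeout=10.0):
    """Send one HEAD request to the PID URL without following redirects.

    Returns a tuple (redirected, location), where location is the value of
    the Location header or None if the header is absent.
    """
    parsed = urlparse(pid_url)
    if parsed.scheme == "https":
        conn = http.client.HTTPSConnection(parsed.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parsed.netloc, timeout=timeout)
    path = parsed.path or "/"
    if parsed.query:
        path += "?" + parsed.query
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return is_redirect(response.status), response.getheader("Location")
    finally:
        conn.close()
```

The returned `Location` value could also feed the unscored background check on the target domain, with the result reported as a message rather than as a score.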