Harmonize metric and tests according to published suggestions - F #547

huberrob · 2024-12-13T14:24:13Z

This issue addresses comments found in the in-depth, comparative analysis of FAIR metrics published by:

In particular, Candela et al. make very detailed suggestions.

According to these authors, the following improvements should be made:

In FsF-F1-01D as well as FsF-F1-02D it is confusing that the metrics seem to relate to data while they actually deal with the data set as a whole.
Test FsF-F1-01D-1 (Identifier is resolvable and follows a defined unique identifier syntax) is confusing since resolution rather relates to A1; furthermore, resolution is only tested here by syntax, e.g. by checking whether a URL is used.
FsF-F1-02D-2 (Persistent identifier is resolvable) is related to A1 (resolution) and is misleading, since we aim to verify the existence of the PID, not its resolution.
F1 PID tests additionally verify whether these PIDs resolve to a domain/URL which is owned by the provider (to avoid fraud). Since this is a quality check, it should not be scored.
FsF-F2-01M-1 (Metadata has been made available via common web methods) rather relates to A1 and is not scored in the current version anyway, so the test should be skipped.
FsF-F3-01M-1 (Metadata contains data content related information (file name, size, type)) is a duplicate, since data identifiers are tested in FsF-F3-01M-2 and file size and type are tested in FsF-R1-01M-2.
FsF-F4-01M-1 (Metadata is given in a way major search engines can ingest it for their catalogues (JSON-LD, Dublin Core, RDFa)) is confusing since it seems to focus on the serialization format; it is, however, intended to test the 'search engine friendly way' of exposing metadata.
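
To illustrate what a purely syntactic identifier test (without any resolution) could look like, here is a minimal sketch. It is not the current F-UJI implementation; the function name `classify_identifier` and the set of accepted hash lengths are assumptions made for this example, and IRI handling beyond plain http(s) URLs is left out.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Canonical hyphenated UUID form (8-4-4-4-12 hex digits).
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)
HEX_RE = re.compile(r"^[0-9a-f]+$", re.IGNORECASE)
# Assumption: hex digest lengths of MD5, SHA-1, SHA-256 and SHA-512.
HASH_LENGTHS = {32, 40, 64, 128}


def classify_identifier(identifier: str) -> Optional[str]:
    """Return 'uuid', 'hash' or 'url' based on syntax only, else None.

    No network request is made, so this says nothing about resolution.
    """
    value = identifier.strip()
    # Accept the URN form of a UUID as well.
    if value.lower().startswith("urn:uuid:"):
        value = value[len("urn:uuid:"):]
    if UUID_RE.match(value):
        return "uuid"
    if HEX_RE.match(value) and len(value) in HASH_LENGTHS:
        return "hash"
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    return None


print(classify_identifier("https://doi.org/10.1000/182"))
print(classify_identifier("123e4567-e89b-12d3-a456-426614174000"))
```

Keeping the check syntax-only makes the separation from A1 explicit: anything that needs an HTTP request would belong to a different test.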

The following changes should be made:

Rename FsF-F1-01D and FsF-F1-02D to FsF-F1-01MD and FsF-F1-02MD and provide separate tests for data as well as metadata
Rename FsF-F1-01D-1 to FsF-F1-01MD-1: Metadata identifier follows a defined unique identifier syntax or scheme (IRI, URL, UUID or HASH)
Add a test which does the same for data
Rename FsF-F1-02D-2 to FsF-F1-02MD-2: Persistent identifier for data is registered and maintained by a PID authority
FsF-F1-02MD-2: instead of checking whether the identifier resolves, check whether it is redirected (HTTP 30x); this verifies that it is maintained by a PID authority.
The F1 PID checks against the provider's domain should still be offered and performed, but run in the background without being scored, and an appropriate message should be raised.
Skip/delete FsF-F2-01M-1
Skip/delete FsF-F3-01M-1
Rename FsF-F4-01M-1 to: Metadata is given in a way major search engines can ingest it for their catalogues (Google Dataset Search, Google, Bing etc. webmaster guidelines). This clarifies that the test covers a combination of serialization format and metadata format.
Additionally:
FsF-F4-01M-2 (Metadata is registered in major research data registries) is strongly dependent on APIs and interfaces provided by catalogues. Since these are either not offered at all (Google) or frequently unavailable, this test is often unfair and should be skipped.
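
The proposed redirect check for FsF-F1-02MD-2 could be sketched roughly as follows. This is only an illustration under stated assumptions, not existing F-UJI code: the helper names `head_without_redirects` and `pid_is_redirected` are hypothetical, and requiring a `Location` header is an extra assumption added here so that a 304 (Not Modified) is not mistaken for a redirect.

```python
import http.client
from typing import Callable, Optional, Tuple
from urllib.parse import urlsplit

StatusAndLocation = Tuple[int, Optional[str]]


def is_redirect(status: int) -> bool:
    """True for any HTTP 3xx status code."""
    return 300 <= status < 400


def head_without_redirects(url: str, timeout: float = 10.0) -> StatusAndLocation:
    """Send one HEAD request; http.client never follows redirects itself."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def pid_is_redirected(
    pid_url: str,
    fetch: Callable[[str], StatusAndLocation] = head_without_redirects,
) -> bool:
    """Check that the PID URL answers with a 30x and a redirect target.

    The target itself is not requested, so this does not test resolution.
    """
    status, location = fetch(pid_url)
    return is_redirect(status) and bool(location)
```

Calling `pid_is_redirected("https://doi.org/...")` with an actual PID would issue a single HEAD request to the PID service. The `fetch` parameter exists so that the check can be exercised without network access, for example with `fetch=lambda url: (302, "https://example.org/landing")`.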

huberrob mentioned this issue Dec 18, 2024

Harmonize metric and tests according to published suggestions - A #548

Open
