Dataset registry: format #1

Open
peterdesmet opened this issue Nov 13, 2024 · 2 comments

Comments

@peterdesmet
Member

peterdesmet commented Nov 13, 2024

Format from the HiRAD 2024-11 meeting for a datasets.json registry:

[
  {
    "name": "baltrad_vpts",
    "title": "BALTRAD_VPTS - ...",
    "description": "...",
    "contributors": [
      {
        "givenName": "Peter",
        "familyName": "Desmet",
        "orcid": "0000-0002-8442-8025",
        "role": "contact",
        "email": "[email protected]"
      }
    ],
    "keywords": ["animal movement", "birds"],
    "access": "open",
    "doi": "10.5281/zenodo.13683294",
    "homepage": "https://aloftdata.eu/browse/",
    "radarType": "weather",
    "dataType": "vpts",
    "deployments": [
      {
        "radar": "bejab",
        "start": "2020-01-03",
        "end": "2024-03-21",
        "latitude": 51.1919,
        "longitude": 3.0641,
        "altitude": 3,
        "firmwareVersion": null,
        "comment": null
      }
    ]
  }
]
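As a rough sketch of how a consumer might read a registry of this shape, the Python below parses a trimmed copy of the example entry and looks up deployments by radar code. The helper name `find_deployments` and the inline `REGISTRY_JSON` string are illustrative assumptions, not part of the proposal. A real consumer would read datasets.json from wherever it ends up being hosted.

```python
import json

# Trimmed copy of the example entry above; only the fields used below are kept.
REGISTRY_JSON = """
[
  {
    "name": "baltrad_vpts",
    "access": "open",
    "doi": "10.5281/zenodo.13683294",
    "radarType": "weather",
    "dataType": "vpts",
    "deployments": [
      {"radar": "bejab", "start": "2020-01-03", "end": "2024-03-21",
       "latitude": 51.1919, "longitude": 3.0641, "altitude": 3}
    ]
  }
]
"""


def find_deployments(registry, radar):
    """Return (dataset name, deployment) pairs for a given radar code.

    Hypothetical helper: the field names follow the example format above.
    """
    return [
        (dataset["name"], deployment)
        for dataset in registry
        for deployment in dataset.get("deployments", [])
        if deployment["radar"] == radar
    ]


registry = json.loads(REGISTRY_JSON)
matches = find_deployments(registry, "bejab")
for name, deployment in matches:
    print(name, deployment["start"], deployment["end"])
```

Keeping deployments nested inside each dataset means a lookup by radar has to scan all datasets, which seems acceptable for a registry of modest size.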
@peterdesmet
Member Author

Ping @bart1

@bart1

bart1 commented Dec 17, 2024

In Amsterdam we discussed the dataset format. Here are some notes, thoughts, and remarks from the meeting:

Multiple outputs from one radar

There are cases where a long time series from a radar is not open. The dataset might still be published with "access": "closed". Subsets of data from this radar might be published as "open"; these can be either derived metrics or temporal subsets. It would be good if datasets could point to each other, so it is clear that an openly published dataset is part of a larger closed dataset. Similar questions arise if multiple datasets exist from one deployment (e.g. volume data and vpts, or vpts and vertically integrated time series): should we also encode this?
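One possible way to let datasets point to each other, sketched in Python purely for illustration: a hypothetical `partOf` field that holds the `name` of the larger dataset. The field name, the dataset names and the `resolve_parents` helper are all assumptions made up for this sketch; nothing like this was agreed at the meeting.

```python
# Hypothetical "partOf" field: names the larger dataset this one is a subset of.
# Dataset names are invented for the example.
datasets = [
    {"name": "radar_full", "access": "closed", "partOf": None},
    {"name": "radar_vpts_2023", "access": "open", "partOf": "radar_full"},
]


def resolve_parents(datasets):
    """Map each subset dataset's name to the entry of its parent dataset.

    Raises ValueError if a dataset points at a name that is not in the registry.
    """
    by_name = {dataset["name"]: dataset for dataset in datasets}
    links = {}
    for dataset in datasets:
        parent = dataset.get("partOf")
        if parent is None:
            continue  # stand-alone dataset, nothing to resolve
        if parent not in by_name:
            raise ValueError(
                f"{dataset['name']} points to unknown dataset {parent!r}"
            )
        links[dataset["name"]] = by_name[parent]
    return links


links = resolve_parents(datasets)
print(links["radar_vpts_2023"]["name"], links["radar_vpts_2023"]["access"])
```

A single pointer from subset to parent only covers the open-subset-of-closed case. Sibling relations (e.g. volume data and vpts from the same deployment) would probably need a different or more general field.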

Description of a radar

There was a discussion on what would be the best way to describe the radar. Some information seems to be missing, such as radarManufacturer, radarModelName and maybe other metrics that help when searching for data. Examples of these metrics might be the radar frequency and an indication of range (both horizontal and vertical). The latter might be somewhat tricky, as there can be quite a difference between the theoretical range and the useful range in practice. However, some indication might already be helpful for visualization and search.

Smaller remarks

  • Maybe homepage is better called url
  • Currently the units/definitions are unclear, e.g. altitude above what?
  • As deployments can be quite substantial, should there be an option to add a URL there?
  • Maybe it is good to look at how an elaborate version would look and what the minimal fields for a meaningful entry are (i.e. which fields are optional)?
  • Is radar type a controlled list? What are the possible values? For example are "tracking radars" separated from "vertical radars"?
  • Is access a controlled list? Are the options "open", "closed" and "embargo"?
  • Should information on data type be included, for example text format vs database?
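To make the questions about controlled lists and minimal fields concrete, here is a small Python sketch of what a check could look like. All of the values are assumptions: the access options are the ones asked about above, the radar types are only guesses based on the example and the remark above, and the set of required fields is a placeholder until the minimal entry is decided.

```python
# Assumed controlled lists; the real values still need to be agreed.
ACCESS_VALUES = {"open", "closed", "embargo"}
RADAR_TYPES = {"weather", "tracking", "vertical"}

# Placeholder for the minimal set of fields for a meaningful entry.
REQUIRED_FIELDS = ("name", "title", "access", "radarType", "deployments")


def validate_entry(entry):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field in REQUIRED_FIELDS:
        # Treat absent, null, empty-string and empty-list values as missing.
        if entry.get(field) in (None, "", []):
            problems.append(f"missing required field: {field}")
    access = entry.get("access")
    if access is not None and access not in ACCESS_VALUES:
        problems.append(f"access not in controlled list: {access!r}")
    radar_type = entry.get("radarType")
    if radar_type is not None and radar_type not in RADAR_TYPES:
        problems.append(f"radarType not in controlled list: {radar_type!r}")
    return problems


good = {
    "name": "baltrad_vpts",
    "title": "BALTRAD_VPTS - ...",
    "access": "open",
    "radarType": "weather",
    "deployments": [{"radar": "bejab"}],
}
bad = dict(good, access="public", title="")

print(validate_entry(good))
print(validate_entry(bad))
```

If the lists are settled, a JSON Schema with `enum` and `required` would likely be a more portable way to express the same rules.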

Example datasets

Maja has some examples of already published datasets that we could use to see how the description works in practice.
