
how to map data papers.yaml to catalog

Philip Leo Pascual edited this page Oct 17, 2023 · 9 revisions

How to map data-papers.yaml to catalog papers

The file data-papers.yaml holds as many papers with CAIDA-linked datasets as possible. Below are all possible values for a record in this file, and how to map them to catalog-data papers.

data-papers.yaml example

---
MARKER   : "YYYY_FirstInitial_LastName_Venue_UniqueID"  # UniqueID is not always given.
TYPE     : "Paper's Location"                          # More context on this value below.
AUTHOR   : "Last, First; Last, First; ..."             # Each author's name separated by either a `;` or `'`.
GEOLOC   : "Location_1; Location_2; ..."               # Only one location is listed if all authors are from there.
TITLE    : "Paper's Title"
YEAR     : "YYYY-MM"
TOPKEY   : "Dataset_1, Dataset_2, ..."                 # More context on this value in next section.

# The following values should be used to build the BIBTEX entry
SERIAL   : "Journal Name"                              # SERIAL and JOURNAL are always together.
VOLUME   : "Journal Volume"
CHAPTER  : "Book Chapter"
ARTICLE  : "Article Number"
PAGE     : "20-30"                                     # Or "(10 pages)".
CTITLE   : "Conference Title"
DOI      : "10.1007/978-3-642-36516-4_3"                # DOI or URL are usually not both given.
URL      : "example.com"
ABS      : "Abstract"
PLACE    : "Location and date of conference"
PUBLISH  : "Location of a publisher (university)"       # Very uncommon.
---
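As a rough illustration of how the multi-valued fields above might be split once a record has been loaded into a dict, here is a minimal Python sketch. The helper names `split_list` and `normalize_record` and the lowercase output keys are hypothetical and not part of catalog-data. It assumes `;` separates authors and locations, `,` separates TOPKEY labels, and YEAR has the form `YYYY-MM`.

```python
def split_list(value, sep):
    """Split a delimited string into trimmed, non-empty items."""
    return [item.strip() for item in value.split(sep) if item.strip()]


def normalize_record(record):
    """Return a copy of a data-papers.yaml record with list fields split.

    Hypothetical helper: the output key names are illustrative only.
    """
    out = dict(record)
    # AUTHOR and GEOLOC are assumed to be `;`-separated.
    out["authors"] = split_list(record.get("AUTHOR", ""), ";")
    out["geoloc"] = split_list(record.get("GEOLOC", ""), ";")
    # TOPKEY labels are assumed to be `,`-separated.
    out["topkeys"] = split_list(record.get("TOPKEY", ""), ",")
    # YEAR is "YYYY-MM"; keep the month only when it is present.
    year, _, month = record.get("YEAR", "").partition("-")
    out["year"] = year
    out["month"] = month or None
    return out
```

Missing keys produce empty lists rather than errors, since most records carry only a subset of the fields.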

All values are rarely given, but the first seven almost always are. Below are the available values for the key TYPE:

@DonaldWolfson: examine the BibTeX entries for existing papers: https://catalog.caida.org/search?query=types=paper%20caida -> https://www.caida.org/publications/papers/2021/impact_general_data_protection/bibtex.html

You can examine the BibTeX fields there.
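To make the BibTeX step more concrete, the sketch below shows one way a record (already loaded as a dict) could be turned into a BibTeX entry. The function name `to_bibtex`, the choice of entry type, and the field mapping (SERIAL to `journal`, CTITLE to `booktitle`, PAGE to `pages`, and so on) are assumptions for illustration; compare the output against the existing catalog BibTeX entries before relying on it.

```python
def to_bibtex(record):
    """Build a BibTeX entry string from a data-papers.yaml record (a dict).

    The field mapping below is an assumption for illustration, not the
    catalog's actual rule set.
    """
    # Guess the entry type from which venue field is present (assumption).
    if record.get("SERIAL"):
        entry_type = "article"
    elif record.get("CTITLE"):
        entry_type = "inproceedings"
    else:
        entry_type = "misc"

    # "Last, First; Last, First" -> "Last, First and Last, First"
    authors = [a.strip() for a in record.get("AUTHOR", "").split(";") if a.strip()]

    # YEAR is "YYYY-MM"; BibTeX only needs the year here.
    year = record.get("YEAR", "").split("-")[0]

    fields = [
        ("author", " and ".join(authors)),
        ("title", record.get("TITLE", "")),
        ("year", year),
        ("journal", record.get("SERIAL", "")),
        ("volume", record.get("VOLUME", "")),
        ("booktitle", record.get("CTITLE", "")),
        ("pages", record.get("PAGE", "")),
        ("doi", record.get("DOI", "")),
        ("url", record.get("URL", "")),
    ]
    lines = ["@%s{%s," % (entry_type, record.get("MARKER", "unknown"))]
    # Emit only the fields that are actually present in the record.
    lines += ["  %s = {%s}," % (key, value) for key, value in fields if value]
    lines.append("}")
    return "\n".join(lines)
```

The sketch passes PAGE through unchanged, so a value such as "(10 pages)" would need separate handling.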

Labels for TOPKEY mapped to catalog-data datasets

The mappings were originally stored in the catalog-data/scripts/externallinks_placeholder.py script, but have been moved to the catalog-data/data/data-papers.keys YAML file.

The current version of the mappings can be found here.
Note: The file is separated with --- to delineate multiple documents within it. The mappings are located in the second document.
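The multi-document layout described in the note can be handled roughly as follows. This is a stdlib-only sketch that splits the raw text on `---` lines and does not parse YAML; the function name `split_yaml_documents` is hypothetical. If PyYAML is available, `yaml.safe_load_all` yields each document already parsed and is likely the better choice.

```python
def split_yaml_documents(text):
    """Split a multi-document YAML stream on `---` separator lines.

    Returns the raw text of each non-empty document; it does not parse YAML.
    """
    docs, current = [], []
    for line in text.splitlines():
        if line.strip() == "---":
            # A separator closes the document collected so far.
            docs.append("\n".join(current))
            current = []
        else:
            current.append(line)
    docs.append("\n".join(current))
    # Drop empty chunks, e.g. the one before a leading `---`.
    return [doc for doc in docs if doc.strip()]
```

With the file contents read into `text`, the mappings would then be `split_yaml_documents(text)[1]`, assuming the first document is not empty.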

Datasets that will need to be created