From 281da372f7a5417079f2d28f6c44d189ca6c428e Mon Sep 17 00:00:00 2001
From: Tom Narock
Date: Fri, 19 Oct 2018 12:26:04 -0400
Subject: [PATCH] Updated README.md with details on layout of log files

---
 README.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/README.md b/README.md
index da7fe86..8a7e28e 100644
--- a/README.md
+++ b/README.md
@@ -12,3 +12,12 @@ where BLAH should be replaced with your token. The code will import api_token.py
 Overview
 ---------
 This code creates two log files, downloads available preprints, and creates a text file for each preprint containing the preprint abstract. One log file contains metadata on available preprints, such as authors, keywords, title, publication date, etc. The other log file contains metadata on peer-reviewed papers associated with preprints. This log file contains similar metadata as that of the preprints: e.g. authors, title, publication date, journal, etc. The EarthArXiv preprint identifier is used in both log files and provides a means a linking preprints to their peer-reviewed papers.
+
+A semicolon is used as the separator in each of the log files. Paper titles often contain commas, and using semicolons to separate columns allows us to preserve the titles. The preprint log has the form:
+
+identifier; preprint provider; preprint doi; peer review doi; preprint publication date; peer review publication date; title; author list; keyword list
+
+The peer-review log has the form:
+
+preprint identifier; preprint provider; peer review doi; peer review date published; peer review journal; peer review title; peer review author list; peer review publisher; peer review url
+
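
A minimal sketch of how the two logs described above might be read and linked on the preprint identifier. It assumes one record per line, no header row, and no semicolons inside any field; none of this is confirmed by the README. The function names `read_log` and `link_preprints`, the snake_case column keys, and all sample records (identifier, DOIs, dates, URL) are hypothetical and for illustration only.

```python
import io

# Column order follows the two layouts given in the README text.
PREPRINT_COLUMNS = [
    "identifier", "preprint_provider", "preprint_doi", "peer_review_doi",
    "preprint_publication_date", "peer_review_publication_date",
    "title", "author_list", "keyword_list",
]
PEER_REVIEW_COLUMNS = [
    "preprint_identifier", "preprint_provider", "peer_review_doi",
    "peer_review_date_published", "peer_review_journal", "peer_review_title",
    "peer_review_author_list", "peer_review_publisher", "peer_review_url",
]


def read_log(handle, columns):
    """Parse a semicolon-separated log into a list of dicts keyed by column."""
    rows = []
    for line_number, line in enumerate(handle, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on semicolons only, so commas inside titles are preserved.
        fields = [field.strip() for field in line.split(";")]
        if len(fields) != len(columns):
            raise ValueError(
                f"line {line_number}: expected {len(columns)} fields, "
                f"got {len(fields)}"
            )
        rows.append(dict(zip(columns, fields)))
    return rows


def link_preprints(preprints, papers):
    """Pair each preprint with the peer-reviewed papers sharing its identifier."""
    papers_by_id = {}
    for paper in papers:
        papers_by_id.setdefault(paper["preprint_identifier"], []).append(paper)
    return [
        (preprint, paper)
        for preprint in preprints
        for paper in papers_by_id.get(preprint["identifier"], [])
    ]


# Hypothetical sample records; real log contents may differ.
preprint_log = io.StringIO(
    "abc12; EarthArXiv; 10.0000/example.pre; 10.0000/example.paper; "
    "2018-01-15; 2018-09-01; Floods, droughts, and data; "
    "A. Author, B. Author; hydrology, climate\n"
)
peer_review_log = io.StringIO(
    "abc12; EarthArXiv; 10.0000/example.paper; 2018-09-01; Example Journal; "
    "Floods, droughts, and data; A. Author, B. Author; Example Publisher; "
    "https://example.org/paper\n"
)

preprints = read_log(preprint_log, PREPRINT_COLUMNS)
papers = read_log(peer_review_log, PEER_REVIEW_COLUMNS)
links = link_preprints(preprints, papers)

for preprint, paper in links:
    print(preprint["title"], "->", paper["peer_review_journal"])
```

With real files, `open(path, encoding="utf-8")` would take the place of the `io.StringIO` objects. A title that itself contains a semicolon would produce an extra field, which `read_log` reports as a `ValueError` rather than silently misaligning the columns.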