McTavishLab/AvesData

README:

Data structure

Taxonomy:

This directory contains folders for each year of the Clements taxonomy.

  • Clements_2021
  • Clements_2022
  • Clements_2023

Each taxonomy_year directory contains:

  • OTT_crosswalk_YEAR.csv (e.g. OTT_crosswalk_2021.csv): a CSV file mapping species from the Clements taxonomy for that taxonomy year to OpenTreeTaxonomy (OTT) identifiers. It also maps each Clements species to its Avibase id, and to the names for that taxon in the IOC World Bird List v14.1, Birdlife/HBW 8, and the Howard and Moore v4 taxonomy.

  • Column names and contents

    • TAXON_ORDER <- from Clements Checklist csv
    • SPECIES_CODE <- from Clements Checklist csv
    • TAXON_CONCEPT_ID <- Avibase id imported from Clements Checklist csv
    • PRIMARY_COM_NAME <- from Clements Checklist csv
    • SCI_NAME <- from Clements Checklist csv
    • ORDER1 <- from Clements Checklist csv
    • FAMILY <- from Clements Checklist csv
    • ott_id <- id for that species in OTT
    • ott_name <- name for that species in OTT
    • ott_tax_sources <- sources for that taxon name in OTT
    • ott_match_type <- how the link was made from the Clements name to the OTT id (may be 'canonical_match', 'synonym_match', 'new_taxon_addition', or 'NA' for no match, as well as some other idiosyncratic match types for hand corrections)
    • H_M_name <- best match name or names in the Howard and Moore v4 taxonomy (if multiple names matched, they are given as a semicolon-separated list)
    • H_M_match_type <- how the Avibase taxon concepts map between the Clements and Howard and Moore names. One of 'concepts_match', 'child_of', 'parent_of', 'overlaps', or 'missing'.
    • Birdlife_name <- best match name or names in the Birdlife/HBW taxonomy (if multiple names matched, they are given as a semicolon-separated list)
    • Birdlife_match_type <- how the Avibase taxon concepts map between the Clements and Birdlife names
    • IOC_name <- best match name or names in the IOC World Bird List (if multiple names matched, they are given as a semicolon-separated list)
    • IOC_match_type <- how the Avibase taxon concepts map between the Clements and IOC names
  • a taxon addition file capturing a suggested placement for species for which we do not yet have phylogenetic information

  • a copy of the eBird taxonomy file for that year.
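
As an illustration of how the crosswalk columns described above might be used, here is a minimal Python sketch. The sample rows, species codes, and names are invented placeholders rather than values from the real files, and the helper names (load_crosswalk, split_names, has_ott_match) are hypothetical. It assumes only what the column descriptions state: 'NA' in ott_match_type marks a species with no OTT match, and multiple matched names are separated by semicolons.

```python
import csv
import io
from collections import Counter

# Invented placeholder rows using a subset of the documented columns.
# A real OTT_crosswalk_YEAR.csv has more columns and real values.
SAMPLE = """SPECIES_CODE,SCI_NAME,ott_id,ott_match_type,H_M_name,H_M_match_type
spec1,Genus alpha,101,canonical_match,Genus alpha,concepts_match
spec2,Genus beta,,NA,Genus beta; Genus gamma,parent_of
spec3,Genus delta,103,synonym_match,,missing
"""


def load_crosswalk(handle):
    """Return the crosswalk rows as a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def split_names(cell):
    """Split a semicolon-separated name cell into a list of trimmed names."""
    return [name.strip() for name in cell.split(";") if name.strip()]


def has_ott_match(row):
    """True unless ott_match_type is 'NA', the documented no-match marker."""
    return row["ott_match_type"] != "NA"


rows = load_crosswalk(io.StringIO(SAMPLE))

# Species codes that were linked to an OTT id.
matched = [row["SPECIES_CODE"] for row in rows if has_ott_match(row)]

# How many species fall under each OTT match type.
match_counts = Counter(row["ott_match_type"] for row in rows)

# Howard and Moore names per species, as lists rather than raw strings.
hm_names = {row["SPECIES_CODE"]: split_names(row["H_M_name"]) for row in rows}
```

To run this against a real file, replace io.StringIO(SAMPLE) with an opened file handle, for example open("OTT_crosswalk_2021.csv", newline="", encoding="utf-8"). The csv module keeps 'NA' as a plain string; libraries that convert 'NA' to a missing value on load would need a different check.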

Trees:

This directory contains a folder for each synth tree run, labelled by its version identifier (e.g. Aves_1.0).

The most recent tree is Aves_1.3, and that directory contains all of these files. (Folders for older trees have some but not all of these files, and each includes its own readme.)

Aves_1.3

For more details and code for how each of these files were generated see: https://github.com/McTavishLab/AvesTreeCode

  • dates_citations.txt <- input studies used to estimate dates
  • tree_citations.txt <- input studies used to estimate the phylogeny
  • all_node_ages.json <- a JSON file storing the age estimates for each internal node in the phylogeny-only tree, with metadata about which input study suggested each node age
  • OpenTree_synth <- folder containing the direct outputs of OpenTree Synthesis (this is identical between Aves 1.2 and Aves 1.3)
    • This folder contains many nitty gritty internal synthesis outputs. These files are detailed in depth in the index.html file contained in the OpenTree synth folder
  • A folder for each year of the Clements taxonomy, containing:
    • phylo_only.tre <- OpenTree Id labeled tree including only tips with phylogenetic information
    • phylo_only_clements_labels.tre <- Clements labeled tree including only tips with phylogenetic information
    • phylo_only_clements_labels_ultrametric.tre <- ultrametric tree which is the input for the taxon addition step
    • taxon_addition_treeset.tre <- set of 100 complete trees from 100 stochastic taxon addition replicates
    • dated_rand_sample_clements.tre <- dated taxon addition tree cloud: 100 topologies from the taxon addition treeset, dated using random selections from the node age estimates for each dated node calibration. The first line is a header line containing the text "trees", followed by one Newick tree per line.
    • dated_mean_sample_clements.tre <- dated taxon addition tree cloud: 100 topologies from the taxon addition treeset, dated using the mean node age for each node calibration. The first line is a header line containing the text "trees", followed by one Newick tree per line.
    • MCC_clements.tre <- Maximum clade credibility tree including all taxa in this version of the taxonomy, summarized from the dated trees that use random selections for the node calibrations. Labelled with Clements taxonomy labels.
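
Because the dated tree cloud files start with a header line before the Newick trees, some Newick readers may not accept them directly. The sketch below shows one way the layout described above could be read in plain Python. The function name read_tree_cloud and the two tiny sample trees are hypothetical, and the check assumes the header is a non-Newick line that contains the word "trees".

```python
import io

# Invented placeholder content following the documented layout:
# a header line containing "trees", then one Newick tree per line.
SAMPLE = """trees
((A:1,B:1):1,C:2);
((A:1,C:1):1,B:2);
"""


def read_tree_cloud(handle):
    """Return the Newick strings from a tree cloud file, dropping the header."""
    # Strip whitespace and ignore blank lines.
    lines = [line.strip() for line in handle]
    lines = [line for line in lines if line]
    # The first line should be the header, not a Newick tree (which ends in ';').
    if not lines or "trees" not in lines[0] or lines[0].endswith(";"):
        raise ValueError("expected a header line containing 'trees'")
    return lines[1:]


newick_trees = read_tree_cloud(io.StringIO(SAMPLE))
```

Each returned string is a single Newick tree that can then be passed to whichever Newick parser you prefer. For a real file, pass an opened handle in place of io.StringIO(SAMPLE).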

About

Data deposit for Aves synthetic tree
