Commit

Merge pull request #973 from allyhawkins/allyhawkins/build-consensus-…
…reference

Build reference for consensus cell type labels
allyhawkins authored Jan 7, 2025
2 parents 0e8187c + 15c23b8 commit f82012b
Showing 6 changed files with 472 additions and 4 deletions.
23 changes: 21 additions & 2 deletions analyses/cell-type-consensus/README.md
@@ -5,11 +5,30 @@ Specifically, the cell type annotations obtained from both `SingleR` and `CellAs

## Description

TBD
The goal of this module is to create a reference that can be used to define an ontology-aware consensus cell type label for all cells across all ScPCA samples.
This module performs a series of steps to accomplish that goal:

1. The cell type annotations present in the `PanglaoDB` reference file were assigned to an ontology term identifier, when possible.
See [`references/README.md`](./references/README.md) for a full description of how we completed the assignments.
2. We looked at all possible combinations of cell type labels between the `PanglaoDB` reference (used with `CellAssign`) and the `BlueprintEncodeData` reference (used with `SingleR`).
We then explored a set of rules for defining consensus cell types in [`exploratory-notebooks/01-reference-exploration.Rmd`](./exploratory-notebooks/01-reference-exploration.Rmd).
3. We created a [reference table](./references/consensus-cell-type-reference.tsv) containing all combinations for which we were able to identify a consensus cell type label.
The consensus cell type label corresponds to the [latest common ancestor (LCA)](https://rdrr.io/bioc/ontoProc/man/findCommonAncestors.html) between the `PanglaoDB` and `BlueprintEncodeData` terms.
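To make the LCA idea concrete, here is a minimal Python sketch on a toy ontology. It is not how `ontoProc` computes common ancestors and it is not part of this module; the `PARENTS` mapping, the term names in it, and both function names are hypothetical and chosen only for illustration.

```python
# Toy ontology (hypothetical, NOT the real Cell Ontology):
# each term maps to the list of its direct parents.
PARENTS = {
    "cell": [],
    "leukocyte": ["cell"],
    "lymphocyte": ["leukocyte"],
    "T cell": ["lymphocyte"],
    "B cell": ["lymphocyte"],
    "myeloid cell": ["leukocyte"],
    "monocyte": ["myeloid cell"],
}


def ancestors(term, parents=PARENTS):
    """Return the term itself plus every term reachable through parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def latest_common_ancestors(term_a, term_b, parents=PARENTS):
    """Return the shared ancestors that are not an ancestor of another shared ancestor."""
    common = ancestors(term_a, parents) & ancestors(term_b, parents)
    return {
        candidate
        for candidate in common
        if not any(
            candidate != other and candidate in ancestors(other, parents)
            for other in common
        )
    }


print(latest_common_ancestors("T cell", "B cell"))
print(latest_common_ancestors("T cell", "monocyte"))
```

Because a term can have several parents in a real ontology, this function returns a set, which is why a pair of terms can share more than one LCA.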

When creating the consensus cell type labels we implemented the following rules:

- If the terms share more than 1 LCA, no consensus label is set.
The only exception is if one of the LCA terms corresponds to `hematopoietic precursor cell`.
If that is the case all other LCA terms are removed and `hematopoietic precursor cell` is used as the consensus label.
- If the LCA has more than 170 descendants, no consensus label is set, with some exceptions:
  - When the LCA is `neuron`, `neuron` is used as the consensus label.
  - When the LCA is `epithelial cell` and the annotation from `BlueprintEncodeData` is `Epithelial cells`, then `epithelial cell` is used as the consensus label.
- If the LCA is `bone cell`, `lining cell`, `blood cell`, `progenitor cell`, or `supporting cell`, no consensus label is defined.
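The rules above can be sketched as a single decision function. This is a hedged illustration in Python, not the code in this module's scripts: the function name `choose_consensus`, its arguments, and the order in which the rules are checked are assumptions, and the descendant counts in the usage lines are made-up numbers.

```python
DESCENDANT_THRESHOLD = 170
# LCA terms for which no consensus label is defined.
EXCLUDED_LCAS = {
    "bone cell",
    "lining cell",
    "blood cell",
    "progenitor cell",
    "supporting cell",
}


def choose_consensus(lca_terms, descendant_counts, blueprint_annotation):
    """Return a consensus label, or None when no consensus label is set.

    lca_terms: LCA term names shared by the two annotations.
    descendant_counts: dict mapping a term name to its number of descendants.
    blueprint_annotation: the original BlueprintEncodeData annotation.
    """
    lcas = list(lca_terms)

    # More than one LCA: only `hematopoietic precursor cell` is allowed through.
    # Assumption: it is returned directly, without the descendant check.
    if len(lcas) > 1:
        if "hematopoietic precursor cell" in lcas:
            return "hematopoietic precursor cell"
        return None
    if not lcas:
        return None

    lca = lcas[0]
    if lca in EXCLUDED_LCAS:
        return None

    # Very broad LCAs are dropped, apart from the two named exceptions.
    if descendant_counts.get(lca, 0) > DESCENDANT_THRESHOLD:
        if lca == "neuron":
            return "neuron"
        if lca == "epithelial cell" and blueprint_annotation == "Epithelial cells":
            return "epithelial cell"
        return None

    return lca


# Usage with made-up descendant counts.
print(choose_consensus(["neuron"], {"neuron": 500}, "Neurons"))
print(choose_consensus(["bone cell"], {"bone cell": 10}, "Chondrocytes"))
```

Returning `None` for "no consensus label" keeps the caller's filtering step simple: only pairs with a non-`None` result would end up in the reference table.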


## Usage

TBD
See the [`scripts/README.md`](./scripts/README.md) for instructions on running the scripts in this module.

## Input files

16 changes: 16 additions & 0 deletions analyses/cell-type-consensus/references/README.md
@@ -30,3 +30,19 @@ There were no terms that encompassed both other than `progenitor cell`.
Monocytes differentiate into mononuclear osteoclasts which are then activated and become multinucleated osteoclasts.
Because monocytes are the "precursor" to the differentiated osteoclast, we chose to use this term.
- `NA` was used for `Undefined placental cells` and `Transient cells` as no clear cell type from the cell ontology was identified.

2. `consensus-cell-type-reference.tsv`: This file contains a table with all cell type combinations between the `PanglaoDB` reference and `BlueprintEncodeData` reference for which a consensus cell type is identified.

The table includes the following columns:

| Column | Description |
| --- | --- |
| `panglao_ontology` | Cell type ontology term for `PanglaoDB` cell type |
| `panglao_annotation` | Original name for the cell type as set by `PanglaoDB` |
| `blueprint_ontology` | Cell type ontology term for `BlueprintEncodeData` cell type |
| `blueprint_annotation_main` | Original name for the cell type as set by `BlueprintEncodeData` (main term) |
| `blueprint_annotation_fine` | Original name for the cell type as set by `BlueprintEncodeData` (fine term) |
| `consensus_ontology` | Cell type ontology term for consensus cell type |
| `consensus_annotation` | Human readable name for the consensus cell type |

This file was generated by running [`scripts/02-prepare-consensus-reference.R`](../scripts/02-prepare-consensus-reference.R).
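
As a rough illustration of how a table with these columns could be used, the Python sketch below builds a lookup from a pair of ontology terms to the consensus annotation. The rows are placeholders rather than real entries from `consensus-cell-type-reference.tsv`, the helper name `build_lookup` is hypothetical, and the sketch assumes each (`panglao_ontology`, `blueprint_ontology`) pair maps to a single consensus label.

```python
import csv
import io

# Placeholder rows with the documented column names (values are NOT real data).
EXAMPLE_TSV = (
    "panglao_ontology\tpanglao_annotation\tblueprint_ontology\t"
    "blueprint_annotation_main\tblueprint_annotation_fine\t"
    "consensus_ontology\tconsensus_annotation\n"
    "CL:A\texample panglao name\tCL:B\texample main\texample fine\t"
    "CL:C\texample consensus\n"
)


def build_lookup(handle):
    """Map (panglao_ontology, blueprint_ontology) to consensus_annotation."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        (row["panglao_ontology"], row["blueprint_ontology"]): row["consensus_annotation"]
        for row in reader
    }


lookup = build_lookup(io.StringIO(EXAMPLE_TSV))
# Pairs missing from the table have no consensus label, so .get() returns None.
print(lookup.get(("CL:A", "CL:B")))
print(lookup.get(("CL:A", "CL:Z")))
```

To read the actual file, the `io.StringIO` object could be swapped for an open file handle.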