Build reference for consensus cell type labels #973

Merged
23 changes: 21 additions & 2 deletions analyses/cell-type-consensus/README.md
@@ -5,11 +5,30 @@ Specifically, the cell type annotations obtained from both `SingleR` and `CellAs

## Description

TBD
The goal of this module is to create a reference that can be used to define an ontology-aware consensus cell type label for all cells across all ScPCA samples.
This module performs a series of steps to accomplish that goal:

1. Each cell type annotation present in the `PanglaoDB` reference file was assigned an ontology term identifier, when possible.
See [`references/README.md`](./references/README.md) for a full description of how we completed the assignments.
2. We looked at all possible combinations of cell type labels between the `PanglaoDB` reference (used with `CellAssign`) and the `BlueprintEncodeData` reference (used with `SingleR`).
We then explored a set of rules for defining consensus cell types in [`exploratory-notebooks/01-reference-exploration.Rmd`](./exploratory-notebooks/01-reference-exploration.Rmd).
3. We created a [reference table](./references/consensus-cell-type-reference.tsv) containing all combinations for which we were able to identify a consensus cell type label.
The consensus cell type label corresponds to the [latest common ancestor (LCA)](https://rdrr.io/bioc/ontoProc/man/findCommonAncestors.html) between the `PanglaoDB` and `BlueprintEncodeData` terms.
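
To make the idea of an LCA concrete, here is a minimal Python sketch that finds the most specific shared ancestors of two terms in a toy ontology. The graph, its edges, and the function names are assumptions made for illustration; they are not the Cell Ontology, and this is not how `ontoProc` computes common ancestors. The second call shows how a pair of terms can end up with more than one LCA.

```python
# Toy ontology stored as child -> list of parents.
# ASSUMPTION: these terms and edges are illustrative only and do not
# reflect the real structure of the Cell Ontology.
PARENTS = {
    "cell": [],
    "blood cell": ["cell"],
    "motile cell": ["cell"],
    "lymphocyte": ["blood cell", "motile cell"],
    "monocyte": ["blood cell", "motile cell"],
    "T cell": ["lymphocyte"],
    "B cell": ["lymphocyte"],
}


def ancestors(term, parents=PARENTS):
    """Return the term itself plus every term reachable through parent edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def find_lca(term_a, term_b, parents=PARENTS):
    """Return shared ancestors that are not an ancestor of another shared ancestor."""
    common = ancestors(term_a, parents) & ancestors(term_b, parents)
    return {
        candidate
        for candidate in common
        if not any(
            other != candidate and candidate in ancestors(other, parents)
            for other in common
        )
    }


single = find_lca("T cell", "B cell")      # one LCA
multiple = find_lca("T cell", "monocyte")  # two LCAs in this toy graph
```

In this toy graph, `T cell` and `B cell` share a single most specific ancestor, while `T cell` and `monocyte` share two, which is the situation the first rule below addresses.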

When creating the consensus cell type labels we implemented the following rules:

- If the terms share more than 1 LCA, no consensus label is set.
The only exception is if one of the LCA terms corresponds to `hematopoietic precursor cell`.
If that is the case, all other LCA terms are removed and `hematopoietic precursor cell` is used as the consensus label.
- If the LCA has greater than 170 descendants, no consensus label is set, with some exceptions:
  - When the LCA is `neuron`, `neuron` is used as the consensus label.
  - When the LCA is `epithelial cell` and the annotation from `BlueprintEncodeData` is `Epithelial cells`, then `epithelial cell` is used as the consensus label.
- If the LCA is `bone cell`, `lining cell`, `blood cell`, `progenitor cell`, or `supporting cell`, no consensus label is defined.
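
The rules above can be sketched as a single decision function. This is a hedged Python illustration, not the module's implementation (the module's own scripts are written in R). The function name, its arguments, and the descendant counts in the usage example are all hypothetical. Only the threshold of 170 and the term names come from the rules listed above.

```python
MAX_DESCENDANTS = 170
HEMATOPOIETIC = "hematopoietic precursor cell"
EXCLUDED_LCAS = {
    "bone cell",
    "lining cell",
    "blood cell",
    "progenitor cell",
    "supporting cell",
}


def consensus_label(lca_terms, descendant_counts, blueprint_annotation=None):
    """Return the consensus label for a pair of terms, or None if no label is set.

    lca_terms: LCA term names shared by the two annotations.
    descendant_counts: dict mapping a term name to its number of descendants.
    blueprint_annotation: original BlueprintEncodeData label, if known.
    """
    lcas = set(lca_terms)
    if not lcas:
        return None

    # More than one LCA: no label, unless one is the hematopoietic precursor.
    if len(lcas) > 1:
        return HEMATOPOIETIC if HEMATOPOIETIC in lcas else None

    (lca,) = lcas

    # Some LCAs never yield a consensus label.
    if lca in EXCLUDED_LCAS:
        return None

    # Very broad LCAs yield no label, apart from the two listed exceptions.
    if descendant_counts[lca] > MAX_DESCENDANTS:
        if lca == "neuron":
            return lca
        if lca == "epithelial cell" and blueprint_annotation == "Epithelial cells":
            return lca
        return None

    return lca


# ASSUMPTION: these descendant counts are made up for the example.
counts = {"T cell": 50, "neuron": 500, "epithelial cell": 400, "blood cell": 20}

t_cell = consensus_label(["T cell"], counts)
neuron = consensus_label(["neuron"], counts)
epithelial = consensus_label(["epithelial cell"], counts, "Epithelial cells")
too_broad = consensus_label(["epithelial cell"], counts, "Keratinocytes")
```

Returning `None` for "no consensus label" keeps the function easy to use as a filter: only combinations with a non-`None` result would be written to a reference table.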


## Usage

TBD
See the [`scripts/README.md`](./scripts/README.md) for instructions on running the scripts in this module.

## Input files

16 changes: 16 additions & 0 deletions analyses/cell-type-consensus/references/README.md
@@ -30,3 +30,19 @@ There were no terms that encompassed both other than `progenitor cell`.
Monocytes differentiate into mononuclear osteoclasts which are then activated and become multinucleated osteoclasts.
Because monocytes are the "precursor" to the differentiated osteoclast, we chose to use this term.
- `NA` was used for `Undefined placental cells` and `Transient cells` as no clear cell type from the cell ontology was identified.

2. `consensus-cell-type-reference.tsv`: This file contains a table with all cell type combinations between the `PanglaoDB` reference and `BlueprintEncodeData` reference for which a consensus cell type is identified.

The table includes the following columns:

| Column | Description |
| --- | --- |
| `panglao_ontology` | Cell type ontology term for `PanglaoDB` cell type |
| `panglao_annotation` | Original name for the cell type as set by `PanglaoDB` |
| `blueprint_ontology` | Cell type ontology term for `BlueprintEncodeData` cell type |
| `blueprint_annotation_main` | Original name for the cell type as set by `BlueprintEncodeData` (main term) |
| `blueprint_annotation_fine` | Original name for the cell type as set by `BlueprintEncodeData` (fine term) |
| `consensus_ontology` | Cell type ontology term for consensus cell type |
| `consensus_annotation` | Human-readable name for the consensus cell type |

This file was generated by running [`scripts/02-prepare-consensus-reference.R`](../scripts/02-prepare-consensus-reference.R).
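
As a rough illustration of how this table might be used, the Python sketch below builds a lookup keyed on the two ontology term columns. The helper names are hypothetical, the single data row uses placeholder identifiers rather than values from the real file, and the sketch assumes that each (`panglao_ontology`, `blueprint_ontology`) pair appears at most once.

```python
import csv
import io

# ASSUMPTION: placeholder row for illustration only; the identifiers and
# labels below are not taken from consensus-cell-type-reference.tsv.
SAMPLE_TSV = (
    "panglao_ontology\tpanglao_annotation\tblueprint_ontology\t"
    "blueprint_annotation_main\tblueprint_annotation_fine\t"
    "consensus_ontology\tconsensus_annotation\n"
    "CL:XXXXXX1\tpanglao label\tCL:XXXXXX2\tmain label\tfine label\t"
    "CL:XXXXXX3\tconsensus label\n"
)


def load_consensus_lookup(handle):
    """Map (panglao_ontology, blueprint_ontology) to the consensus term and name."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        (row["panglao_ontology"], row["blueprint_ontology"]): (
            row["consensus_ontology"],
            row["consensus_annotation"],
        )
        for row in reader
    }


def lookup_consensus(table, panglao_ontology, blueprint_ontology):
    """Return (consensus_ontology, consensus_annotation), or None if the pair is absent."""
    return table.get((panglao_ontology, blueprint_ontology))


table = load_consensus_lookup(io.StringIO(SAMPLE_TSV))
found = lookup_consensus(table, "CL:XXXXXX1", "CL:XXXXXX2")
missing = lookup_consensus(table, "CL:XXXXXX1", "CL:XXXXXX9")
```

Because the table only lists combinations with an identified consensus cell type, a missing pair can be read as "no consensus label".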