diff --git a/analyses/cell-type-consensus/README.md b/analyses/cell-type-consensus/README.md
index 7b3ec3788..3346815b9 100644
--- a/analyses/cell-type-consensus/README.md
+++ b/analyses/cell-type-consensus/README.md
@@ -5,11 +5,30 @@ Specifically, the cell type annotations obtained from both `SingleR` and `CellAs
 
 ## Description
 
-TBD
+The goal of this module is to create a reference that can be used to define an ontology aware consensus cell type label for all cells across all ScPCA samples.
+This module performs a series of steps to accomplish that goal:
+
+1. The cell type annotations present in the `PanglaoDB` reference file were assigned to an ontology term identifier, when possible.
+See [`references/README.md`](./references/README.md) for a full description on how we completed assignments.
+2. We looked at all possible combinations of cell type labels between the `PanglaoDB` reference (used with `CellAssign`) and the `BlueprintEncodeData` reference (used with `SingleR`).
+We then explored using a set of rules used to define consensus cell types in [`exploratory-notebooks/01-reference-exploration.Rmd`](./exploratory-notebooks/01-reference-exploration.Rmd).
+3. We created a [reference table](./references/consensus-cell-type-reference.tsv) containing all combinations for which we were able to identify a consensus cell type label.
+The consensus cell type label corresponds to the [latest common ancestor (LCA)](https://rdrr.io/bioc/ontoProc/man/findCommonAncestors.html) between the `PanglaoDB` and `BlueprintEncodeData` terms.
+
+When creating the consensus cell type labels we implemented the following rules:
+
+- If the terms share more than 1 LCA, no consensus label is set.
+The only exception is if one of the LCA terms corresponds to `hematopoietic precursor cells`.
+If that is the case all other LCA terms are removed and `hematopoietic precursor cell` is used as the consensus label.
+- If the LCA has greater than 170 descendants, no consensus label is set, with some exceptions:
+  - When the LCA is `neuron`, `neuron` is used as the consensus label.
+  - When the LCA is `epithelial cell` and the annotation from `BlueprintEncodeData` is `Epithelial cells`, then `epithelial cell` is used as the consensus label.
+  - If the LCA is `bone cell`, `lining cell`, `blood cell`, `progenitor cell`, or `supporting cell`, no consensus label is defined.
+
 
 ## Usage
 
-TBD
+See the [`scripts/README.md`](./scripts/README.md) for instructions on running the scripts in this module.
 
 ## Input files
 
diff --git a/analyses/cell-type-consensus/references/README.md b/analyses/cell-type-consensus/references/README.md
index ec7e2495f..2a494a9f7 100644
--- a/analyses/cell-type-consensus/references/README.md
+++ b/analyses/cell-type-consensus/references/README.md
@@ -30,3 +30,19 @@ There were no terms that encompassed both other than `progenitor cell`.
 Monocytes differentiate into mononuclear osteoclasts which are then activated and become multinucleated osteoclasts.
 Because monocytes are the "precursor" to the differentiated osteoclast, we chose to use this term.
 - `NA` was used for `Undefined placental cells` and `Transient cells` as no clear cell type from the cell ontology was identified.
+
+2. `consensus-cell-type-reference.tsv`: This file contains a table with all cell type combinations between the `PanglaoDB` reference and `BlueprintEncodeData` reference for which a consensus cell type is identified.
+
+The table includes the following columns:
+
+| | |
+| --- | --- |
+| `panglao_ontology` | Cell type ontology term for `PanglaoDB` cell type |
+| `panglao_annotation` | Original name for the cell type as set by `PanglaoDB` |
+| `blueprint_ontology` | Cell type ontology term for `BlueprintEncodeData` cell type |
+| `blueprint_annotation_main` | Original name for the cell type as set by `BlueprintEncodeData` (main term) |
+| `blueprint_annotation_fine` | Original name for the cell type as set by `BlueprintEncodeData` (fine term) |
+| `consensus_ontology` | Cell type ontology term for consensus cell type |
+| `consensus_annotation` | Human readable name for the consensus cell type |
+
+This file was generated by running [`scripts/02-prepare-consensus-reference.R`](../scripts/02-prepare-consensus-reference.R).
diff --git a/analyses/cell-type-consensus/references/consensus-cell-type-reference.tsv b/analyses/cell-type-consensus/references/consensus-cell-type-reference.tsv
new file mode 100644
index 000000000..18003c3d8
--- /dev/null
+++ b/analyses/cell-type-consensus/references/consensus-cell-type-reference.tsv
@@ -0,0 +1,302 @@
+panglao_ontology panglao_annotation blueprint_ontology blueprint_annotation_main blueprint_annotation_fine consensus_ontology consensus_annotation
+CL:0000583 alveolar macrophage CL:0000775 Neutrophils Neutrophils CL:0000766 myeloid leukocyte
+CL:0000767 basophil CL:0000775 Neutrophils Neutrophils CL:0000094 granulocyte
+CL:0000771 eosinophil CL:0000775 Neutrophils Neutrophils CL:0000094 granulocyte
+CL:0000235 macrophage CL:0000775 Neutrophils Neutrophils CL:0000766 myeloid leukocyte
+CL:0000097 mast cell CL:0000775 Neutrophils Neutrophils CL:0000766 myeloid leukocyte
+CL:0000576 monocyte CL:0000775 Neutrophils Neutrophils CL:0000766 myeloid leukocyte
+CL:0000775 neutrophil CL:0000775 Neutrophils Neutrophils CL:0000775 neutrophil
+CL:0000092 osteoclast CL:0000775 Neutrophils Neutrophils CL:0000766 myeloid leukocyte
+CL:0000091 Kupffer cell
CL:0000775 Neutrophils Neutrophils CL:0000766 myeloid leukocyte +CL:0000453 Langerhans cell CL:0000775 Neutrophils Neutrophils CL:0000766 myeloid leukocyte +CL:0000129 microglial cell CL:0000775 Neutrophils Neutrophils CL:0000766 myeloid leukocyte +CL:0000889 myeloid suppressor cell CL:0000775 Neutrophils Neutrophils CL:0000766 myeloid leukocyte +CL:0000874 splenic red pulp macrophage CL:0000775 Neutrophils Neutrophils CL:0000766 myeloid leukocyte +CL:0000767 basophil CL:0000576 Monocytes Monocytes CL:0000766 myeloid leukocyte +CL:0000451 dendritic cell CL:0000576 Monocytes Monocytes CL:0000113 mononuclear phagocyte +CL:0000771 eosinophil CL:0000576 Monocytes Monocytes CL:0000766 myeloid leukocyte +CL:0000097 mast cell CL:0000576 Monocytes Monocytes CL:0000766 myeloid leukocyte +CL:0000576 monocyte CL:0000576 Monocytes Monocytes CL:0000576 monocyte +CL:0000775 neutrophil CL:0000576 Monocytes Monocytes CL:0000766 myeloid leukocyte +CL:0000784 plasmacytoid dendritic cell CL:0000576 Monocytes Monocytes CL:0000113 mononuclear phagocyte +CL:0000889 myeloid suppressor cell CL:0000576 Monocytes Monocytes CL:0000766 myeloid leukocyte +CL:0000037 hematopoietic stem cell CL:0000050 HSC MEP CL:0008001 hematopoietic precursor cell +CL:0000038 erythroid progenitor cell CL:0000050 HSC MEP CL:0008001 hematopoietic precursor cell +CL:0000084 T cell CL:0000624 CD4+ T-cells CD4+ T-cells CL:0000084 T cell +CL:0002038 T follicular helper cell CL:0000624 CD4+ T-cells CD4+ T-cells CL:0000624 CD4-positive, alpha-beta T cell +CL:0000893 thymocyte CL:0000624 CD4+ T-cells CD4+ T-cells CL:0000084 T cell +CL:0000798 gamma-delta T cell CL:0000624 CD4+ T-cells CD4+ T-cells CL:0000084 T cell +CL:0000814 mature NK T cell CL:0000624 CD4+ T-cells CD4+ T-cells CL:0000791 mature alpha-beta T cell +CL:0000898 naive T cell CL:0000624 CD4+ T-cells CD4+ T-cells CL:0002419 mature T cell +CL:0000910 cytotoxic T cell CL:0000624 CD4+ T-cells CD4+ T-cells CL:0002419 mature T cell +CL:0000912 helper T cell 
CL:0000624 CD4+ T-cells CD4+ T-cells CL:0002419 mature T cell +CL:0000813 memory T cell CL:0000624 CD4+ T-cells CD4+ T-cells CL:0002419 mature T cell +CL:0000815 regulatory T cell CL:0000624 CD4+ T-cells CD4+ T-cells CL:0002419 mature T cell +CL:0000084 T cell CL:0000815 CD4+ T-cells Tregs CL:0000084 T cell +CL:0002038 T follicular helper cell CL:0000815 CD4+ T-cells Tregs CL:0002419 mature T cell +CL:0000893 thymocyte CL:0000815 CD4+ T-cells Tregs CL:0000084 T cell +CL:0000798 gamma-delta T cell CL:0000815 CD4+ T-cells Tregs CL:0000084 T cell +CL:0000814 mature NK T cell CL:0000815 CD4+ T-cells Tregs CL:0002419 mature T cell +CL:0000898 naive T cell CL:0000815 CD4+ T-cells Tregs CL:0002419 mature T cell +CL:0000910 cytotoxic T cell CL:0000815 CD4+ T-cells Tregs CL:0002419 mature T cell +CL:0000912 helper T cell CL:0000815 CD4+ T-cells Tregs CL:0002419 mature T cell +CL:0000813 memory T cell CL:0000815 CD4+ T-cells Tregs CL:0002419 mature T cell +CL:0000815 regulatory T cell CL:0000815 CD4+ T-cells Tregs CL:0000815 regulatory T cell +CL:0000084 T cell CL:0000904 CD4+ T-cells CD4+ Tcm CL:0000084 T cell +CL:0002038 T follicular helper cell CL:0000904 CD4+ T-cells CD4+ Tcm CL:0000624 CD4-positive, alpha-beta T cell +CL:0000893 thymocyte CL:0000904 CD4+ T-cells CD4+ Tcm CL:0000084 T cell +CL:0000798 gamma-delta T cell CL:0000904 CD4+ T-cells CD4+ Tcm CL:0000084 T cell +CL:0000814 mature NK T cell CL:0000904 CD4+ T-cells CD4+ Tcm CL:0000791 mature alpha-beta T cell +CL:0000898 naive T cell CL:0000904 CD4+ T-cells CD4+ Tcm CL:0002419 mature T cell +CL:0000910 cytotoxic T cell CL:0000904 CD4+ T-cells CD4+ Tcm CL:0002419 mature T cell +CL:0000912 helper T cell CL:0000904 CD4+ T-cells CD4+ Tcm CL:0002419 mature T cell +CL:0000813 memory T cell CL:0000904 CD4+ T-cells CD4+ Tcm CL:0000813 memory T cell +CL:0000815 regulatory T cell CL:0000904 CD4+ T-cells CD4+ Tcm CL:0002419 mature T cell +CL:0000084 T cell CL:0000905 CD4+ T-cells CD4+ Tem CL:0000084 T cell +CL:0002038 T 
follicular helper cell CL:0000905 CD4+ T-cells CD4+ Tem CL:0000624 CD4-positive, alpha-beta T cell +CL:0000893 thymocyte CL:0000905 CD4+ T-cells CD4+ Tem CL:0000084 T cell +CL:0000798 gamma-delta T cell CL:0000905 CD4+ T-cells CD4+ Tem CL:0000084 T cell +CL:0000814 mature NK T cell CL:0000905 CD4+ T-cells CD4+ Tem CL:0000791 mature alpha-beta T cell +CL:0000898 naive T cell CL:0000905 CD4+ T-cells CD4+ Tem CL:0002419 mature T cell +CL:0000910 cytotoxic T cell CL:0000905 CD4+ T-cells CD4+ Tem CL:0002419 mature T cell +CL:0000912 helper T cell CL:0000905 CD4+ T-cells CD4+ Tem CL:0002419 mature T cell +CL:0000813 memory T cell CL:0000905 CD4+ T-cells CD4+ Tem CL:0000813 memory T cell +CL:0000815 regulatory T cell CL:0000905 CD4+ T-cells CD4+ Tem CL:0002419 mature T cell +CL:0000084 T cell CL:0000907 CD8+ T-cells CD8+ Tcm CL:0000084 T cell +CL:0002038 T follicular helper cell CL:0000907 CD8+ T-cells CD8+ Tcm CL:0000791 mature alpha-beta T cell +CL:0000893 thymocyte CL:0000907 CD8+ T-cells CD8+ Tcm CL:0000084 T cell +CL:0000798 gamma-delta T cell CL:0000907 CD8+ T-cells CD8+ Tcm CL:0000084 T cell +CL:0000814 mature NK T cell CL:0000907 CD8+ T-cells CD8+ Tcm CL:0000791 mature alpha-beta T cell +CL:0000898 naive T cell CL:0000907 CD8+ T-cells CD8+ Tcm CL:0002419 mature T cell +CL:0000910 cytotoxic T cell CL:0000907 CD8+ T-cells CD8+ Tcm CL:0002419 mature T cell +CL:0000912 helper T cell CL:0000907 CD8+ T-cells CD8+ Tcm CL:0002419 mature T cell +CL:0000813 memory T cell CL:0000907 CD8+ T-cells CD8+ Tcm CL:0000813 memory T cell +CL:0000815 regulatory T cell CL:0000907 CD8+ T-cells CD8+ Tcm CL:0002419 mature T cell +CL:0000084 T cell CL:0000913 CD8+ T-cells CD8+ Tem CL:0000084 T cell +CL:0002038 T follicular helper cell CL:0000913 CD8+ T-cells CD8+ Tem CL:0000791 mature alpha-beta T cell +CL:0000893 thymocyte CL:0000913 CD8+ T-cells CD8+ Tem CL:0000084 T cell +CL:0000798 gamma-delta T cell CL:0000913 CD8+ T-cells CD8+ Tem CL:0000084 T cell +CL:0000814 mature NK T cell 
CL:0000913 CD8+ T-cells CD8+ Tem CL:0000791 mature alpha-beta T cell +CL:0000898 naive T cell CL:0000913 CD8+ T-cells CD8+ Tem CL:0002419 mature T cell +CL:0000910 cytotoxic T cell CL:0000913 CD8+ T-cells CD8+ Tem CL:0002419 mature T cell +CL:0000912 helper T cell CL:0000913 CD8+ T-cells CD8+ Tem CL:0002419 mature T cell +CL:0000813 memory T cell CL:0000913 CD8+ T-cells CD8+ Tem CL:0000813 memory T cell +CL:0000815 regulatory T cell CL:0000913 CD8+ T-cells CD8+ Tem CL:0002419 mature T cell +CL:0000623 natural killer cell CL:0000623 NK cells NK cells CL:0000623 natural killer cell +CL:0001069 group 2 innate lymphoid cell CL:0000623 NK cells NK cells CL:0001065 innate lymphoid cell +CL:0000236 B cell CL:0000788 B-cells naive B-cells CL:0000236 B cell +CL:0000786 plasma cell CL:0000788 B-cells naive B-cells CL:0000945 lymphocyte of B lineage +CL:0000787 memory B cell CL:0000788 B-cells naive B-cells CL:0000785 mature B cell +CL:0000788 naive B cell CL:0000788 B-cells naive B-cells CL:0000788 naive B cell +CL:0000236 B cell CL:0000787 B-cells Memory B-cells CL:0000236 B cell +CL:0000786 plasma cell CL:0000787 B-cells Memory B-cells CL:0000945 lymphocyte of B lineage +CL:0000787 memory B cell CL:0000787 B-cells Memory B-cells CL:0000787 memory B cell +CL:0000788 naive B cell CL:0000787 B-cells Memory B-cells CL:0000785 mature B cell +CL:0000236 B cell CL:0000972 B-cells Class-switched memory B-cells CL:0000236 B cell +CL:0000786 plasma cell CL:0000972 B-cells Class-switched memory B-cells CL:0000945 lymphocyte of B lineage +CL:0000787 memory B cell CL:0000972 B-cells Class-switched memory B-cells CL:0000787 memory B cell +CL:0000788 naive B cell CL:0000972 B-cells Class-switched memory B-cells CL:0000785 mature B cell +CL:0000646 basal cell CL:0000037 HSC HSC CL:0000723 somatic stem cell +CL:0002322 embryonic stem cell CL:0000037 HSC HSC CL:0000034 stem cell +CL:0000352 epiblast cell CL:0000037 HSC HSC CL:0000723 somatic stem cell +CL:0000037 hematopoietic stem cell 
CL:0000037 HSC HSC CL:0000037 hematopoietic stem cell +CL:0005026 hepatoblast CL:0000037 HSC HSC CL:0000034 stem cell +CL:0002248 pluripotent stem cell CL:0000037 HSC HSC CL:0000723 somatic stem cell +CL:0002672 retinal progenitor cell CL:0000037 HSC HSC CL:0000034 stem cell +CL:0002664 cardioblast CL:0000037 HSC HSC CL:0000034 stem cell +CL:0002250 intestinal crypt stem cell CL:0000037 HSC HSC CL:0000034 stem cell +CL:0000038 erythroid progenitor cell CL:0000037 HSC HSC CL:0008001 hematopoietic precursor cell +CL:0000324 metanephric mesenchyme stem cell CL:0000037 HSC HSC CL:0000034 stem cell +CL:0000037 hematopoietic stem cell CL:0000837 HSC MPP CL:0008001 hematopoietic precursor cell +CL:0000038 erythroid progenitor cell CL:0000837 HSC MPP CL:0008001 hematopoietic precursor cell +CL:0000037 hematopoietic stem cell CL:0000051 HSC CLP CL:0008001 hematopoietic precursor cell +CL:0000038 erythroid progenitor cell CL:0000051 HSC CLP CL:0008001 hematopoietic precursor cell +CL:0000037 hematopoietic stem cell CL:0000557 HSC GMP CL:0008001 hematopoietic precursor cell +CL:0000038 erythroid progenitor cell CL:0000557 HSC GMP CL:0008001 hematopoietic precursor cell +CL:0000583 alveolar macrophage CL:0000235 Macrophages Macrophages CL:0000235 macrophage +CL:0000767 basophil CL:0000235 Macrophages Macrophages CL:0000766 myeloid leukocyte +CL:0000771 eosinophil CL:0000235 Macrophages Macrophages CL:0000766 myeloid leukocyte +CL:0000235 macrophage CL:0000235 Macrophages Macrophages CL:0000235 macrophage +CL:0000097 mast cell CL:0000235 Macrophages Macrophages CL:0000766 myeloid leukocyte +CL:0000775 neutrophil CL:0000235 Macrophages Macrophages CL:0000766 myeloid leukocyte +CL:0000091 Kupffer cell CL:0000235 Macrophages Macrophages CL:0000235 macrophage +CL:0000129 microglial cell CL:0000235 Macrophages Macrophages CL:0000235 macrophage +CL:0000889 myeloid suppressor cell CL:0000235 Macrophages Macrophages CL:0000766 myeloid leukocyte +CL:0000874 splenic red pulp macrophage 
CL:0000235 Macrophages Macrophages CL:0000235 macrophage +CL:0000084 T cell CL:0000625 CD8+ T-cells CD8+ T-cells CL:0000084 T cell +CL:0002038 T follicular helper cell CL:0000625 CD8+ T-cells CD8+ T-cells CL:0000791 mature alpha-beta T cell +CL:0000893 thymocyte CL:0000625 CD8+ T-cells CD8+ T-cells CL:0000084 T cell +CL:0000798 gamma-delta T cell CL:0000625 CD8+ T-cells CD8+ T-cells CL:0000084 T cell +CL:0000814 mature NK T cell CL:0000625 CD8+ T-cells CD8+ T-cells CL:0000791 mature alpha-beta T cell +CL:0000898 naive T cell CL:0000625 CD8+ T-cells CD8+ T-cells CL:0002419 mature T cell +CL:0000910 cytotoxic T cell CL:0000625 CD8+ T-cells CD8+ T-cells CL:0002419 mature T cell +CL:0000912 helper T cell CL:0000625 CD8+ T-cells CD8+ T-cells CL:0002419 mature T cell +CL:0000813 memory T cell CL:0000625 CD8+ T-cells CD8+ T-cells CL:0002419 mature T cell +CL:0000815 regulatory T cell CL:0000625 CD8+ T-cells CD8+ T-cells CL:0002419 mature T cell +CL:0000765 erythroblast CL:0000232 Erythrocytes Erythrocytes CL:0000764 erythroid lineage cell +CL:0000558 reticulocyte CL:0000232 Erythrocytes Erythrocytes CL:0000764 erythroid lineage cell +CL:0000038 erythroid progenitor cell CL:0000232 Erythrocytes Erythrocytes CL:0000764 erythroid lineage cell +CL:0000556 megakaryocyte CL:0000556 HSC Megakaryocytes CL:0000556 megakaryocyte +CL:0000037 hematopoietic stem cell CL:0000049 HSC CMP CL:0008001 hematopoietic precursor cell +CL:0000038 erythroid progenitor cell CL:0000049 HSC CMP CL:0008001 hematopoietic precursor cell +CL:0000583 alveolar macrophage CL:0000863 Macrophages Macrophages M1 CL:0000235 macrophage +CL:0000767 basophil CL:0000863 Macrophages Macrophages M1 CL:0000766 myeloid leukocyte +CL:0000771 eosinophil CL:0000863 Macrophages Macrophages M1 CL:0000766 myeloid leukocyte +CL:0000235 macrophage CL:0000863 Macrophages Macrophages M1 CL:0000235 macrophage +CL:0000097 mast cell CL:0000863 Macrophages Macrophages M1 CL:0000766 myeloid leukocyte +CL:0000775 neutrophil 
CL:0000863 Macrophages Macrophages M1 CL:0000766 myeloid leukocyte +CL:0000091 Kupffer cell CL:0000863 Macrophages Macrophages M1 CL:0000235 macrophage +CL:0000129 microglial cell CL:0000863 Macrophages Macrophages M1 CL:0000235 macrophage +CL:0000889 myeloid suppressor cell CL:0000863 Macrophages Macrophages M1 CL:0000766 myeloid leukocyte +CL:0000874 splenic red pulp macrophage CL:0000863 Macrophages Macrophages M1 CL:0000235 macrophage +CL:0000583 alveolar macrophage CL:0000890 Macrophages Macrophages M2 CL:0000235 macrophage +CL:0000767 basophil CL:0000890 Macrophages Macrophages M2 CL:0000766 myeloid leukocyte +CL:0000771 eosinophil CL:0000890 Macrophages Macrophages M2 CL:0000766 myeloid leukocyte +CL:0000235 macrophage CL:0000890 Macrophages Macrophages M2 CL:0000235 macrophage +CL:0000097 mast cell CL:0000890 Macrophages Macrophages M2 CL:0000766 myeloid leukocyte +CL:0000775 neutrophil CL:0000890 Macrophages Macrophages M2 CL:0000766 myeloid leukocyte +CL:0000091 Kupffer cell CL:0000890 Macrophages Macrophages M2 CL:0000235 macrophage +CL:0000129 microglial cell CL:0000890 Macrophages Macrophages M2 CL:0000235 macrophage +CL:0000889 myeloid suppressor cell CL:0000890 Macrophages Macrophages M2 CL:0000766 myeloid leukocyte +CL:0000874 splenic red pulp macrophage CL:0000890 Macrophages Macrophages M2 CL:0000235 macrophage +CL:0000115 endothelial cell CL:0000115 Endothelial cells Endothelial cells CL:0000115 endothelial cell +CL:0002544 aortic endothelial cell CL:0000115 Endothelial cells Endothelial cells CL:0000115 endothelial cell +CL:2000044 brain microvascular endothelial cell CL:0000115 Endothelial cells Endothelial cells CL:0000115 endothelial cell +CL:0000451 dendritic cell CL:0000451 DC DC CL:0000451 dendritic cell +CL:0000576 monocyte CL:0000451 DC DC CL:0000113 mononuclear phagocyte +CL:0000784 plasmacytoid dendritic cell CL:0000451 DC DC CL:0000451 dendritic cell +CL:0000453 Langerhans cell CL:0000451 DC DC CL:0000451 dendritic cell +CL:0000583 
alveolar macrophage CL:0000771 Eosinophils Eosinophils CL:0000766 myeloid leukocyte +CL:0000767 basophil CL:0000771 Eosinophils Eosinophils CL:0000094 granulocyte +CL:0000771 eosinophil CL:0000771 Eosinophils Eosinophils CL:0000771 eosinophil +CL:0000235 macrophage CL:0000771 Eosinophils Eosinophils CL:0000766 myeloid leukocyte +CL:0000097 mast cell CL:0000771 Eosinophils Eosinophils CL:0000766 myeloid leukocyte +CL:0000576 monocyte CL:0000771 Eosinophils Eosinophils CL:0000766 myeloid leukocyte +CL:0000775 neutrophil CL:0000771 Eosinophils Eosinophils CL:0000094 granulocyte +CL:0000092 osteoclast CL:0000771 Eosinophils Eosinophils CL:0000766 myeloid leukocyte +CL:0000091 Kupffer cell CL:0000771 Eosinophils Eosinophils CL:0000766 myeloid leukocyte +CL:0000453 Langerhans cell CL:0000771 Eosinophils Eosinophils CL:0000766 myeloid leukocyte +CL:0000129 microglial cell CL:0000771 Eosinophils Eosinophils CL:0000766 myeloid leukocyte +CL:0000889 myeloid suppressor cell CL:0000771 Eosinophils Eosinophils CL:0000766 myeloid leukocyte +CL:0000874 splenic red pulp macrophage CL:0000771 Eosinophils Eosinophils CL:0000766 myeloid leukocyte +CL:0000236 B cell CL:0000786 B-cells Plasma cells CL:0000945 lymphocyte of B lineage +CL:0000786 plasma cell CL:0000786 B-cells Plasma cells CL:0000786 plasma cell +CL:0000787 memory B cell CL:0000786 B-cells Plasma cells CL:0000945 lymphocyte of B lineage +CL:0000788 naive B cell CL:0000786 B-cells Plasma cells CL:0000945 lymphocyte of B lineage +CL:0000138 chondrocyte CL:0000138 Chondrocytes Chondrocytes CL:0000138 chondrocyte +CL:2000002 decidual cell CL:0000138 Chondrocytes Chondrocytes CL:0000499 stromal cell +CL:0000632 hepatic stellate cell CL:0000138 Chondrocytes Chondrocytes CL:0000327 extracellular matrix secreting cell +CL:0000499 stromal cell CL:0000138 Chondrocytes Chondrocytes CL:0000499 stromal cell +CL:0000708 leptomeningeal cell CL:0000138 Chondrocytes Chondrocytes CL:0000327 extracellular matrix secreting cell +CL:0000057 
fibroblast CL:0000057 Fibroblasts Fibroblasts CL:0000057 fibroblast +CL:0000632 hepatic stellate cell CL:0000057 Fibroblasts Fibroblasts CL:0000057 fibroblast +CL:0002410 pancreatic stellate cell CL:0000057 Fibroblasts Fibroblasts CL:0000057 fibroblast +CL:0002334 preadipocyte CL:0000057 Fibroblasts Fibroblasts CL:0000057 fibroblast +CL:0000192 smooth muscle cell CL:0000192 Smooth muscle Smooth muscle CL:0000192 smooth muscle cell +CL:0019019 tracheobronchial smooth muscle cell CL:0000192 Smooth muscle Smooth muscle CL:0000192 smooth muscle cell +CL:0000746 cardiac muscle cell CL:0000192 Smooth muscle Smooth muscle CL:0000187 muscle cell +CL:0000359 vascular associated smooth muscle cell CL:0000192 Smooth muscle Smooth muscle CL:0000192 smooth muscle cell +CL:0002068 Purkinje myocyte CL:0000192 Smooth muscle Smooth muscle CL:0000187 muscle cell +CL:0000622 acinar cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:1000488 cholangiocyte CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000166 chromaffin cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000584 enterocyte CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000164 enteroendocrine cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000065 ependymal cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000066 epithelial cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000160 goblet cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000501 granulosa cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000182 hepatocyte CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0005006 ionocyte CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000312 keratinocyte CL:0000066 Epithelial cells Epithelial 
cells CL:0000066 epithelial cell +CL:0000077 mesothelial cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000185 myoepithelial cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000165 neuroendocrine cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002167 olfactory epithelial cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000510 paneth cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000162 parietal cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002481 peritubular myoid cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000652 pinealocyte CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000653 podocyte CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000209 taste receptor cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000731 urothelial cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002368 respiratory epithelial cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002370 respiratory goblet cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000171 pancreatic A cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000169 type B pancreatic cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000706 choroid plexus epithelial cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000158 club cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002250 intestinal crypt stem cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000173 pancreatic D cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell 
+CL:0002305 epithelial cell of distal tubule CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002079 pancreatic ductal cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000504 enterochromaffin-like cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0005019 pancreatic epsilon cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002258 thyroid follicular cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002179 foveolar cell of stomach CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000696 PP cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000155 peptic cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002292 type I cell of carotid body CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0005010 renal intercalated cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:1000909 kidney loop of Henle epithelial cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002326 luminal epithelial cell of mammary gland CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002327 mammary gland epithelial cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000242 Merkel cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000682 M cell of gut CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002199 oxyphil cell of parathyroid gland CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000446 chief cell of parathyroid gland CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0005009 renal principal cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002306 epithelial cell of 
proximal tubule CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002062 pulmonary alveolar type 1 cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002063 pulmonary alveolar type 2 cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:1001596 salivary gland glandular cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002140 acinar cell of sebaceous gland CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000216 Sertoli cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002562 hair germinal matrix cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0002204 brush cell CL:0000066 Epithelial cells Epithelial cells CL:0000066 epithelial cell +CL:0000148 melanocyte CL:0000148 Melanocytes Melanocytes CL:0000148 melanocyte +CL:0000594 skeletal muscle satellite cell CL:0000188 Skeletal muscle Skeletal muscle CL:0000188 cell of skeletal muscle +CL:0000166 chromaffin cell CL:0000312 Keratinocytes Keratinocytes CL:0002077 ecto-epithelial cell +CL:0000065 ependymal cell CL:0000312 Keratinocytes Keratinocytes CL:0002077 ecto-epithelial cell +CL:0000312 keratinocyte CL:0000312 Keratinocytes Keratinocytes CL:0000312 keratinocyte +CL:0000077 mesothelial cell CL:0000312 Keratinocytes Keratinocytes CL:0000076 squamous epithelial cell +CL:0000165 neuroendocrine cell CL:0000312 Keratinocytes Keratinocytes CL:0002077 ecto-epithelial cell +CL:0002167 olfactory epithelial cell CL:0000312 Keratinocytes Keratinocytes CL:0002077 ecto-epithelial cell +CL:0002481 peritubular myoid cell CL:0000312 Keratinocytes Keratinocytes CL:0000076 squamous epithelial cell +CL:0000652 pinealocyte CL:0000312 Keratinocytes Keratinocytes CL:0002077 ecto-epithelial cell +CL:0000706 choroid plexus epithelial cell CL:0000312 Keratinocytes Keratinocytes CL:0002077 ecto-epithelial cell +CL:0002292 type I cell of 
carotid body CL:0000312 Keratinocytes Keratinocytes CL:0002077 ecto-epithelial cell +CL:0000242 Merkel cell CL:0000312 Keratinocytes Keratinocytes CL:0000312 keratinocyte +CL:0002062 pulmonary alveolar type 1 cell CL:0000312 Keratinocytes Keratinocytes CL:0000076 squamous epithelial cell +CL:0002063 pulmonary alveolar type 2 cell CL:0000312 Keratinocytes Keratinocytes CL:0000076 squamous epithelial cell +CL:0002140 acinar cell of sebaceous gland CL:0000312 Keratinocytes Keratinocytes CL:0000362 epidermal cell +CL:0000216 Sertoli cell CL:0000312 Keratinocytes Keratinocytes CL:0000076 squamous epithelial cell +CL:0002562 hair germinal matrix cell CL:0000312 Keratinocytes Keratinocytes CL:0000362 epidermal cell +CL:0000115 endothelial cell CL:2000008 Endothelial cells mv Endothelial cells CL:0000115 endothelial cell +CL:0002544 aortic endothelial cell CL:2000008 Endothelial cells mv Endothelial cells CL:0000071 blood vessel endothelial cell +CL:2000044 brain microvascular endothelial cell CL:2000008 Endothelial cells mv Endothelial cells CL:2000008 microvascular endothelial cell +CL:0000192 smooth muscle cell CL:0000187 Myocytes Myocytes CL:0000187 muscle cell +CL:0019019 tracheobronchial smooth muscle cell CL:0000187 Myocytes Myocytes CL:0000187 muscle cell +CL:0000746 cardiac muscle cell CL:0000187 Myocytes Myocytes CL:0000187 muscle cell +CL:0000359 vascular associated smooth muscle cell CL:0000187 Myocytes Myocytes CL:0000187 muscle cell +CL:0002068 Purkinje myocyte CL:0000187 Myocytes Myocytes CL:0000187 muscle cell +CL:0000136 adipocyte CL:0000136 Adipocytes Adipocytes CL:0000136 adipocyte +CL:0000650 mesangial cell CL:0000669 Pericytes Pericytes CL:0000669 pericyte +CL:0000669 pericyte CL:0000669 Pericytes Pericytes CL:0000669 pericyte +CL:0000127 astrocyte CL:0000127 Astrocytes Astrocytes CL:0000127 astrocyte +CL:0000065 ependymal cell CL:0000127 Astrocytes Astrocytes CL:0000125 glial cell +CL:0000128 oligodendrocyte CL:0000127 Astrocytes Astrocytes CL:0000126 
macroglial cell +CL:0002085 tanycyte CL:0000127 Astrocytes Astrocytes CL:0000127 astrocyte +CL:0000644 Bergmann glial cell CL:0000127 Astrocytes Astrocytes CL:0000127 astrocyte +CL:0000706 choroid plexus epithelial cell CL:0000127 Astrocytes Astrocytes CL:0000125 glial cell +CL:4040002 enteroglial cell CL:0000127 Astrocytes Astrocytes CL:0000125 glial cell +CL:4042021 neuronal-restricted precursor CL:0000127 Astrocytes Astrocytes CL:0000095 neuron associated cell +CL:0000242 Merkel cell CL:0000127 Astrocytes Astrocytes CL:0000095 neuron associated cell +CL:0000129 microglial cell CL:0000127 Astrocytes Astrocytes CL:0000125 glial cell +CL:0000636 Mueller cell CL:0000127 Astrocytes Astrocytes CL:0000125 glial cell +CL:0002453 oligodendrocyte precursor cell CL:0000127 Astrocytes Astrocytes CL:0000095 neuron associated cell +CL:0002573 Schwann cell CL:0000127 Astrocytes Astrocytes CL:0000125 glial cell +CL:0000681 radial glial cell CL:0000127 Astrocytes Astrocytes CL:0000125 glial cell +CL:0000516 perineural satellite cell CL:0000127 Astrocytes Astrocytes CL:0000095 neuron associated cell +CL:0000650 mesangial cell CL:0000650 Mesangial cells Mesangial cells CL:0000650 mesangial cell +CL:0000669 pericyte CL:0000650 Mesangial cells Mesangial cells CL:0000669 pericyte diff --git a/analyses/cell-type-consensus/references/panglao-cell-type-ontologies.tsv b/analyses/cell-type-consensus/references/panglao-cell-type-ontologies.tsv index e1c5665f9..4728238e8 100644 --- a/analyses/cell-type-consensus/references/panglao-cell-type-ontologies.tsv +++ b/analyses/cell-type-consensus/references/panglao-cell-type-ontologies.tsv @@ -137,7 +137,7 @@ CL:0000636 Mueller cell Müller cells CL:0000889 myeloid suppressor cell Myeloid-derived suppressor cells CL:0000746 cardiac muscle cell Myocytes CL:0000186 myofibroblast cell Myofibroblasts -CL:0000814 NK lymphocyte Natural killer T cells +CL:0000814 mature NK T cell Natural killer T cells CL:0000047 neural stem cell Neural stem/precursor 
cells
 CL:0000338	neuroblast (sensu Nematoda and Protostomia)	Neuroblasts
 CL:0000623	natural killer cell	NK cells
diff --git a/analyses/cell-type-consensus/scripts/02-prepare-consensus-reference.R b/analyses/cell-type-consensus/scripts/02-prepare-consensus-reference.R
new file mode 100644
index 000000000..4171ed363
--- /dev/null
+++ b/analyses/cell-type-consensus/scripts/02-prepare-consensus-reference.R
@@ -0,0 +1,127 @@
+#!/usr/bin/env Rscript
+
+# This script is used to create the reference table used for assigning consensus cell types
+# the table will contain one row for each cell type combination between panglao and celldex
+# where a consensus label was assigned
+
+# Paths ------------------------------------------------------------------------
+module_base <- rprojroot::find_root(rprojroot::is_renv_project)
+
+# cell ontology ref file
+panglao_ref_file <- file.path(module_base, "references", "panglao-cell-type-ontologies.tsv")
+
+# output ref file
+consensus_ref_file <- file.path(module_base, "references", "consensus-cell-type-reference.tsv")
+
+# Prep references --------------------------------------------------
+
+# grab obo file
+cl_ont <- ontologyIndex::get_ontology("http://purl.obolibrary.org/obo/cl/releases/2024-09-26/cl-basic.obo")
+
+# set up the graph to use for assigning LCA terms
+parent_terms <- cl_ont$parents
+cl_graph <- igraph::make_graph(rbind(unlist(parent_terms), rep(names(parent_terms), lengths(parent_terms))))
+
+# read in panglao file
+panglao_df <- readr::read_tsv(panglao_ref_file) |>
+  # rename columns to have panglao in them for easy joining later
+  dplyr::select(
+    panglao_ontology = "ontology_id",
+    panglao_annotation = "human_readable_value"
+  ) |>
+  # remove any cell types that don't have ontologies
+  tidyr::drop_na()
+
+# grab singler ref from celldex
+blueprint_ref <- celldex::BlueprintEncodeData()
+
+# get ontologies and human readable name into data frame
+blueprint_df <- data.frame(
+  blueprint_ontology = blueprint_ref$label.ont,
+  blueprint_annotation_main = blueprint_ref$label.main,
+  blueprint_annotation_fine = blueprint_ref$label.fine
+) |>
+  unique() |>
+  tidyr::drop_na()
+
+# Get LCA and descendants ------------------------------------------------------
+
+# get total descendants for each term in CL
+# turn cl_ont into data frame with one row per term
+cl_df <- data.frame(
+  cl_ontology = cl_ont$id,
+  cl_annotation = cl_ont$name
+) |>
+  dplyr::rowwise() |>
+  dplyr::mutate(
+    descendants = list(ontologyIndex::get_descendants(cl_ont, cl_ontology, exclude_roots = TRUE)),
+    total_descendants = length(descendants)
+  )
+
+# get a data frame with all combinations of panglao and blueprint terms
+# one row for each combination
+all_ref_df <- expand.grid(
+  panglao_df$panglao_ontology,
+  blueprint_df$blueprint_ontology
+) |>
+  dplyr::rename(
+    panglao_ontology = "Var1",
+    blueprint_ontology = "Var2"
+  ) |>
+  # add in the human readable values for each ontology term
+  # account for ontologies showing up multiple times
+  dplyr::left_join(blueprint_df, by = "blueprint_ontology", relationship = "many-to-many") |>
+  dplyr::left_join(panglao_df, by = "panglao_ontology", relationship = "many-to-many") |>
+  unique() # only keep unique combinations
+
+# add lca and total number of descendants to data frame
+# expand to have one row per unique combo + unique lca
+# later we will remove the extra lca assignments so that there is only one row per combination and one consensus label
+lca_df <- all_ref_df |>
+  dplyr::rowwise() |>
+  dplyr::mutate(
+    # latest common ancestor(s) shared by the two terms
+    lca = list(rownames(ontoProc::findCommonAncestors(blueprint_ontology, panglao_ontology, g = cl_graph)))
+  ) |>
+  dplyr::mutate(
+    total_lca = length(lca), # get total number for filtering later
+    lca = paste0(lca, collapse = ",") # make it easier to split the lca terms
+  ) |>
+  # split each lca term into its own column
+  tidyr::separate(lca, into = c("lca_1", "lca_2", "lca_3"), sep = ",") |>
+  # transpose so that instead of lca being in a column there is one row per lca
+  tidyr::pivot_longer(
+    cols = dplyr::starts_with("lca"),
+    names_to = "lca_number",
+    values_to = "lca"
+  ) |>
+  tidyr::drop_na() |>
+  dplyr::select(-lca_number) |> # don't need this column
+  # account for any cases where the ontology IDs are exact matches
+  # r complains about doing this earlier since the lca column holds lists until now
+  dplyr::mutate(lca = dplyr::if_else(blueprint_ontology == panglao_ontology, blueprint_ontology, lca)) |>
+  # join in information for each of the lca terms including name and number of descendants
+  dplyr::left_join(cl_df, by = c("lca" = "cl_ontology"))
+
+# Set consensus labels ---------------------------------------------------------
+
+# get a table with only combinations that will have an assigned consensus label
+consensus_labels_df <- lca_df |>
+  # everything with more than 1 lca gets removed with the exception of HSCs
+  dplyr::filter(total_lca <= 1 | cl_annotation == "hematopoietic precursor cell") |>
+  # keep terms with total descendants <= 170; also keep neuron, and epithelial cell when blueprint calls it as epithelial
+  dplyr::filter(total_descendants <= 170 | cl_annotation == "neuron" | (cl_annotation == "epithelial cell" & blueprint_annotation_main == "Epithelial cells")) |>
+  # get rid of terms that have low number of descendants but are still too broad
+  dplyr::filter(!(cl_annotation %in% c("bone cell", "lining cell", "blood cell", "progenitor cell", "supporting cell"))) |>
+  dplyr::select(
+    panglao_ontology,
+    panglao_annotation,
+    blueprint_ontology,
+    blueprint_annotation_main,
+    blueprint_annotation_fine,
+    consensus_ontology = lca,
+    consensus_annotation = cl_annotation
+  )
+
+# export table
+readr::write_tsv(consensus_labels_df, consensus_ref_file)
diff --git a/analyses/cell-type-consensus/scripts/README.md b/analyses/cell-type-consensus/scripts/README.md
index 594e00a7d..e61b7e5cc 100644
--- a/analyses/cell-type-consensus/scripts/README.md
+++
b/analyses/cell-type-consensus/scripts/README.md
@@ -6,6 +6,10 @@ This folder contains all scripts used for generating consensus cell types.
 This reference file was originally obtained from `PanglaoDB` and contains a table with all marker genes for all cell types that were used to build the references used when running `CellAssign`.
 The file will be stored in `references/PanglaoDB_markers_2020-03-27.tsv`.
 
-2. `01-prepare-cell-type-ontologies.sh`: This script is used to assign [cell type ontologies](https://www.ebi.ac.uk/ols4/ontologies/cl) to cell types in the `PanglaoDB` reference file.
+2. `01-prepare-cell-type-ontologies.R`: This script is used to assign [cell type ontologies](https://www.ebi.ac.uk/ols4/ontologies/cl) to cell types in the `PanglaoDB` reference file.
 Any cell types whose human readable label matches the value in the `cell type` column of the reference file (downloaded using the `00-download-panglao-ref.sh` file) are programmatically assigned.
 Ontology terms and labels along with the `cell type` label from the reference file are saved to a new file, `references/panglao-cell-type-ontologies.tsv`.
+
+3. `02-prepare-consensus-reference.R`: This script is used to create a table with all consensus cell types.
+The output table will contain one row for each combination of cell types in `PanglaoDB` and `BlueprintEncodeData` from `celldex` where a consensus cell type was identified.
+If a combination is not included in the reference file, then no consensus cell type is assigned and the label can be set to "Unknown".
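
As a rough illustration of how the reference table might be consumed downstream, the sketch below joins per-cell annotations to the consensus table and falls back to "Unknown" when a combination is absent. It is not part of this change: `cell_df` and its `barcode` column are hypothetical, and the two toy reference rows are copied from the table above rather than read from `references/consensus-cell-type-reference.tsv`.

```r
# Toy subset of the consensus reference; column names match the table written
# by 02-prepare-consensus-reference.R, but only two rows are included here.
consensus_ref <- data.frame(
  panglao_ontology = c("CL:0000136", "CL:0000669"),
  blueprint_ontology = c("CL:0000136", "CL:0000669"),
  consensus_ontology = c("CL:0000136", "CL:0000669"),
  consensus_annotation = c("adipocyte", "pericyte")
)

# Hypothetical per-cell annotations (one row per cell); not produced by this module.
cell_df <- data.frame(
  barcode = c("cell_1", "cell_2", "cell_3"),
  panglao_ontology = c("CL:0000136", "CL:0000669", "CL:0000127"),
  blueprint_ontology = c("CL:0000136", "CL:0000669", "CL:0000136")
)

labeled_df <- cell_df |>
  # combinations missing from the reference get NA after the join
  dplyr::left_join(consensus_ref, by = c("panglao_ontology", "blueprint_ontology")) |>
  # treat a missing consensus label as "Unknown"
  dplyr::mutate(consensus_annotation = dplyr::coalesce(consensus_annotation, "Unknown"))
```

A left join keeps every cell, so cells whose label pair has no consensus entry stay in the output and can be relabeled afterwards.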