diff --git a/analyses/cell-type-consensus/exploratory-notebooks/01-reference-exploration.Rmd b/analyses/cell-type-consensus/exploratory-notebooks/01-reference-exploration.Rmd
index 6701c5077..9a06b85cb 100644
--- a/analyses/cell-type-consensus/exploratory-notebooks/01-reference-exploration.Rmd
+++ b/analyses/cell-type-consensus/exploratory-notebooks/01-reference-exploration.Rmd
@@ -451,7 +451,7 @@ print_df |>
 ```
 
 The PanglaoDB cell types seem to be more specific than the ones present in Blueprint Encode, similar to the observation with neurons.
-We should keep epithelial cell.
+We should keep epithelial cell in the cases where the Blueprint Encode annotation is `Epithelial cells` but not when it is `Keratinocytes`.
 
 ### Removing anything with more than 1 LCA
 
@@ -605,7 +605,8 @@ I would use the following criteria to come up with my whitelist:
 
 - Pairs should not have more than 1 LCA, with the exception of the matches that have the label hematopoietic precursor cell.
 - The LCA should have equal to or less than 170 total descendants.
-- We whould include the term for `neuron` and `epithelial cell` even though they do not pass the threshold for number of descendants.
+- We should include the term for `neuron` and `epithelial cell` even though they do not pass the threshold for number of descendants.
+However, `epithelial cell` should only be included if the Blueprint Encode name is `Epithelial cells` and _not_ `Keratinocytes`.
 - Terms that are too broad should be removed.
 This includes: `lining cell`, `blood cell`, `progenitor cell`, `bone cell`, and `supporting cell`
 
diff --git a/analyses/cell-type-consensus/exploratory-notebooks/01-reference-exploration.html b/analyses/cell-type-consensus/exploratory-notebooks/01-reference-exploration.html
index cd0998b3e..e2cae57ad 100644
--- a/analyses/cell-type-consensus/exploratory-notebooks/01-reference-exploration.html
+++ b/analyses/cell-type-consensus/exploratory-notebooks/01-reference-exploration.html
@@ -11,7 +11,7 @@
-
+
## Rows: 178 Columns: 3
-## ── Column specification ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+## ── Column specification ──────────────────────────────────────────────────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (3): ontology_id, human_readable_value, panglao_cell_type
##
@@ -693,8 +693,8 @@ Latest common ancestor (LCA) between PanglaoDB and Blueprint
dplyr::mutate(lca = dplyr::if_else(blueprint_ontology == panglao_ontology, blueprint_ontology, lca)) |>
# join in information for each of the lca terms including name, number of ancestors and descendants
dplyr::left_join(cl_df, by = c("lca" = "cl_ontology"))
-## Warning: Expected 3 pieces. Missing pieces filled with `NA` in 7967 rows [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
-## 20, ...].
+## Warning: Expected 3 pieces. Missing pieces filled with `NA` in 7967 rows [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
+## 18, 19, 20, ...].
ggplot(lca_df, aes(x = total_ancestors)) +
@@ -1733,7 +1733,7 @@ Myeloid leukocyte
then I would argue we shouldn’t keep myeloid leukocyte. Noting that
after discussion we have decided to keep this one since T and B cells
are much easier to differentiate based on gene expression alone than
-cells that are party of the myeloid lineage.
+cells that are part of the myeloid lineage.
The PanglaoDB cell types seem to be more specific than the ones present in Blueprint Encode, similar to the observation with neurons. We
-should keep epithelial cell.
+should keep epithelial cell in the cases where the Blueprint Encode
+annotation is Epithelial cells but not when it is
+Keratinocytes.
## [1] "blood cell" "hematopoietic precursor cell" "lining cell" "perivascular cell"
-## [5] "supporting cell"
+## [1] "blood cell" "hematopoietic precursor cell" "lining cell"
+## [4] "perivascular cell" "supporting cell"
It looks like I am losing a few terms I already said were not specific and then a few other terms, like “hematopoietic precursor cell” and “perivascular cell”. I’ll look at both of those to confirm we would
@@ -3909,9 +3911,12 @@
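-We whould include the term for neuron and
+We should include the term for neuron and
epithelial cell even though they do not pass the threshold
-for number of descendants.
+for number of descendants. However, epithelial cell should
+only be included if the Blueprint Encode name is
+Epithelial cells and not
+Keratinocytes.
Terms that are too broad should be removed. This includes: lining cell, blood cell,
progenitor cell, bone cell, and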