Using multiplexed imaging, TYPEx detects protein expression on single cells, annotates cell types automatically based on user-provided definitions and quantifies cell densities per tissue area. It can be customised with input parameters and configuration files, allowing it to perform an end-to-end cell phenotyping analysis without the need for manual adjustments.
First, clone the TYPEx or the TRACERx-PHLEX repository:
git clone git@github.com:FrancisCrickInstitute/TRACERx-PHLEX.git
git clone git@github.com:FrancisCrickInstitute/TYPEx.git
release=PHLEX_test
nextflow run TRACERx-PHLEX/TYPEx/main.nf \
-c $PWD/TRACERx-PHLEX/TYPEx/conf/testdata.config \
--input_dir $PWD/results/deep-imcyto/$release/ \
--sample_file $PWD/TRACERx-PHLEX/TYPEx/data/sample_file.tracerx.txt \
--release $release \
--outDir "$PWD/results/TYPEx/$release/" \
--params_config "$PWD/TRACERx-PHLEX/TYPEx/data/typing_params.json" \
--annotation_config "$PWD/TRACERx-PHLEX/TYPEx/data/cell_type_annotation.json" \
--tissue_seg_model "$PWD/TRACERx-PHLEX/TYPEx/models/tumour_stroma_classifier.ilp" \
--color_config $PWD/TRACERx-PHLEX/TYPEx/data/celltype_colors.json \
--deep_imcyto true --cellprofiler true \
-profile singularity \
-resume
release=TYPEx_test
nextflow run TYPEx/main.nf \
-c $PWD/TYPEx/test.config \
-c TYPEx/testdata.config \
--input_dir $PWD/results/ \
--release $release \
--input_table $PWD/TYPEx/data/cell_objects.tracerx.txt \
--sample_file $PWD/TYPEx/data/sample_file.tracerx.txt \
--outDir "$PWD/results/TYPEx/$release/" \
--params_config "$PWD/TYPEx/data/typing_params.json" \
--annotation_config "$PWD/TYPEx/data/cell_type_annotation.json" \
--color_config $PWD/TYPEx/data/celltype_colors.json \
-profile singularity \
-resume
Running locally without a high-performance computing server
release=TYPEx_test
nextflow run TYPEx/main.nf \
-c $PWD/TYPEx/conf/testdata.config \
-c TYPEx/testdata.config \
--input_dir $PWD/results/ \
--release $release \
--input_table $PWD/TYPEx/data/cell_objects.tracerx.txt \
--sample_file $PWD/TYPEx/data/sample_file.tracerx.txt \
--outDir "$PWD/results/TYPEx/$release/" \
--params_config "$PWD/TYPEx/conf/typing_params.json" \
--annotation_config "$PWD/TYPEx/data/cell_type_annotation.json" \
--color_config $PWD/TYPEx/data/celltype_colors.json \
-profile docker \
-resume
Required Inputs
- cell_type_annotation.json
A file with cell definitions specific to the user’s antibody panel (see :ref:`Cell type definitions`). Specified with the --annotationConfig parameter.
- sample_data.tracerx.txt
A tab-delimited file with information for all images (see :ref:`Sample annotation table`). Specified with the --sampleFile parameter.
- inDir for deep-imcyto input, or inputTable for runs independent of deep-imcyto
The directory is specified with the --inDir parameter and the input file with the --inputTable parameter. The input table is a tab-delimited file with marker intensities and cell coordinates per cell object (see :ref:`Input table`).
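Before launching a run, it can help to confirm that the required inputs listed above exist. The following is a minimal sketch in Python and is not part of TYPEx. The function name `missing_inputs` is hypothetical, and the example paths simply mirror the test-data layout used in the commands above.

```python
from pathlib import Path


def missing_inputs(paths):
    """Return the subset of `paths` that does not exist on disk, in input order."""
    return [str(p) for p in map(Path, paths) if not p.exists()]


if __name__ == "__main__":
    # Hypothetical locations; adjust to where the repository was cloned.
    required = [
        "TYPEx/data/cell_type_annotation.json",
        "TYPEx/data/sample_file.tracerx.txt",
        "TYPEx/data/cell_objects.tracerx.txt",
    ]
    for path in missing_inputs(required):
        print(f"missing required input: {path}")
```

An empty result means every listed file or directory was found. The check says nothing about whether the contents are valid.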
Optional Inputs
- typing_params.json
A config file with information on the cell typing workflow (see :ref:`Typing parameters config`). Specified with the --paramsConfig parameter.
- tissue_segmentation.json
A file with information on tissue categories/annotations that can be overlaid onto each cell object along with the cell type information. In the case of tissue compartments, e.g. Tumour and Stroma, a summary table is also generated with quantifications per compartment. Specified with the --overlayConfigFile parameter.
- celltype_colors.json
Color settings for the user-defined cell types. Specified with the --colorConfig parameter.
- release
A unique identifier for the run [default: PHLEX_test]
- panel
A unique identifier for the panel [default: p1]
- study
A unique identifier for the study [default: tracerx]
Several input parameters can be used to define the typing workflow:
- deep-imcyto
run the TYPEx multi-tiered approach [default: true]
- cellprofiler
run TYPEx on deep-imcyto in MCCS mode when true and simple segmentation mode when false [default: true]
- tiered
run the TYPEx multi-tiered approach [default: true]
- stratify_by_confidence
include the stratification by low and high confidence when true [default: true]
- sampled
run TYPEx on subsampled data with three iterations when true [default: false]
- clustered
perform clustering without any stratification [default: false]
The following parameters refer to the typing approach:
- subtype_method
the clustering approach to be used in the last stratification step [default: FastPG]
- major_markers
the label of the major cell type definitions in cell_type_annotation.json
[default: major_markers]
- subtype_markers
the label of the cell subtype definitions in cell_type_annotation.json
[default: subtype_markers]
- mostFreqCellType
the most frequent cell type in the cohort, if known, named as in cell_type_annotation.json
[default: None]
Note
The most frequent cell type is used to build the reference model by excluding this cell type. When it is not provided, the complete model will be built first, followed by the reference model. If it is provided, both models are built in parallel. Parallel execution can reduce the run time, as these are the most time-consuming processes.
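The workflow parameters described above can be collected in one place and turned into a `nextflow run` command. The sketch below is a hedged illustration in Python. It assumes every parameter is passed as `--name value`, which the example commands show only for `--deep_imcyto` and `--cellprofiler`. The helper name `build_typex_command` is hypothetical.

```python
import shlex


def build_typex_command(main_nf, params, profile="singularity"):
    """Assemble a `nextflow run` argument list from a dict of pipeline parameters."""
    args = ["nextflow", "run", main_nf]
    for name, value in params.items():
        if value is None:
            continue  # leave unset parameters to the pipeline defaults
        if isinstance(value, bool):
            value = "true" if value else "false"
        args += [f"--{name}", str(value)]
    args += ["-profile", profile, "-resume"]
    return args


if __name__ == "__main__":
    cmd = build_typex_command(
        "TYPEx/main.nf",
        {
            "release": "TYPEx_test",
            "deep_imcyto": True,
            "cellprofiler": True,
            "mostFreqCellType": None,  # not known, so omitted from the command
        },
        profile="docker",
    )
    # Print a shell-quoted command line for inspection before running it.
    print(shlex.join(cmd))
```

Returning a list rather than a single string keeps paths with spaces intact if the command is later passed to `subprocess.run`.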
The cell-type definitions file cell_type_annotation.json includes a list of cell lineages and the corresponding marker proteins that together can be used to identify a cell lineage. When designing this file, it is important to ensure that each cell in the cohort can be covered by these definitions. Some markers, such as CD45 and Vimentin, are expressed by multiple cell lineages. These shared proteins are used to infer a hierarchy of cell lineages, which is later considered for cell stratification and annotation. For the TRACERx analyses, for example, we defined 13 major cell types targeted by our two antibody panels.
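To illustrate how shared markers can imply a hierarchy, the sketch below treats one lineage as an ancestor of another when its marker set is a proper subset of the other's. This subset rule is an assumption made for illustration and is not necessarily how TYPEx infers the hierarchy. The dictionary layout, the lineage names, and the markers other than CD45 are also hypothetical and do not reproduce the cell_type_annotation.json schema.

```python
def infer_parents(definitions):
    """Map each lineage to its closest ancestor, or None if it has none.

    Lineage A counts as an ancestor of B when A's markers are a proper
    subset of B's markers (an illustrative assumption).
    """
    parents = {}
    for child, child_markers in definitions.items():
        ancestors = [
            name
            for name, markers in definitions.items()
            if set(markers) < set(child_markers)
        ]
        # The closest ancestor is the one sharing the most markers with the child.
        parents[child] = max(
            ancestors, key=lambda name: len(definitions[name]), default=None
        )
    return parents


# Hypothetical definitions: lineage name -> markers that identify it.
definitions = {
    "Leukocytes": ["CD45"],
    "T cells": ["CD45", "CD3"],
    "CD8 T cells": ["CD45", "CD3", "CD8a"],
    "Epithelial cells": ["panCK"],
}

for lineage, parent in infer_parents(definitions).items():
    print(f"{lineage}: parent = {parent}")
```

In this toy example CD45 is shared by three lineages, so "Leukocytes" becomes the root of the immune branch, while "Epithelial cells" shares no markers and stays at the top level.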
The input matrix has values that summarise the intensity of a protein per cell object, such as mean intensity, independently of the imaging modality or antibody tagging technique.
ObjectNumber | imagename | X | Y | Area | <Marker 1> | ... | <Marker N> |
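A quick header check can catch a malformed input table before a run. The sketch below is in Python and assumes, based on the layout above, that the five leading columns are mandatory and that every other column is a marker. The function name `marker_columns` and the example values are hypothetical.

```python
import csv
import io

# Non-marker columns, taken from the table layout shown above.
REQUIRED_COLUMNS = ["ObjectNumber", "imagename", "X", "Y", "Area"]


def marker_columns(table_file):
    """Check the required columns are present and return the remaining (marker) columns."""
    reader = csv.reader(table_file, delimiter="\t")
    header = next(reader)
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"input table is missing columns: {missing}")
    return [c for c in header if c not in REQUIRED_COLUMNS]


# A one-row example table with two made-up marker columns.
example = io.StringIO(
    "ObjectNumber\timagename\tX\tY\tArea\tCD45\tpanCK\n"
    "1\timage_01\t10.5\t20.0\t35\t0.82\t0.01\n"
)
print(marker_columns(example))  # ['CD45', 'panCK']
```

For a file on disk, pass an open handle instead, for example `open(path, newline="")`.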
typing_params.json contains the settings for the clustering approaches to be used, normalisation approaches, and filtering criteria.
Key parameters that are often of interest are:
- magnitude
As CellAssign was developed for single-cell sequencing read count data, the input protein intensity matrix should be rescaled to a range of 0 - 10^6 using the input parameter magnitude.
- batch_effects
CellAssign also accounts for batch effects, which can be considered if provided in a sample-annotation table and specified as input parameters to TYPEx for batch correction.
Provide the sample annotation table in the following format:
imagename | <experimental condition> | <Batch effect 1> | ... | <Batch effect N> | use_image |
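The batch-effect columns can be read straight from the header of the sample annotation table. The sketch below is in Python and assumes the layout shown above: `imagename` first, a single experimental-condition column second, `use_image` last, and the batch effects in between. The function name and the example column names (`treatment`, `slide`, `acquisition_date`) are hypothetical.

```python
def batch_effect_columns(header_line):
    """Return the batch-effect column names from a tab-delimited header line.

    Assumes imagename comes first, one experimental-condition column second,
    use_image last, and all batch-effect columns in between.
    """
    columns = header_line.rstrip("\n").split("\t")
    if columns[0] != "imagename" or columns[-1] != "use_image":
        raise ValueError("expected 'imagename' first and 'use_image' last")
    return columns[2:-1]


header = "imagename\ttreatment\tslide\tacquisition_date\tuse_image"
print(batch_effect_columns(header))  # ['slide', 'acquisition_date']
```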
TYPEx outputs summary tables that can be readily interrogated for biological questions. These include densities of identified cell phenotypes (cell_density_*.txt) and a catalogue of the expressed proteins and combinations thereof (phenotypes.*.txt), quantified across the whole tissue area (summary_*.cell_stats.txt) or within each tissue compartment (categs_summary_*.cell_stats.txt).
summary
├── cell_density_*.txt
├── cell_objects_*.txt
├── phenotypes.*.txt
├── summary_*.cell_stats.txt
├── categs_summary_*.cell_stats.txt
├── maps
├── intensity_plots
└── overlays
Several visualisation plots are output for each step in the workflow and can be used to make sure each step has gone as expected.
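The summary tables can be located programmatically with the file-name patterns listed above. The following is a minimal sketch in Python. The dictionary labels and the function name `find_summary_tables` are hypothetical, and the location of the summary directory under the output directory is assumed to be known to the caller.

```python
from pathlib import Path

# Glob patterns taken from the summary tables listed above; labels are made up.
SUMMARY_PATTERNS = {
    "cell_density": "cell_density_*.txt",
    "cell_objects": "cell_objects_*.txt",
    "phenotypes": "phenotypes.*.txt",
    "whole_tissue_stats": "summary_*.cell_stats.txt",
    "compartment_stats": "categs_summary_*.cell_stats.txt",
}


def find_summary_tables(summary_dir):
    """Return {label: sorted list of matching file names} for one summary directory."""
    root = Path(summary_dir)
    return {
        label: sorted(p.name for p in root.glob(pattern))
        for label, pattern in SUMMARY_PATTERNS.items()
    }
```

A label that maps to an empty list indicates that the corresponding table was not written, which can be a quick sign that a step did not complete.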