Xenium_benchmarking

This is a public repository to reproduce the analysis presented in the study available here:

Marco Salas et al. Optimizing Xenium In Situ data utility by quality assessment and best practice analysis workflows, 2024.

Abstract of the study

The Xenium In Situ platform is a new spatial transcriptomics product commercialized by 10X Genomics capable of mapping hundreds of genes in situ at a subcellular resolution. Given the multitude of commercially available spatial transcriptomics technologies, recommendations in choice of platform and analysis guidelines are increasingly important. Herein, we explore 25 Xenium datasets generated from multiple tissues and species comparing scalability, resolution, data quality, capacities and limitations with eight other spatially resolved transcriptomics technologies and commercial platforms. In addition, we benchmark the performance of multiple open source computational tools, when applied to Xenium datasets, in tasks including preprocessing, cell segmentation, selection of spatially variable features and domain identification. This study serves as the first independent analysis of the performance of Xenium, and provides best practices and recommendations for analysis of such datasets.


Datasets

The Xenium datasets used throughout this study combine datasets provided by 10X Genomics, datasets published elsewhere and datasets generated specifically for this project. The original files can be found in:

To facilitate the reproducibility of our analysis, we also provide pre-formatted AnnData files for each dataset used in the study, available in several Zenodo repositories (https://doi.org/10.5281/zenodo.11124988, https://doi.org/10.5281/zenodo.11121221, https://doi.org/10.5281/zenodo.11120307).

An example dataset (human spinal cord), used as input for the end-to-end pipeline, can be downloaded from https://doi.org/10.5281/zenodo.11120922
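Once files have been downloaded from Zenodo, a quick way to confirm they open correctly is to load them and print their dimensions. The following is a minimal sketch, not part of the repository: it assumes the AnnData files are stored in the .h5ad format and that the anndata package is available in the active environment. The helper names (find_h5ad_files, summarize, load_and_summarize) and the "Data" path in the usage note are hypothetical.

```python
from pathlib import Path


def find_h5ad_files(data_dir):
    """Return all .h5ad files found under data_dir (recursively), sorted by path."""
    return sorted(Path(data_dir).rglob("*.h5ad"))


def summarize(adata):
    """Return the basic dimensions of an AnnData-like object."""
    # n_obs and n_vars are standard AnnData attributes
    # (observations are typically cells, variables are typically genes).
    return {"n_obs": adata.n_obs, "n_vars": adata.n_vars}


def load_and_summarize(path):
    """Read one .h5ad file and return its dimensions."""
    # Imported lazily so the helpers above work without anndata installed.
    # Assumption: anndata is provided by the conda environment.
    import anndata as ad

    return summarize(ad.read_h5ad(path))
```

For example, after placing the downloaded files in a local folder such as "Data" (an assumed location), calling load_and_summarize on each path returned by find_h5ad_files("Data") prints one summary per dataset.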


Folder structure

This repository includes several folders and files which serve specific purposes, including:

  • end-to-end_pipeline_optimized.ipynb: notebook containing an end-to-end pipeline that can be run on any new Xenium dataset
  • notebooks: folder including all notebooks and R scripts required to reproduce the analysis presented in Marco Salas et al. 2024. Subfolders are organized by analysis topic
  • Data: folder where the data used as input or generated by the different notebooks is stored. Please note that intermediate outputs are not provided in this repo and should be generated from the original files
  • Figures: folder where the figures generated by the code in "notebooks" are stored
  • xb: Python scripts including all functions developed for this project
  • xenium_benchmarking.yml: YAML file that can be used to create the main xenium benchmarking conda environment, necessary to run most of the notebooks
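
As a quick sanity check after cloning, the layout listed above can be verified with a few lines of Python. This is only an illustrative sketch: the entry names are copied from the list, the comparison ignores case because the exact capitalization on disk is an assumption, and the function name missing_entries is hypothetical.

```python
from pathlib import Path

# Names taken from the folder list above; capitalization on disk may differ.
EXPECTED_ENTRIES = [
    "end-to-end_pipeline_optimized.ipynb",
    "notebooks",
    "Data",
    "Figures",
    "xb",
    "xenium_benchmarking.yml",
]


def missing_entries(repo_root):
    """Return the expected files/folders not found directly under repo_root."""
    # Compare lowercased names so "Notebooks" and "notebooks" both match.
    present = {p.name.lower() for p in Path(repo_root).iterdir()}
    return [name for name in EXPECTED_ENTRIES if name.lower() not in present]
```

Calling missing_entries(".") from the repository root should return an empty list if all entries are present.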

Cloning and installing

Please first clone the repository by running in the terminal:

git clone https://github.com/Moldia/Xenium_benchmarking.git

Navigate to the folder:

cd Xenium_benchmarking

Create a conda environment using the provided .yml file by:

conda env create --name xb --file=xenium_benchmarking.yml

Activate the conda environment by:

conda activate xb

Then install the package in editable mode using pip:

pip install -e .
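
To confirm that the installation worked, one option is to check that the expected modules can be located from within the activated environment. The sketch below uses only the Python standard library. It assumes the package installs under the module name xb (based on the folder name), and that anndata and scanpy are part of the environment; these names are assumptions and should be adjusted to match xenium_benchmarking.yml.

```python
from importlib.util import find_spec


def check_modules(names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Module names below are assumptions, not confirmed by the repository.
    for name, found in check_modules(["xb", "anndata", "scanpy"]).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Any module reported as MISSING suggests that the environment was not activated or that the corresponding installation step did not complete.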


Documentation

Extended documentation of the functions included in this repository, as well as of the end-to-end pipeline, can be found at: https://xenium-benchmarking-test.readthedocs.io/en/latest/index.html