A pipeline for analyzing T cell receptor (TCR) repertoire sequencing data, optimized for human TCR-β chain analysis. It automates the workflow from raw FastQ files to clonotype identification using MiXCR.
- Features
- Prerequisites
- Installation
- Usage
- Pipeline Steps
- Output Structure
- Configuration
- Troubleshooting
- Automated processing of paired-end TCR-seq data
- Barcode-based sample demultiplexing
- TCR alignment using MiXCR
- Clonotype assembly and export
- Support for multiple samples
- Parallel processing capabilities
- Comprehensive logging
- Java 8+
- Python 2.7+
- MiXCR 2.0+
- FASTX-Toolkit 0.0.14+
- Reference databases:
- IMGT library (v201711-1 or later)
- Human TCR references
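Before a first run, it can help to confirm that the tools listed above are on the `PATH`. The following is a minimal sketch: `check_tools` is a hypothetical helper, not part of the pipeline scripts, and the executable names in the example call are assumptions about how the tools are installed on your system.

```bash
# check_tools: report which of the given executables are missing from PATH.
# Hypothetical helper; returns non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (executable names are assumptions; adjust to your install):
# check_tools java python mixcr fastx_barcode_splitter.pl
```

This only checks that the executables exist. It does not verify the version requirements listed above.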
- Clone the repository:
git clone https://github.com/rohitrrj/TCRseq_Pipeline.git
cd TCRseq_Pipeline
- Ensure all required modules are available:
module load java/8u66
module load python fastx_toolkit/0.0.14
- Configure your project:
cp conf.txt.example conf.txt
# Edit conf.txt with your project-specific paths
- Prepare your sample barcode file:
# barcode.txt format
BARCODE1 sample1
BARCODE2 sample2
- Set up configuration (conf.txt):
myRawDATADIR="/path/to/raw/fastq/files"
myDATADIR="/path/to/processed/data"
myPROJDIR="/path/to/project"
myTCRScriptDIR="/path/to/scripts"
mySampleFile="barcode.txt"
- Run the pipeline:
./MixR_pipeline_human.sh
- Sample Preparation
  - Merge paired-end reads
  - Remove random sequences
  - Split samples by barcodes
- Read Processing
  - Separate reads into R1/R2
  - Quality filtering
  - Adapter trimming
- TCR Analysis (MiXCR)
  - Alignment to reference sequences
  - Clonotype assembly
  - Clone export and quantification
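The order of these stages can be sketched as a small driver loop. This is an illustration only, under two assumptions: sample preparation runs once on the pooled reads, and the later stages run once per sample listed in the two-column barcode file. `run_stage` and `run_all` are hypothetical placeholders that only print what they would do; they are not the functions used in `MixR_pipeline_human.sh`.

```bash
# run_stage: hypothetical placeholder for one pipeline stage; it only logs.
run_stage() {
    echo "[$1] $2"
}

# run_all: stage 1 once on the pooled reads, then stages 2 and 3 per sample.
run_all() {
    barcode_file=$1
    run_stage "pooled" "sample preparation"
    # The "|| [ -n ... ]" part keeps a last line that lacks a trailing newline.
    while read -r barcode sample || [ -n "$barcode" ]; do
        # Skip blank lines and comment lines such as "# barcode.txt format".
        case "$barcode" in ''|'#'*) continue ;; esac
        run_stage "$sample" "read processing"
        run_stage "$sample" "MiXCR analysis"
    done < "$barcode_file"
}

# Example call: run_all barcode.txt
```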
project_directory/
├── Analysis/
│   ├── align/                 # MiXCR alignment files
│   │   └── sample_name/
│   │       ├── alignments.vdjca
│   │       └── alignmentReport.log
│   ├── assemble/              # Assembled clonotypes
│   │   └── sample_name/
│   │       ├── clones.clns
│   │       └── assembleReport.log
│   └── export/                # Final results
│       └── sample_name/
│           └── clones.txt
└── split_reads/               # Demultiplexed samples
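After a run, the layout above can be checked per sample. The sketch below assumes the file names shown in the tree; `expected_outputs` and `check_outputs` are hypothetical helpers, not part of the pipeline.

```bash
# expected_outputs: print the per-sample files listed in the tree above.
# Arguments: project directory, sample name.
expected_outputs() {
    proj=$1; sample=$2
    echo "$proj/Analysis/align/$sample/alignments.vdjca"
    echo "$proj/Analysis/align/$sample/alignmentReport.log"
    echo "$proj/Analysis/assemble/$sample/clones.clns"
    echo "$proj/Analysis/assemble/$sample/assembleReport.log"
    echo "$proj/Analysis/export/$sample/clones.txt"
}

# check_outputs: report files that are missing or empty.
# Returns non-zero if any expected file is absent or has zero size.
check_outputs() {
    expected_outputs "$1" "$2" | (
        status=0
        while IFS= read -r path; do
            if [ ! -s "$path" ]; then
                echo "missing or empty: $path"
                status=1
            fi
        done
        exit "$status"
    )
}

# Example call: check_outputs /path/to/project patient1
```

The explicit subshell around the loop keeps the status variable and the `exit` local to the check, so calling the function never terminates the calling shell.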
--species hsa # Human species
--chains TRB # TCR beta chain
--library imgt.201711-1.s # IMGT library version
-OvParameters.geneFeatureToAlign=VRegion # Align against the V region (MiXCR -O overrides take a single dash)
# Default assembly parameters for optimal clonotype detection
--chains TRB # Export TCR beta chain results
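To show how these options might fit together, the sketch below prints one possible align, assemble, and export command line for a single sample. It is a dry run that only echoes the commands. The flags are copied from this README, with the parameter override written in MiXCR's single-dash `-O` form. `mixcr_commands` is a hypothetical helper, and the `--report` destinations and the input file names are assumptions based on the output structure described in this README; the actual pipeline script may call MiXCR differently.

```bash
# mixcr_commands: print (do not run) the three MiXCR steps for one sample.
# Hypothetical helper; arguments: sample name, R1 file, R2 file, analysis dir.
mixcr_commands() {
    sample=$1; r1=$2; r2=$3; out=${4:-Analysis}
    vdjca="$out/align/$sample/alignments.vdjca"
    clns="$out/assemble/$sample/clones.clns"

    # 1. Align reads to the TRB reference with the options listed above.
    echo "mixcr align --species hsa --chains TRB --library imgt.201711-1.s" \
         "-OvParameters.geneFeatureToAlign=VRegion" \
         "--report $out/align/$sample/alignmentReport.log $r1 $r2 $vdjca"
    # 2. Assemble clonotypes with MiXCR's default assembly parameters.
    echo "mixcr assemble --report $out/assemble/$sample/assembleReport.log $vdjca $clns"
    # 3. Export the TRB clonotype table.
    echo "mixcr exportClones --chains TRB $clns $out/export/$sample/clones.txt"
}

# Example call:
# mixcr_commands patient1 patient1_R1.fastq patient1_R2.fastq
```

Printing the commands first makes it easy to review them against your MiXCR version before running anything. The output directories must exist before the printed commands are executed.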
Edit conf.txt to specify:
# Required paths
myRawDATADIR="/path/to/raw/data" # Raw FastQ files
myDATADIR="/path/to/processed/data" # Processed data
myPROJDIR="/path/to/project" # Project directory
myTCRScriptDIR="/path/to/scripts" # Analysis scripts
mySampleFile="barcode.txt" # Sample barcodes
# Resource allocation
h_vmem="10G" # Memory per core
N_CPUS=6 # Number of CPU cores
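Because the settings above use shell assignment syntax, a quick check for unset values is possible before launching a run. This is a minimal sketch that assumes `conf.txt` can be sourced by a POSIX shell; `check_conf` is a hypothetical helper, not part of the pipeline.

```bash
# check_conf: verify that every required variable is set and non-empty.
# Returns non-zero and names each missing variable otherwise.
check_conf() {
    status=0
    for name in myRawDATADIR myDATADIR myPROJDIR myTCRScriptDIR mySampleFile; do
        # Indirect lookup: copy the value of the variable called $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "conf.txt: $name is not set"
            status=1
        fi
    done
    return "$status"
}

# Example call (assumes conf.txt is in the current directory):
# . ./conf.txt && check_conf
```

The check only confirms that the variables are non-empty. It does not test whether the paths exist.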
- Paired-end reads
- Naming convention: *R1_001.fastq.gz, *R2_001.fastq.gz
AAGGTTCC patient1
CCTTAAGG patient2
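A barcode file in this two-column form can be sanity-checked before a run. The sketch below is an illustration under stated assumptions: barcodes are uppercase nucleotide strings (A, C, G, T, N), columns are separated by whitespace, and each barcode appears once. `validate_barcodes` is a hypothetical helper, not part of the pipeline.

```bash
# validate_barcodes: check a "BARCODE sample" file line by line.
# Returns non-zero if any line is malformed or a barcode is repeated.
validate_barcodes() {
    awk '
        NF == 0 { next }
        NF != 2 {
            printf "line %d: expected 2 columns, got %d\n", NR, NF
            bad = 1
            next
        }
        $1 !~ /^[ACGTN]+$/ {
            printf "line %d: barcode \"%s\" has characters other than ACGTN\n", NR, $1
            bad = 1
        }
        seen[$1]++ {
            printf "line %d: duplicate barcode %s\n", NR, $1
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Example call: validate_barcodes barcode.txt
```

Blank lines are ignored. A comment line such as `# barcode.txt format` would be reported as malformed, so remove such lines first or extend the check.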
- Memory Issues
  - Increase h_vmem in script header
  - Process fewer samples in parallel
  - Check Java heap settings
- MiXCR Errors
  - Verify IMGT library installation
  - Check input FastQ format
  - Validate species parameter
- Barcode Splitting Issues
  - Verify barcode format
  - Check for contamination
  - Adjust mismatch tolerance
MiXCR: command not found
- Check Java/MiXCR installation
Unable to split samples
- Verify barcode file format
Alignment failed
- Check input file quality
- Resource Management
  - Adjust CPU allocation
  - Optimize memory usage
  - Monitor disk I/O
- Processing Tips
  - Split large batches
  - Clean intermediate files
  - Use SSD for temporary storage
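One way to split a large batch is to divide the barcode file into smaller files and run the pipeline once per file. The sketch below uses the standard `split` utility; `split_batches` and the `batch_` prefix are hypothetical, and using each batch file as `mySampleFile` for a separate run is an assumption about how the pipeline can be driven.

```bash
# split_batches: write batch files holding at most N barcode lines each.
# Arguments: barcode file, lines per batch, output prefix (default "batch_").
# Prints the names of the batch files it created.
split_batches() {
    barcode_file=$1; per_batch=$2; prefix=${3:-batch_}
    split -l "$per_batch" "$barcode_file" "$prefix"
    ls "$prefix"*
}

# Example call: split_batches barcode.txt 4
```

`split` names the pieces with alphabetical suffixes (`batch_aa`, `batch_ab`, and so on), so choose a prefix that does not match any existing files.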
This project is licensed under the MIT License - see the LICENSE file for details.
If you use this pipeline in your research, please cite:
Jadhav RR, Im SJ, Dixit PY, Tso Fan Yiu, Cao L, Sy MD, Lauer GM, Bernard NF, Wood C, Wilson P, Li C, Goronzy JJ. Loss of T cell progenitor reprogramming potential in aging bone marrow niches. JCI Insight. 2020 Apr 9;5(7):e134356. doi: 10.1172/jci.insight.134356. PMID: 32191644; PMCID: PMC7101137.
You can also cite this repository:
Jadhav R. (2025). TCRseq_Pipeline: A comprehensive pipeline for TCR repertoire analysis.
GitHub repository: https://github.com/rohitrrj/TCRseq_Pipeline
Contributions are welcome! Please read the contributing guidelines before submitting pull requests.
- MiXCR development team
- IMGT database maintainers
- Supporting institutions and funding