GitHub - aa9gj/Bone_proteogenomics_manuscript: Code accompanying the manuscript

Code accompanying the manuscript "Long read proteogenomics to connect disease-associated sQTLs to the protein isoform effectors in disease"

The full text can be found in Abood et al. 2024, AJHG.

Purpose

We present a novel, generalizable approach that integrates information from GWAS, splicing QTLs (sQTLs), and PacBio long-read RNA-seq in a disease-relevant model to infer the effects of sQTLs on the protein isoform products they ultimately encode.

Data availability

Processed and input data is found in
Raw long-read sequencing data are available under GEO accession GSE224588

How to use this repository

Use setup_r_env.R to set up the R environment with all the needed packages.
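As a minimal sketch, the setup script can be launched from a Python wrapper. This assumes `Rscript` is installed and on the `PATH`, and that the wrapper runs from the repository root. The helper names `setup_command` and `run_setup` are hypothetical and not part of the repository; running `Rscript setup_r_env.R` directly in a terminal should be equivalent.

```python
import shutil
import subprocess


def setup_command(script="setup_r_env.R"):
    """Build the Rscript invocation for the setup script (hypothetical helper)."""
    return ["Rscript", script]


def run_setup(script="setup_r_env.R"):
    """Run the setup script, failing early if R is not installed."""
    # Assumption: Rscript is discoverable on PATH.
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript not found on PATH; install R first")
    # check=True raises CalledProcessError if the R script exits with an error.
    subprocess.run(setup_command(script), check=True)
```

Calling `run_setup()` from the repository root would then install the packages the script lists.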
The repo is broken down into three major sections:

sQTL_colocalization_analysis: This directory contains the code needed to replicate the Bayesian colocalization analysis with coloc. Please refer to the README.md within the directory for further information.
- Step 0: Perform Bayesian colocalization analysis using summary statistics from the latest BMD GWAS and sQTL summary statistics for all 49 GTEx tissues.
Reference_transcriptome_generation: This directory contains code to generate the reference transcriptome from long-read RNA-seq data. Please refer to the README.md within the directory for further information.
- Iso-Seq analysis: from raw reads to isoform classification
- Step 1: Perform analyses on outputs from SQANTI and cDNA_cupcake
sQTL_to_isoform_mapping
- Step 2: Characterize full-length isoforms (known and novel) containing the colocalized junctions
- Step 3: Add effect size and direction of effect to colocalized junctions
- Step 4: Annotate lead sQTLs and their proxies, followed by positional and enrichment analyses
- Step 5: Differential analyses (DE and DIU) using tappAS
- Step 6: Integrate multiple datasets from the literature and from our own analyses to prioritize isoforms for experimental validation
- Step 7: ORF analyses, including NMD and truncation analyses, performed using a beta version of Biosurfer
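
To illustrate the idea behind Step 0, here is a minimal Python sketch of the approximate Bayes factor calculation that underlies colocalization methods such as coloc. It is not the repository's code, which uses the coloc R package. It assumes per-variant effect sizes and standard errors for two traits over the same variants, a prior effect standard deviation of 0.15 (the value conventionally used for quantitative traits), and priors `p1 = p2 = 1e-4` and `p12 = 1e-5`. The function names and the toy inputs in the usage lines are hypothetical.

```python
import math


def log_abf(beta, se, prior_sd=0.15):
    """Wakefield log approximate Bayes factor for a single variant."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)  # shrinkage factor
    return 0.5 * (math.log(1.0 - r) + r * z ** 2)


def log_sum(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of hypotheses H0..H4 for one locus.

    H3: both traits associated, different causal variants.
    H4: both traits associated, one shared causal variant.
    """
    n = len(labf1)
    if n != len(labf2) or n < 2:
        raise ValueError("need the same variants (at least two) for both traits")
    shared = [a + b for a, b in zip(labf1, labf2)]
    # Pairs of distinct variants, summed directly. This is O(n^2); it is
    # written out here for clarity and numerical robustness in a sketch.
    distinct = [labf1[i] + labf2[j] for i in range(n) for j in range(n) if i != j]
    log_h = [
        0.0,                                              # H0: no association
        math.log(p1) + log_sum(labf1),                    # H1: trait 1 only
        math.log(p2) + log_sum(labf2),                    # H2: trait 2 only
        math.log(p1) + math.log(p2) + log_sum(distinct),  # H3
        math.log(p12) + log_sum(shared),                  # H4
    ]
    total = log_sum(log_h)
    return [math.exp(h - total) for h in log_h]


# Toy usage with made-up numbers: three variants, same standard error.
gwas = [log_abf(b, 0.05) for b in (0.02, 0.50, 0.01)]
sqtl = [log_abf(b, 0.05) for b in (0.01, 0.40, -0.02)]
pp_h0, pp_h1, pp_h2, pp_h3, pp_h4 = coloc_posteriors(gwas, sqtl)
```

A high `pp_h4` suggests that the GWAS signal and the sQTL share a causal variant, which is the kind of evidence used to call a junction colocalized. The thresholds applied in the manuscript are not stated here.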
