DNA Reference Creator: this pipeline is aimed at downloading and building indexes and accessory files for reference genomes

solida-core/toolkit

SOLIDA-CORE TOOLKIT

A collection of useful solida-core scripts for managing accessory and supplementary files.

Included Scripts:

Download TOOLKIT:

TOOLKIT is a Git repository. To download it, type:

git clone https://github.com/solida-core/toolkit.git
cd toolkit

Script MANUALS:

reference_organizer.py

This script organizes reference files into the directory tree required by solida-core. The user must provide the path of the desired reference folder and choose between the hg19 and hg38 human genome versions.

The script attempts to connect to the FTP server of the GATK resource bundle and download the files from it. The files are then extracted and placed in the directory structure that solida-core expects.

Because the FTP server is limited to 25 users, a connection attempt can fail when the server is busy, so the script makes multiple connection attempts (default: 5). This value can be changed with the --reconnection_attempts parameter.
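
The retry behaviour described above can be illustrated with a minimal sketch. It is not the code of reference_organizer.py: the function names connect_with_retries and open_ftp, the delay between attempts, and the choice of which errors count as "server busy" are all assumptions made for illustration.

```python
import ftplib
import time


def connect_with_retries(connect, attempts=5, delay=10.0):
    """Call connect() until it succeeds or the attempts run out.

    Hypothetical helper: `connect` is any zero-argument callable that
    returns an open connection or raises on failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except (ftplib.error_temp, OSError) as exc:
            # Assumption: temporary FTP errors (4xx replies) and socket
            # errors are treated as "server busy" and retried.
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)
    raise ConnectionError(
        f"FTP server still unavailable after {attempts} attempts"
    ) from last_error


def open_ftp(host, user="anonymous", password=""):
    """Hypothetical helper: open an FTP connection and log in."""
    ftp = ftplib.FTP(host, timeout=30)
    ftp.login(user, password)
    return ftp
```

With a placeholder host, a call could look like connect_with_retries(lambda: open_ftp("ftp.example.org"), attempts=5). Passing the connection step as a callable keeps the retry loop independent of the FTP details.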

To get script usage, type:

python reference_organizer.py -h
usage: reference_organizer.py [-h] --reference_dir PATH --release hg19/hg38
                              [--reconnection_attempts int] [--force]

Prepare reference files for solida-core pipelines

optional arguments:
  -h, --help            show this help message and exit
  --reference_dir PATH, -w PATH
                        Destination folder for reference files
  --release hg19/hg38, -r hg19/hg38
                        UCSC Genome Release to download: [hg19,hg38]
  --reconnection_attempts int, -a int
                        Number of connection attempts to perform in case of
                        busy FTP server [default: 5]
  --force               Download files in the directory even if they exists
                        (Default: FALSE)
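
For readers who want to script around this interface, the help text above is consistent with an argparse declaration along the following lines. This is a reconstruction from the usage message only, not the script's actual source; the function name build_parser and the use of choices to restrict the release are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical reconstruction of the CLI shown in the help text."""
    parser = argparse.ArgumentParser(
        prog="reference_organizer.py",
        description="Prepare reference files for solida-core pipelines",
    )
    parser.add_argument(
        "--reference_dir", "-w", metavar="PATH", required=True,
        help="Destination folder for reference files",
    )
    parser.add_argument(
        "--release", "-r", metavar="hg19/hg38", required=True,
        choices=["hg19", "hg38"],  # assumption: other values are rejected
        help="UCSC Genome Release to download: [hg19,hg38]",
    )
    parser.add_argument(
        "--reconnection_attempts", "-a", metavar="int", type=int, default=5,
        help="Number of connection attempts to perform in case of "
             "busy FTP server [default: 5]",
    )
    parser.add_argument(
        "--force", action="store_true",
        help="Download files in the directory even if they exist",
    )
    return parser


if __name__ == "__main__":
    # Example arguments; the destination path is a placeholder.
    args = build_parser().parse_args(["-w", "/path/to/references", "-r", "hg38"])
    print(args.reference_dir, args.release, args.reconnection_attempts, args.force)
```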
