Sample Commands

ekherman edited this page May 28, 2020 · 4 revisions

Below are some sample commands for common snp_conversion tasks. Sample data for running each of these commands are provided in the test directory. From within test, running the script ./test_script.sh runs the commands below on the sample data in test/input_data/conversion and test/input_data/concordance. The bash script then moves all output files to the directory test/test_output and compares them with files previously generated in test/sample_output. In these commands, messages that are normally printed to the screen are redirected to text files in test_output, most of which end in "_output.txt".
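The comparison step described above can be sketched in a few lines of shell. This is only an illustration of the idea, not the contents of test_script.sh: the function name compare_outputs and its messages are hypothetical, and the real script may compare files differently.

```shell
# Sketch only: compare freshly generated outputs against reference outputs.
# compare_outputs is a hypothetical helper, not part of snp_conversion.
compare_outputs() {
    new_dir="$1"   # directory holding the new outputs, e.g. test_output
    ref_dir="$2"   # directory holding the reference outputs, e.g. sample_output
    status=0
    for ref in "$ref_dir"/*; do
        [ -f "$ref" ] || continue            # skip subdirectories and unmatched globs
        name=$(basename "$ref")
        if [ ! -f "$new_dir/$name" ]; then
            echo "MISSING: $name"
            status=1
        elif ! diff -q "$ref" "$new_dir/$name" > /dev/null; then
            echo "DIFFERS: $name"
            status=1
        else
            echo "OK: $name"
        fi
    done
    return "$status"
}

# Example (run from within the test directory):
# compare_outputs test_output sample_output
```

The function returns a non-zero status if any reference file is missing from or differs in the new output directory, so it can be used directly in an `if` statement.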

Determine the format of all files in the sample directory and retrieve SNP panel

python ../snp_conversion check_format --input-dir input_files/conversion \
--assembly UMD3_1_chromosomes --get-snp-panel --species bos_taurus \
 > test_output/check_input_files_output.txt

Check whether an Illumina Forward format file is correctly formatted, specifying the key file directory, and output the corresponding PED and MAP files

python ../snp_conversion check_format --input-dir input_files/conversion \
--file-list 50kv3_mFWD_14June2019.txt --plink --input-format FWD \
--assembly UMD3_1_chromosomes --conversion variant_position_files \
--species bos_taurus > test_output/check_forward_files_output.txt

Get a list of inconsistent markers in a file suspected to be in Top format, and write all output to a log file

python ../snp_conversion check_format --input-dir input_files/conversion \
--file-list 50kv3_mTOP_inconsistent.txt --input-format TOP \
--assembly UMD3_1_chromosomes --verbose --species bos_taurus \
 > test_output/check_inconsistent_files_output.txt

Output a tab-formatted summary file after checking the format of a Long-format file

python ../snp_conversion check_format --input-dir input_files/conversion \
--file-list 50kv3_Long_14June2019.txt --input-format LONG \
--assembly UMD3_1_chromosomes --summary --tabular --species bos_taurus \
> test_output/check_long_format_file_output.txt

Convert an Illumina matrix file in Top format to a Long format file without specifying an output suffix

python ../snp_conversion convert_file --input-dir input_files/conversion \
--file-list 50kv3_mTOP_14June2019.txt --input-format TOP \
--output-format LONG --assembly UMD3_1_chromosomes --species bos_taurus \
> test_output/convert_top_to_long_output.txt

Convert a list of files of unknown or mixed formats to Forward format, specifying the output suffix 'FORWARD'

python ../snp_conversion convert_file --input-dir input_files/conversion \
--file-list 50kv3_Long_14June2019.txt,50kv3_mTOP_14June2019.txt --output-format FWD \
--output-name FORWARD --assembly UMD3_1_chromosomes --species bos_taurus \
> test_output/convert_mixed_to_forward_output.txt
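In the command above, --file-list takes a comma-separated list of file names located in --input-dir. When there are many files, the list can be assembled in the shell rather than typed by hand. The following is a minimal sketch; join_file_list is a hypothetical helper and is not part of snp_conversion.

```shell
# Sketch only: build a comma-separated list of file names for --file-list.
# join_file_list is a hypothetical helper, not part of snp_conversion.
join_file_list() {
    dir="$1"       # directory to search, e.g. input_files/conversion
    pattern="$2"   # shell glob, e.g. '50kv3_*.txt' (quote it when calling)
    list=""
    for f in "$dir"/$pattern; do             # $pattern is left unquoted so the glob expands
        [ -f "$f" ] || continue              # skip unmatched globs and directories
        list="${list:+$list,}$(basename "$f")"
    done
    printf '%s\n' "$list"
}

# Example:
# files=$(join_file_list input_files/conversion '50kv3_*.txt')
# python ../snp_conversion convert_file --input-dir input_files/conversion \
#   --file-list "$files" --output-format FWD --output-name FORWARD \
#   --assembly UMD3_1_chromosomes --species bos_taurus
```

Only base names are emitted, matching how the sample commands pass file names relative to --input-dir.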

Convert a file from Affymetrix (native) to Affymetrix Plus format, specifying the number of threads and output suffix 'affy_plus'

python ../snp_conversion convert_file --input-dir input_files/conversion \
--file-list G_CCGP_Axiom_sample.txt --input-format AFFY \
--output-format AFFY-PLUS --output-name affy_plus --assembly UMD3_1_chromosomes \
--species bos_taurus --threads 2 > test_output/convert_affy_native_to_plus_output.txt

Merge a list of files in Forward format and output the file 'merged_forward_files.txt'

python ../snp_conversion merge_files --input-dir input_files/conversion \
--file-list 50kv3_mFWD_part1.txt,50kv3_mFWD_part2.txt --input-format FWD \
--output merged_forward_files.txt > test_output/merge_forward_files_output.txt

Concordance analysis between an Illumina LONG file and a VCF file, with tabular output

python ../snp_conversion genotype_concordance  \
--snp-panel input_files/concordance/G_CCGP_long_sample_input.txt \
--panel-type LONG --vcf input_files/concordance/SNPs_reduced_anon.vcf.gz \
--species bos_taurus --assembly ARS-UCD1_2_Btau5_0_1Y --output-type tabular \
--output concordance_test1 > test_output/long_vs_vcf_tab_concordance.txt

Concordance analysis between an Illumina LONG file and a VCF file, filtering on quality values, with pretty output

python ../snp_conversion genotype_concordance  \
--snp-panel input_files/concordance/G_CCGP_long_sample_input.txt \
--panel-type LONG --vcf input_files/concordance/SNPs_reduced_anon.vcf.gz \
--species bos_taurus --assembly ARS-UCD1_2_Btau5_0_1Y --filter-vcf --qual 100 \
--output-type pretty --output concordance_q100_test2 \
> test_output/long_vs_vcf_q100_pretty_concordance.txt
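To make the quality filter concrete, the sketch below shows what filtering a VCF on its QUAL column means in general. It does not reproduce how snp_conversion implements --filter-vcf --qual: the function name filter_vcf_by_qual is hypothetical, and the choices to keep records with QUAL greater than or equal to the threshold and to drop records with a missing QUAL (".") are assumptions. In the VCF format, QUAL is the sixth tab-separated column and header lines begin with "#".

```shell
# Sketch only: keep VCF header lines plus records whose QUAL meets a threshold.
# filter_vcf_by_qual is a hypothetical helper, not part of snp_conversion.
filter_vcf_by_qual() {
    # $1: minimum QUAL value; uncompressed VCF text is read from stdin
    awk -F'\t' -v min="$1" '
        /^#/ { print; next }                        # pass header lines through unchanged
        $6 != "." && $6 + 0 >= min + 0 { print }    # assumption: inclusive threshold, missing QUAL dropped
    '
}

# Example:
# gzip -dc input_files/concordance/SNPs_reduced_anon.vcf.gz | filter_vcf_by_qual 100
```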

Concordance analysis between an Affymetrix file and a VCF file, outputting a list of discordant positions

python ../snp_conversion genotype_concordance  \
--snp-panel input_files/concordance/G_CCGP_affy_short_input.txt \
--panel-type AFFY --vcf input_files/concordance/SNPs_reduced_anon.vcf.gz \
--species bos_taurus --assembly ARS-UCD1_2_Btau5_0_1Y --extract-discordant \
--output concordance_affy > test_output/affy_vs_vcf_concordance.txt

Generate a VCF file from a SNP panel file

python ../snp_conversion vcf_generator \
--snp-panel input_files/concordance/G_CCGP_affy_short_input.txt \
--species bos_taurus --assembly ARS-UCD1_2_Btau5_0_1Y \
--panel-type AFFY > test_output/snp_to_vcf.txt