Stores the files and code required to build contact strings of alpha helices using chain-helix.c
Readme file for contact string building and analysis
Date: 14 April 2014
Get the FASTA file of filtered PDB chains from RCSB with the required structure quality.
Run CD-HIT to cluster them at a sequence identity cut-off. For the command used, see readme_folder/cdhitOutfile
Get the cluster representatives and run the secondary structure module on them.
Run the Python code on <cdhit_output file>.
The outfile has three tag lines:
  sP: Percentage of secondary structure
  sD: Residue detail of secondary structure
  sN: Number of secondary structures in each protein chain
$ less sseDetail* | grep 'sN' > sse_number*
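The grep step above can also be done in Python. This is a minimal sketch, assuming only that the tag (sP, sD or sN) appears somewhere in each line of the secondary structure output. The function name extract_tag_lines and the sample lines are hypothetical and are not part of the original pipeline.

```python
def extract_tag_lines(lines, tag="sN"):
    """Return the lines that contain `tag`, similar to: grep 'sN'."""
    return [line.rstrip("\n") for line in lines if tag in line]


if __name__ == "__main__":
    # Invented sample lines, only to show the call; the real layout may differ.
    sample = [
        "chainX sP 40.0\n",
        "chainX sD 5-20\n",
        "chainX sN 4\n",
    ]
    for line in extract_tag_lines(sample, tag="sN"):
        print(line)
```

On real data, pass an open file handle of the sseDetail file instead of the sample list and write the returned lines to the sse_number file.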
Get the PDB chains with the required number of alpha helices (min 3, max 10). Can be done with awk:
$ awk '{if ($6 >=3 && $6 <=10) print $0}' sse_number_13_4_14 | wc -l
1047
$ awk '{if ($6 >=3 && $6 <=10) print $1$2}' sse_number_13_4_14
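The same selection can be sketched in Python. As in the awk command, the helix count is taken from the sixth whitespace-separated field and the first two fields are printed joined together. The function name select_chains is hypothetical, and skipping lines with a missing or non-numeric sixth field is a choice made for this sketch.

```python
def select_chains(lines, min_helices=3, max_helices=10, column=6):
    """Return field1+field2 for lines whose `column`-th field (1-based) is in range."""
    selected = []
    for line in lines:
        fields = line.split()
        if len(fields) < column:
            continue  # too few fields (for example an empty line)
        try:
            count = float(fields[column - 1])
        except ValueError:
            continue  # sixth field is not a number; skipped in this sketch
        if min_helices <= count <= max_helices:
            # Mirrors awk's print $1$2 (the two fields joined without a space)
            selected.append(fields[0] + fields[1])
    return selected


if __name__ == "__main__":
    with open("sse_number_13_4_14") as handle:
        chains = select_chains(handle)
    print(len(chains))
    print("\n".join(chains))
```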
Get statistics of the data used; see 1. get_dataset_statistics.ipynb
Run the helix interaction code to get the SSE contact pattern. Note: use c-helix, which analyzes contacts on the basis of protein chains. The naming of secondary structures (serial helix number) can be obtained in the following step.
Save the outputs of the code, helix_shape.txt and helix_packing_pair.txt, in a folder, as these are important results.
Run:
$ python <helix_shape.txt> <helix_packing_pair.txt> > InteractionOnly
REMOVE the last line. For example:
$ python helix_shape.txt helix_packing_pair.txt > InteractionOnly
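Removing the last line can be scripted. This is a minimal sketch, assuming that "the last line" means the last line of the output file (InteractionOnly). The function names are hypothetical.

```python
from pathlib import Path


def without_last_line(text):
    """Return `text` with its final line removed."""
    lines = text.splitlines(keepends=True)
    return "".join(lines[:-1])


def drop_last_line(path):
    """Rewrite the file at `path` without its final line."""
    file_path = Path(path)
    file_path.write_text(without_last_line(file_path.read_text()))
```

For example, drop_last_line("InteractionOnly") rewrites the file in place, so keep a copy if the original output is still needed.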
Save a tab-separated version of helix_shape.txt in the file helSpaceTab. Can be done by:
$ awk '{print $1"\t"$2"\t"$3"\t"$4}' helix_shape.txt > helSpaceTab
Then run the bash script, which in turn runs a Python script to get the patterns:
$ bash extrModiPat.bash InteractionOnly
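The tab conversion can also be sketched in Python. It keeps the first four whitespace-separated fields of each line and joins them with tabs, padding with empty fields the way awk does when a line has fewer than four. The function name to_tab_separated is hypothetical.

```python
def to_tab_separated(lines, n_columns=4):
    """Join the first `n_columns` whitespace-separated fields of each line with tabs."""
    converted = []
    for line in lines:
        fields = line.split()
        # awk prints empty strings for missing fields; pad to mirror that
        fields = (fields + [""] * n_columns)[:n_columns]
        converted.append("\t".join(fields))
    return converted


if __name__ == "__main__":
    with open("helix_shape.txt") as src, open("helSpaceTab", "w") as dst:
        for row in to_tab_separated(src):
            dst.write(row + "\n")
```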
The output pattern file will be saved as IntPatterns_ContactNear_added.txt
An example is in /home/taushif/alphaWork/compareDatasets/HppChain
@ Use the previously used c-helix code of Chothia
@ The get contact py Python code is in-house
Identify the presence of sub-patterns and unique patterns per alpha helix content
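The last step could be approached along the following lines. This is only a sketch under strong assumptions: each pattern is treated as a plain string, a sub-pattern is taken to mean a contiguous substring of a longer pattern, and the helix count of each pattern is supplied by the caller. The real format of IntPatterns_ContactNear_added.txt is not described here, so the function names and the example strings are hypothetical.

```python
from collections import defaultdict


def unique_patterns_by_helix_count(records):
    """Group (helix_count, pattern) pairs into sorted unique patterns per count."""
    groups = defaultdict(set)
    for helix_count, pattern in records:
        groups[helix_count].add(pattern)
    return {count: sorted(patterns) for count, patterns in groups.items()}


def find_subpatterns(patterns):
    """Return (short, long) pairs where short is a proper substring of long."""
    # Sort by length so each pattern is only compared with longer or equal ones
    unique = sorted(set(patterns), key=lambda p: (len(p), p))
    pairs = []
    for i, short in enumerate(unique):
        for longer in unique[i + 1:]:
            if short != longer and short in longer:
                pairs.append((short, longer))
    return pairs


if __name__ == "__main__":
    # Invented pattern strings, only to show the calls
    records = [(3, "AB"), (3, "AB"), (4, "ABC"), (4, "CD")]
    print(unique_patterns_by_helix_count(records))
    print(find_subpatterns(pattern for _, pattern in records))
```

If sub-patterns are defined differently (for example as subsets of helix contacts rather than substrings), only the membership test inside find_subpatterns needs to change.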