pybioinfo-utils

pybioinfo-utils is a collection of small Python functions designed for bioinformatics applications, particularly focused on protein sequence processing. These functions aim to simplify common tasks such as sequence manipulation, file format conversion, and sequence analysis.

Functions

`remove_gaps.py`

This script contains a function that removes dash characters ("-") from protein sequences in a FASTA file and writes the cleaned sequences to a new file.

remove_gaps(input_file, output_file)
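As an illustration, the gap-removal step could look like the sketch below. This is not the code in `remove_gaps.py`: the names `strip_gaps_from_fasta_text` and `remove_gaps_sketch` are hypothetical, and the sketch assumes that header lines start with ">" and should be left unchanged.

```python
def strip_gaps_from_fasta_text(fasta_text):
    """Return FASTA text with '-' removed from sequence lines only."""
    cleaned = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Assumption: headers are kept verbatim, even if they contain dashes.
            cleaned.append(line)
        else:
            cleaned.append(line.replace("-", ""))
    return "\n".join(cleaned) + "\n"


def remove_gaps_sketch(input_file, output_file):
    """Hypothetical file-based wrapper with the same parameters as remove_gaps."""
    with open(input_file) as src:
        text = src.read()
    with open(output_file, "w") as dst:
        dst.write(strip_gaps_from_fasta_text(text))
```

Keeping the text transformation separate from the file handling makes the behaviour easy to check on a short string before running it on real files.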

`fasta_to_uppercase_and_dashes.py`

This script contains a function that converts lowercase characters to uppercase and replaces dots (".") with dashes ("-") in a protein sequence.

fasta_to_uppercase_and_dashes(input_file, output_file)
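A minimal sketch of this normalization, assuming that header lines (starting with ">") must not be modified. The helper name `normalize_fasta_text` is hypothetical and is not taken from `fasta_to_uppercase_and_dashes.py`.

```python
def normalize_fasta_text(fasta_text):
    """Uppercase sequence lines and replace '.' with '-'; leave headers alone."""
    normalized = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Assumption: identifiers may legitimately contain dots or lowercase.
            normalized.append(line)
        else:
            normalized.append(line.upper().replace(".", "-"))
    return "\n".join(normalized) + "\n"
```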

`aa_distribution.py`

This script contains a function that plots the amino acid distribution at each position of the multiple sequence alignment read from the given input FASTA file.

aa_distribution(fasta_file)
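The counting step behind such a plot could be sketched as follows. The helper `column_counts` is hypothetical, the sketch assumes the aligned sequences have already been read into a list of equal-length strings, and the plotting itself is left out.

```python
from collections import Counter


def column_counts(sequences):
    """Return one Counter per alignment column, mapping residue -> count."""
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")
    # zip(*sequences) yields the alignment column by column.
    return [Counter(column) for column in zip(*sequences)]
```

Each `Counter` can then be passed to a plotting library of your choice, for example as one stacked bar per position.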

`convert_fasta_to_clustal.py`

This Python script contains two functions: one converts a single FASTA file to Clustal format, and the other converts every FASTA file in a directory to Clustal format.

fasta_to_clustal(input_fasta, output_clustal)

convert_directory(input_dir, output_dir)
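To show roughly what the conversion involves, here is a simplified Clustal writer. It is a sketch under stated assumptions, not the code in `convert_fasta_to_clustal.py`: the function `to_clustal_text` is hypothetical, the input is assumed to be a list of `(name, aligned_sequence)` pairs of equal length, the header line is a generic one, and the conservation line that Clustal tools normally emit is omitted.

```python
def to_clustal_text(records, width=60):
    """Format (name, aligned_sequence) pairs as simplified Clustal text."""
    if not records:
        raise ValueError("no sequences to write")
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("all aligned sequences must have the same length")
    # Pad names so that the sequence chunks line up in one column.
    name_width = max(len(name) for name, _ in records) + 6
    lines = ["CLUSTAL W multiple sequence alignment", ""]
    for start in range(0, length, width):
        for name, seq in records:
            lines.append(name.ljust(name_width) + seq[start:start + width])
        lines.append("")  # blank line between blocks
    return "\n".join(lines) + "\n"
```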

`convert_seq_to_fasta.py`

This Python script contains two functions: one converts a single file in seq format to FASTA format, and the other converts every seq file in a directory to FASTA format.

convert_seq_to_fasta(input_file, output_file)

batch_convert_seq_to_fasta(input_dir, output_dir)

`count_sequences_in_fasta.py`

This Python script contains a function that counts the number of sequences in a FASTA file.

sequence_count = count_sequences_in_fasta(fasta_file)
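Counting records in a FASTA file usually comes down to counting header lines. The sketch below assumes every record starts with a line beginning with ">"; the helper `count_fasta_records` is hypothetical and works on text rather than on a file path.

```python
def count_fasta_records(fasta_text):
    """Count FASTA records by counting lines that start with '>'."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))
```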

`find_consensus_without_gaps.py`

This Python script contains a function that calculates consensus sequences from multiple sequence alignments and writes them to an output file.

find_consensus(input_dir, output_file)
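One common way to build a gap-free consensus is a per-column majority vote that ignores gap characters. The sketch below illustrates that idea only and may differ from what `find_consensus_without_gaps.py` does: `consensus_without_gaps` is a hypothetical helper, it assumes equal-length aligned sequences, and it skips columns that contain only gaps.

```python
from collections import Counter


def consensus_without_gaps(sequences, gap="-"):
    """Return the most common non-gap residue of each column, joined together."""
    consensus = []
    for column in zip(*sequences):
        residues = [residue for residue in column if residue != gap]
        if residues:  # columns made up entirely of gaps contribute nothing
            consensus.append(Counter(residues).most_common(1)[0][0])
    return "".join(consensus)
```

When two residues are tied in a column, `most_common` returns the one encountered first, so the result then depends on sequence order.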

`make_seq_same_length.py`

This Python script contains a function that trims or pads each sequence in an input multiple sequence alignment to the maximum length.

make_seq_same_length(input_file, output_file)
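The padding half of this operation could be sketched as below. This covers padding only, under the assumption that shorter sequences are filled with "-" on the right; `pad_to_max_length` is a hypothetical helper and the pad character is a guess, not something stated in the script description.

```python
def pad_to_max_length(sequences, pad_char="-"):
    """Right-pad every sequence with pad_char up to the longest sequence."""
    max_len = max((len(seq) for seq in sequences), default=0)
    return [seq.ljust(max_len, pad_char) for seq in sequences]
```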

`pfam2fasta.py`

This Python script contains a function that converts PFAM alignment files to FASTA format.

pfam2fasta(input_file, output_file)
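The input format is not specified here, so the following sketch assumes a Stockholm-style alignment: annotation lines start with "#", the alignment ends with "//", and each remaining line holds a name followed by its aligned sequence. The helper `stockholm_to_fasta_text` is hypothetical and is not the code in `pfam2fasta.py`.

```python
def stockholm_to_fasta_text(stockholm_text):
    """Convert Stockholm-style alignment text to FASTA text."""
    sequences = {}  # name -> sequence, in order of first appearance
    for line in stockholm_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line == "//":
            continue  # skip blanks, annotations, and the terminator
        parts = line.split()
        if len(parts) != 2:
            continue  # assumption: ignore lines that are not "name sequence"
        name, chunk = parts
        # Interleaved files repeat names; concatenate their chunks.
        sequences[name] = sequences.get(name, "") + chunk
    return "".join(f">{name}\n{seq}\n" for name, seq in sequences.items())
```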

`remove_gaps_in_aln.py`

This Python script contains three functions: the first converts an input multiple sequence alignment in FASTA format to a nested list, the second removes alignment columns with a high gap frequency, and the third writes the resulting alignment to a FASTA file.

fasta_to_nested_list(input_file)

remove_columns_with_high_gap_frequency(nested_list, threshold)

write_to_fasta(output_file, data)
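The column-filtering step can be sketched as follows. The layout of the nested list is an assumption (`[[header, aligned_sequence], ...]`), as is the meaning of `threshold` (the largest gap fraction a column may have and still be kept); `drop_gappy_columns` is a hypothetical name.

```python
def drop_gappy_columns(records, threshold, gap="-"):
    """Remove columns whose gap fraction exceeds threshold.

    records: nested list of [header, aligned_sequence] pairs (assumed layout).
    """
    if not records:
        return []
    sequences = [seq for _, seq in records]
    n_sequences = len(sequences)
    # Indices of the columns that are kept.
    keep = [
        index
        for index, column in enumerate(zip(*sequences))
        if column.count(gap) / n_sequences <= threshold
    ]
    return [[header, "".join(seq[i] for i in keep)] for header, seq in records]
```

With three sequences and a threshold of 0.5, for example, a column is dropped as soon as two of its three characters are gaps.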

Usage

To use these functions, import them into your Python script or interactive session and pass the required arguments. The signatures listed under each script above show the expected parameters.

Venkatesh-99/pybioinfo-utils
