Installation

System requirements:

System Memory Requirements:

Because of the requirements of some of this program's dependencies, it is highly recommended that METABOLIC-C be run on a system with at least 100 GB of memory.
METABOLIC-G is not as demanding as METABOLIC-C and requires significantly less memory to run.

System Storage Requirements:

If you are planning to use only METABOLIC-G, you don't need to install GTDB-Tk.

Necessary Databases                      Approximate System Storage Required
METABOLIC program with unzipped files    7.69 GB (including HMM database)
GTDB-Tk Reference Data                   28 GB
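
The storage figures above can be checked before downloading anything. The following is a minimal sketch, assuming a POSIX "df" and "awk"; "has_free_gb" is a hypothetical helper name and is not part of METABOLIC.

```shell
# Sketch only: has_free_gb is a hypothetical helper, not a METABOLIC script.
# Usage: has_free_gb DIR REQUIRED_GB
# Succeeds if the filesystem holding DIR has at least REQUIRED_GB free.
has_free_gb() {
    local dir="$1" required_gb="$2" avail_kb
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $(( required_gb * 1024 * 1024 )) ]
}

# 36 GB is roughly 7.69 GB (METABOLIC) plus 28 GB (GTDB-Tk reference data).
if has_free_gb . 36; then
    echo "Enough free space for METABOLIC and the GTDB-Tk reference data."
else
    echo "Less than 36 GB free in the current directory."
fi
```

If you only plan to run METABOLIC-G, a threshold of about 8 GB would match the table instead.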

Dependencies overview:

Programs required:

Perl (>= v5.010)
HMMER (>= v3.1b2)
Prodigal (>= v2.6.3)
Sambamba (>= v0.7.0) (only for METABOLIC-C)
BAMtools (>= v2.4.0) (only for METABOLIC-C)
CoverM (only for METABOLIC-C)
R (>= 3.6.0)
Diamond
Samtools (only for METABOLIC-C)
Bowtie 2 (only for METABOLIC-C)
GTDB-Tk (only for METABOLIC-C)
gdown (for downloading METABOLIC_test_files.tgz)

Each of these programs should be in the PATH so that they can be accessed regardless of location.
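
One way to confirm that the programs are reachable is to look each one up with "command -v". This is a sketch under assumptions: "missing_tools" is a hypothetical helper, and the executable names (for example "hmmsearch" for HMMER and "gtdbtk" for GTDB-Tk) are the usual command names, which may differ on your system.

```shell
# Sketch only: missing_tools is a hypothetical helper, not a METABOLIC script.
# Prints each given command name that cannot be found on the PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# The executable names below are assumptions; adjust them to your installation.
# The second line lists the programs needed only for METABOLIC-C.
missing_tools perl hmmsearch prodigal Rscript diamond gdown \
    sambamba bamtools coverm samtools bowtie2 gtdbtk
```

No output means every listed command was found.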

Perl and R Dependencies Detailed Instructions:

Perl Modules:
To install, open the CPAN shell by entering "perl -MCPAN -e shell" and then entering
"install [Module Name]" at the "cpan>" prompt, or install by using "cpan -i [Module Name]", or by entering
"cpanm [Module Name]".

Example 1:
perl -MCPAN -e shell
install Data::Dumper

Example 2:
cpan -i Data::Dumper

Example 3:
cpanm Data::Dumper

1. Data::Dumper
2. POSIX
3. Getopt::Long
4. Statistics::Descriptive
5. Array::Split
6. Bio::SeqIO
7. Bio::Perl
8. Bio::Tools::CodonTable
9. Carp
10. File::Spec
11. File::Basename
12. Parallel::ForkManager
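
To see which of the modules above still need installing, each one can be loaded with "perl -MModule -e 1", which fails if the module is not available. A minimal sketch follows; "missing_perl_modules" is a hypothetical helper name.

```shell
# Sketch only: missing_perl_modules is a hypothetical helper.
# Prints each module that "perl -MModule -e 1" fails to load.
missing_perl_modules() {
    local module
    for module in "$@"; do
        perl -M"$module" -e 1 >/dev/null 2>&1 || echo "$module"
    done
}

missing_perl_modules Data::Dumper POSIX Getopt::Long Statistics::Descriptive \
    Array::Split Bio::SeqIO Bio::Perl Bio::Tools::CodonTable Carp \
    File::Spec File::Basename Parallel::ForkManager
```

Any module name printed can then be installed with one of the three methods shown in the examples.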

R Packages:
To install, open the R command line interface by entering "R" into the command line, and then enter
"install.packages("[Package Name]")".

Example:
R
install.packages("diagram")
q()

1. diagram (v1.6.4)
2. forcats (v0.5.0)
3. digest (v0.6.25)
4. htmltools (v0.4.0)
5. rmarkdown (v2.1)
6. reprex (v0.3.0)
7. tidyverse (v1.3.0)
8. ggthemes (v4.2.0)
9. ggalluvial (v0.11.3)
10. reshape2 (v1.4.3)
11. ggraph (v2.0.2)
12. pdftools (v2.3)
13. igraph (v1.2.5)
14. tidygraph (v1.1.2)
15. stringr (v1.4.0)
16. plyr (v1.8.6)
17. dplyr (v0.8.5)
18. openxlsx (v4.1.4)
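
The R packages can be checked from the shell in a similar way, without opening the R prompt. This sketch assumes "Rscript" is on the PATH and uses R's "requireNamespace" to test each package; "missing_r_packages" is a hypothetical helper name, and the sketch does not compare installed versions against the versions listed above.

```shell
# Sketch only: missing_r_packages is a hypothetical helper.
# Prints each package that R cannot load with requireNamespace().
missing_r_packages() {
    local pkg
    for pkg in "$@"; do
        # The package name is passed as a trailing argument to Rscript.
        Rscript -e 'if (!requireNamespace(commandArgs(TRUE)[1], quietly = TRUE)) quit(status = 1)' \
            "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

missing_r_packages diagram forcats digest htmltools rmarkdown reprex \
    tidyverse ggthemes ggalluvial reshape2 ggraph pdftools igraph \
    tidygraph stringr plyr dplyr openxlsx
```

If "Rscript" itself is not installed, every package is reported as missing.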

To ensure efficient and successful installation of METABOLIC, make sure that all dependencies are properly installed prior to download of the METABOLIC software.

Installation instructions:

1. Go to the directory where you want the program to be and clone the GitHub repository by using the following command:

git clone https://github.com/AnantharamanLab/METABOLIC.git

or use the green "Download ZIP" button at the top of the GitHub page and unzip the downloaded file.
The Perl and R scripts and dependent databases should be kept in the same directory.

NOTE: Before following the next step, make sure your working directory is the directory that was created by the METABOLIC download, that is, the directory containing the main scripts for METABOLIC (METABOLIC-G.pl, METABOLIC-C.pl, etc.).

NOTE: We created a script for easily setting up the dependent databases (Steps 2-8).

We provide a "run_to_setup.sh" script along with the data downloaded from GitHub for easy setup of the dependent databases. It can be run with the following command:

bash run_to_setup.sh

Once you've run the bash script, it is not necessary to run Steps 2-8 described below.

2. METABOLIC requires the KofamKOALA hmm and METABOLIC hmm databases (see the KofamKOALA website).

2.1. Download KofamKOALA hmm database files:

    mkdir kofam_database  
    cd kofam_database  
    wget -c ftp://ftp.genome.jp/pub/db/kofam/ko_list.gz  
    wget -c ftp://ftp.genome.jp/pub/db/kofam/profiles.tar.gz  
    gzip -d ko_list.gz  
    tar xzf profiles.tar.gz; rm profiles.tar.gz  
    mv ../All_Module_KO_ids.txt profiles  
    cd profiles  
    cp ../../Accessory_scripts/batch_hmmpress.pl ./  
    perl batch_hmmpress.pl
    cd ../../  # return to the METABOLIC directory for the next steps
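
After step 2.1 it can be worth confirming that every profile was indexed. hmmpress writes four index files (.h3f, .h3i, .h3m, .h3p) next to each .hmm file it processes. The sketch below assumes that batch_hmmpress.pl runs hmmpress on each .hmm file in "profiles" and that it is run from the METABOLIC directory; "unpressed_hmms" is a hypothetical helper name.

```shell
# Sketch only: unpressed_hmms is a hypothetical helper.
# Prints each *.hmm file in DIR that lacks any of the four hmmpress
# index files (.h3f, .h3i, .h3m, .h3p).
unpressed_hmms() {
    local dir="$1" hmm ext
    for hmm in "$dir"/*.hmm; do
        [ -e "$hmm" ] || continue    # the glob matched no .hmm files
        for ext in h3f h3i h3m h3p; do
            if [ ! -e "$hmm.$ext" ]; then
                echo "$hmm"
                break
            fi
        done
    done
}

unpressed_hmms kofam_database/profiles
```

No output means every .hmm file found has all four index files.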

2.2. The METABOLIC hmm database in "METABOLIC_hmm_db.tgz" contains custom hmm files and self-parsed Pfam and TIGRfam files. It needs to be decompressed to the folder "METABOLIC_hmm_db" and kept in the same directory as the KofamKOALA hmm database and scripts.

  tar zxvf METABOLIC_hmm_db.tgz

3. METABOLIC uses "METABOLIC_template_and_database", which contains the hmm result table and KEGG database information. Decompress METABOLIC_template_and_database.tgz to the folder "METABOLIC_template_and_database" and keep it in the same directory as the KofamKOALA hmm database and scripts.

  tar zxvf METABOLIC_template_and_database.tgz

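4. This software also contains "Accessory_scripts.tgz", which needs to be decompressed before use. Step 2.1 copies batch_hmmpress.pl from the Accessory_scripts directory, so decompress this archive before running that step.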

  tar zxvf Accessory_scripts.tgz

5. This software also contains "Motif.tgz", which needs to be decompressed before use.

  tar zxvf Motif.tgz

6. Download the most recent dbCAN-fam-HMMs.txt into a directory named "dbCAN2" (which you create), then parse the downloaded file with "batch_hmmpress_for_dbCAN2_HMMdb.pl".

  mkdir dbCAN2
  cd dbCAN2
  wget http://bcb.unl.edu/dbCAN2/download/Databases/dbCAN-old@UGA/dbCAN-fam-HMMs.txt
  perl ../Accessory_scripts/batch_hmmpress_for_dbCAN2_HMMdb.pl
  cd ../

7. Download the MEROPS peptidase protein sequences (https://www.ebi.ac.uk/merops/download_list.shtml, option No. 3), then process pepunit.lib with DIAMOND to build the BLASTP database.

  mkdir MEROPS
  cd MEROPS
  wget ftp://ftp.ebi.ac.uk/pub/databases/merops/current_release/pepunit.lib
  perl ../Accessory_scripts/make_pepunit_db.pl
  cd ../

8. Finally, this software also contains "METABOLIC_test_files.tgz", which needs to be decompressed before use. This is a set of test genomes and reads that you can use for a test run, to check that the program works correctly before running your real samples.

  tar zxvf METABOLIC_test_files.tgz
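
Once the setup steps are complete, the resulting directory layout can be checked in one pass. This is a sketch, not part of METABOLIC: "missing_dirs" is a hypothetical helper, and the directory names "Motif" and "METABOLIC_test_files" are assumptions inferred from the archive names.

```shell
# Sketch only: missing_dirs is a hypothetical helper.
# Usage: missing_dirs ROOT DIR...
# Prints each DIR that does not exist as a directory under ROOT.
missing_dirs() {
    local root="$1" dir
    shift
    for dir in "$@"; do
        [ -d "$root/$dir" ] || echo "$dir"
    done
}

# Run from the METABOLIC directory. "Motif" and "METABOLIC_test_files"
# are assumed names for the extracted archives.
missing_dirs . kofam_database/profiles METABOLIC_hmm_db \
    METABOLIC_template_and_database Accessory_scripts Motif \
    dbCAN2 MEROPS METABOLIC_test_files
```

Any directory printed points to a setup step that was skipped or that failed.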
