PlantTribes is a collection of automated modular analysis pipelines that utilize objective classifications of complete protein sequences from sequenced plant genomes to perform comparative evolutionary studies. It post-processes de novo assembly transcripts into putative coding sequences and their corresponding amino acid translations, locally assembles targeted gene families, estimates paralogous/orthologous pairwise synonymous/non-synonymous substitution rates for a set of gene sequences, classifies gene sequences into pre-computed orthologous plant gene family clusters, and builds gene family multiple sequence alignments and their corresponding phylogenies.
Please submit all questions, inquiries, and bug reports using the PlantTribes repository issues tab.
In addition to this README file, you can consult the PlantTribes manual for more detailed information.
PlantTribes pipeline scripts have many external dependencies that need to be installed and available on the environment's $PATH before the pipelines can be used.
- AssemblyPostProcessor pipeline:
ESTScan (version 2.1), TransDecoder (version 3.0.1), HMMSearch (HMMER version 3.1b1), MAFFT (version 7.215), trimAl (version 1.4.rev8), and GenomeTools (version 1.5.4).
- GeneFamilyClassifier pipeline:
BLASTP (NCBI BLAST version 2.2.29+) and HMMScan (HMMER version 3.1b1).
- PhylogenomicsAnalysis pipeline (legacy pipeline):
MAFFT (version 7.215), PASTA, trimAl (version 1.4.rev8), RAxML (version 8.1.16), and FastTreeMP (version 2.1.7 SSE3).
- GeneFamilyIntegrator:
No external dependencies required.
- GeneFamilyAligner pipeline:
MAFFT (version 7.215), PASTA, and trimAl (version 1.4.rev8).
- GeneFamilyPhylogenyBuilder pipeline:
RAxML (version 8.1.16) and FastTreeMP (version 2.1.7 SSE3).
- KaKsAnalysis pipeline:
MAKEBLASTDB/BLASTN (NCBI BLAST version 2.2.29+), CRB-BLAST (version 0.6.9), MAFFT (version 7.215), PAML (version 4.8), and EMMIX (version 1.3).
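Before running a pipeline, it can help to confirm that its dependencies are actually reachable on $PATH. The following is a minimal sketch, not part of PlantTribes: `check_deps` is a hypothetical helper, and the executable names in the commented examples are assumptions that may differ on your system (for example, RAxML and PASTA are installed under several different launcher names).

```shell
# Report which of the given executables are missing from $PATH.
# Returns 0 when every tool is found, 1 otherwise.
check_deps() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf 'found:   %s\n' "$tool"
        else
            printf 'MISSING: %s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example calls; the executable names are assumptions, adjust as needed.
# check_deps blastp hmmscan      # GeneFamilyClassifier
# check_deps mafft trimal        # GeneFamilyAligner (PASTA launcher not listed)
```

This only checks that a tool is present; it does not verify that the installed version matches the one listed above.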
PlantTribes gene family scaffolds download website
- Open a terminal and change to the location where you would like to keep PlantTribes.
- Example:
cd ~/softwares
- Clone the PlantTribes GitHub repository or download the zip archive and decompress it in your desired location.
- Examples:
git clone https://github.com/dePamphilis/PlantTribes.git
or download https://github.com/dePamphilis/PlantTribes/archive/master.zip and decompress it with unzip
- Download the scaffold data set(s) that you would like to use into the PlantTribes' data subdirectory and decompress them.
- Examples:
cd PlantTribes/data
md5sum 22Gv1.1.tar.bz2
(should match the provided MD5 checksum for the data archive), followed by
tar -xjvf 22Gv1.1.tar.bz2
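The checksum comparison in the last step can be automated so that the archive is only unpacked when the digest matches. This is a sketch under stated assumptions: `verify_md5` is a hypothetical helper, it relies on the GNU coreutils `md5sum` command, and the expected digest must be copied from the download site (no real checksum is shown here).

```shell
# Compare a file's MD5 digest with an expected value.
# Returns 0 on a match, 1 on a mismatch.
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (expected $expected, got $actual)" >&2
        return 1
    fi
}

# Usage; replace the placeholder with the published checksum:
# verify_md5 22Gv1.1.tar.bz2 "<published-md5>" && tar -xjvf 22Gv1.1.tar.bz2
```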
The executables for the PlantTribes pipelines are in the pipelines subdirectory of the installation. You can either add that directory to your PATH environment variable or execute the pipelines directly from the PlantTribes installation.
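One way to put the pipelines on PATH is sketched below. The `add_to_path` helper is hypothetical, and the install location assumes the ~/softwares example used during installation; substitute your own path. Adding the call to a shell startup file such as ~/.bashrc would make the change persistent.

```shell
# Prepend a directory to PATH unless it is already present.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                          # already on PATH; nothing to do
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# Assumed install location; change it to match your setup.
add_to_path "$HOME/softwares/PlantTribes/pipelines"
```

With the directory on PATH, the pipelines can be invoked by name alone, for example `GeneFamilyClassifier` instead of `PlantTribes/pipelines/GeneFamilyClassifier`.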
- AssemblyPostProcessor pipeline:
- Display all usage options:
PlantTribes/pipelines/AssemblyPostProcesser
- Basic run using ESTScan prediction method:
PlantTribes/pipelines/AssemblyPostProcesser --transcripts transcripts.fasta --prediction_method estscan --score_matrices /path/to/score/matrices/Arabidopsis_thaliana.smat
- GeneFamilyClassifier pipeline:
- Display all usage options:
PlantTribes/pipelines/GeneFamilyClassifier
- Basic run using 22Gv1.1 scaffolds, orthomcl clustering method, and blastp classifier:
PlantTribes/pipelines/GeneFamilyClassifier --proteins proteins.fasta --scaffold 22Gv1.1 --method orthomcl --classifier blastp
- PhylogenomicsAnalysis pipeline:
- Display all usage options:
PlantTribes/pipelines/PhylogenomicsAnalysis
- Basic run using 22Gv1.1 scaffolds, orthomcl clustering method, and raxml phylogenetic tree inference method:
PlantTribes/pipelines/PhylogenomicsAnalysis --orthogroup_faa geneFamilyClassification_dir/orthogroups_fasta --scaffold 22Gv1.1 --method orthomcl --add_alignments --tree_inference raxml
- GeneFamilyIntegrator:
- Display all usage options:
PlantTribes/pipelines/GeneFamilyIntegrator
- Basic run using 22Gv1.1 scaffolds, orthomcl clustering method:
GeneFamilyIntegrator --orthogroup_faa geneFamilyClassification_dir/orthogroups_fasta --scaffold 22Gv1.1 --method orthomcl
- GeneFamilyAligner pipeline:
- Display all usage options:
PlantTribes/pipelines/GeneFamilyAligner
- Basic run using 22Gv1.1 scaffolds, orthomcl clustering method, and mafft alignment method:
GeneFamilyAligner --orthogroup_faa integratedGeneFamilies_dir --alignment_method mafft
- GeneFamilyPhylogenyBuilder pipeline:
- Display all usage options:
PlantTribes/pipelines/GeneFamilyPhylogenyBuilder
- Basic run using 22Gv1.1 scaffolds, orthomcl clustering method, and fasttree phylogenetic tree inference method:
GeneFamilyPhylogenyBuilder --orthogroup_aln geneFamilyAlignments_dir/orthogroups_aln --tree_inference fasttree
- KaKsAnalysis pipeline:
- Display all usage options:
PlantTribes/pipelines/KaKsAnalysis
- Basic run for paralogous analysis:
KaKsAnalysis --coding_sequences_species_1 species1.fna --proteins_species_1 species1.faa --comparison paralogs --num_threads 4
Please consult the PlantTribes manual and tutorial for a detailed description of each pipeline and the usage of all its options.
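The basic runs listed above suggest a natural order: classify, integrate, align, then build trees. The sketch below chains those four commands using only the options shown in the examples. It assumes the pipelines are on PATH and that each step writes its output into the directory the next example reads from (geneFamilyClassification_dir, integratedGeneFamilies_dir, geneFamilyAlignments_dir); check the manual for the actual output locations. The `run` and `plant_tribes_workflow` functions and the `DRY_RUN` variable are hypothetical names introduced for illustration.

```shell
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Chain the four gene family steps; stop at the first failure.
plant_tribes_workflow() {
    proteins=$1
    run GeneFamilyClassifier --proteins "$proteins" --scaffold 22Gv1.1 --method orthomcl --classifier blastp &&
    run GeneFamilyIntegrator --orthogroup_faa geneFamilyClassification_dir/orthogroups_fasta --scaffold 22Gv1.1 --method orthomcl &&
    run GeneFamilyAligner --orthogroup_faa integratedGeneFamilies_dir --alignment_method mafft &&
    run GeneFamilyPhylogenyBuilder --orthogroup_aln geneFamilyAlignments_dir/orthogroups_aln --tree_inference fasttree
}

plant_tribes_workflow proteins.fasta
```

Setting `DRY_RUN=0` before calling `plant_tribes_workflow` executes the commands instead of printing them.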
PlantTribes is distributed under the GNU GPL v3. For more information, see license.