PlantTribes is a collection of automated gene family analysis pipelines for comparative plant genomics

AG-Run/PlantTribes
PlantTribes

Overview

PlantTribes is a collection of automated modular analysis pipelines that utilize objective classifications of complete protein sequences from sequenced plant genomes to perform comparative evolutionary studies. It post-processes de novo assembly transcripts into putative coding sequences and their corresponding amino acid translations, locally assembles targeted gene families, estimates paralogous/orthologous pairwise synonymous/non-synonymous substitution rates for a set of gene sequences, classifies gene sequences into pre-computed orthologous plant gene family clusters, and builds gene family multiple sequence alignments and their corresponding phylogenies.

Please submit all questions, inquiries, and bug reports using the PlantTribes repository issues tab.

In addition to this README file, you can consult the PlantTribes manual for more detailed information.

Installation

PlantTribes pipeline scripts have many external dependencies that need to be installed and available on the environment's $PATH before the pipelines can be used.
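Before running a pipeline, it can help to confirm that its dependencies resolve on `$PATH`. The following is a minimal sketch, not part of PlantTribes: `missing_tools` is a hypothetical helper, and the executable names passed to it are assumptions, so substitute the names listed on the pipelines dependencies page.

```shell
#!/bin/sh
# Print each named command that cannot be found on $PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
    return 0
}

# Assumed executable names; replace them with the tools your pipelines need.
missing_tools blastp mafft FastTree raxmlHPC
```

An empty result means every listed command was found.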

Pipelines dependencies

PlantTribes scaffolds datasets

PlantTribes gene family scaffolds download website

Install PlantTribes

  1. Open a terminal and change to the location where you would like to keep PlantTribes.
  • Example: cd ~/softwares
  2. Clone the PlantTribes GitHub repository, or download the zip archive and decompress it in your desired location.
  • Examples: git clone https://github.com/dePamphilis/PlantTribes.git or wget https://github.com/dePamphilis/PlantTribes/archive/master.zip followed by unzip master.zip
  3. Download the scaffold data set(s) that you would like to use into the PlantTribes data subdirectory and decompress them.
  • Examples: cd PlantTribes/data, md5sum 22Gv1.1.tar.bz2 (should match the provided MD5 checksum for the data archive), followed by tar -xjvf 22Gv1.1.tar.bz2
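The checksum comparison in the last step can be scripted so that the archive is only extracted when the digest matches. This is a sketch under assumptions: `verify_md5` is a hypothetical helper, and the expected digest is a placeholder that must be copied from the scaffolds download page.

```shell
#!/bin/sh
# Succeed only if the MD5 digest of file $1 equals the expected digest $2.
verify_md5() {
    actual=$(md5sum "$1" | awk '{print $1}')
    [ "$actual" = "$2" ]
}

# Usage (the digest below is a placeholder, not a real value):
# verify_md5 22Gv1.1.tar.bz2 "<published-md5>" && tar -xjvf 22Gv1.1.tar.bz2
```

Chaining with `&&` keeps `tar` from unpacking a corrupted or incomplete download.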

Using PlantTribes

The executables for the PlantTribes pipelines are in the pipelines subdirectory of the installation. You can either add that subdirectory to your PATH environment variable or execute the pipelines directly from the PlantTribes installation.
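One way to put the pipelines on your PATH for the current shell session is sketched below. `prepend_path` is a hypothetical helper, and the install location is an assumption taken from the `~/softwares` installation example; adjust it to wherever you cloned PlantTribes.

```shell
#!/bin/sh
# Prepend a directory to PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Assumes PlantTribes was cloned into ~/softwares.
prepend_path "$HOME/softwares/PlantTribes/pipelines"
```

To make the change persistent, the same lines could go in your shell startup file.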

  • AssemblyPostProcessor pipeline:
    • Display all usage options:
      • PlantTribes/pipelines/AssemblyPostProcesser
    • Basic run using ESTScan prediction method:
      • PlantTribes/pipelines/AssemblyPostProcesser --transcripts transcripts.fasta --prediction_method estscan --score_matrices /path/to/score/matrices/Arabidopsis_thaliana.smat
  • GeneFamilyClassifier pipeline:
    • Display all usage options:
      • PlantTribes/pipelines/GeneFamilyClassifier
    • Basic run using 22Gv1.1 scaffolds, orthomcl clustering method, and blastp classifier:
      • PlantTribes/pipelines/GeneFamilyClassifier --proteins proteins.fasta --scaffold 22Gv1.1 --method orthomcl --classifier blastp
  • PhylogenomicsAnalysis pipeline:
    • Display all usage options:
      • PlantTribes/pipelines/PhylogenomicsAnalysis
    • Basic run using 22Gv1.1 scaffolds, orthomcl clustering method, and raxml phylogenetic tree inference method:
      • PlantTribes/pipelines/PhylogenomicsAnalysis --orthogroup_faa geneFamilyClassification_dir/orthogroups_fasta --scaffold 22Gv1.1 --method orthomcl --add_alignments --tree_inference raxml
  • GeneFamilyIntegrator pipeline:
    • Display all usage options:
      • PlantTribes/pipelines/GeneFamilyIntegrator
    • Basic run using 22Gv1.1 scaffolds, orthomcl clustering method:
      • PlantTribes/pipelines/GeneFamilyIntegrator --orthogroup_faa geneFamilyClassification_dir/orthogroups_fasta --scaffold 22Gv1.1 --method orthomcl
  • GeneFamilyAligner pipeline:
    • Display all usage options:
      • PlantTribes/pipelines/GeneFamilyAligner
    • Basic run using 22Gv1.1 scaffolds, orthomcl clustering method, and mafft alignment method:
      • PlantTribes/pipelines/GeneFamilyAligner --orthogroup_faa integratedGeneFamilies_dir --alignment_method mafft
  • GeneFamilyPhylogenyBuilder pipeline:
    • Display all usage options:
      • PlantTribes/pipelines/GeneFamilyPhylogenyBuilder
    • Basic run using 22Gv1.1 scaffolds, orthomcl clustering method, and fasttree phylogenetic tree inference method:
      • PlantTribes/pipelines/GeneFamilyPhylogenyBuilder --orthogroup_aln geneFamilyAlignments_dir/orthogroups_aln --tree_inference fasttree
  • KaKsAnalysis pipeline:
    • Display all usage options:
      • PlantTribes/pipelines/KaKsAnalysis
    • Basic run for paralogous analysis:
      • PlantTribes/pipelines/KaKsAnalysis --coding_sequences_species_1 species1.fna --proteins_species_1 species1.faa --comparison paralogs --num_threads 4
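The input paths in the examples above suggest that several pipelines can be chained, with one step reading the output directory of the previous one. The sketch below strings four of the commands together. It is an illustration only: `run` and `classify_to_trees` are hypothetical helpers, the pipelines are assumed to be on your PATH, and the intermediate directory names are copied from the examples on the assumption that they are each step's default output location. With `DRY_RUN=1` (the default here) the commands are printed rather than executed.

```shell
#!/bin/sh
# Echo each command when DRY_RUN=1 (the default here); run it otherwise.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Classification -> integration -> alignment -> tree building.
# $1 is the protein FASTA file to classify; a failing step stops the chain.
classify_to_trees() {
    run GeneFamilyClassifier --proteins "$1" --scaffold 22Gv1.1 --method orthomcl --classifier blastp &&
    run GeneFamilyIntegrator --orthogroup_faa geneFamilyClassification_dir/orthogroups_fasta --scaffold 22Gv1.1 --method orthomcl &&
    run GeneFamilyAligner --orthogroup_faa integratedGeneFamilies_dir --alignment_method mafft &&
    run GeneFamilyPhylogenyBuilder --orthogroup_aln geneFamilyAlignments_dir/orthogroups_aln --tree_inference fasttree
}

classify_to_trees proteins.fasta
```

Review the printed commands first, then set `DRY_RUN=0` to execute them.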

Please consult the PlantTribes manual and tutorial for detailed descriptions and usage of all pipeline options.

License

PlantTribes is distributed under the GNU GPL v3. For more information, see license.

Languages

  • Perl 96.0%
  • R 4.0%