FastOMA nextflow pipeline
Sina Majidian edited this page Oct 17, 2024
===========================================
FastOMA -- PIPELINE
===========================================
Usage:
Run the pipeline with default parameters:
nextflow run FastOMA.nf
Run with user parameters:
nextflow run FastOMA.nf --input_folder {input.dir} --output_folder {results.dir}
Mandatory arguments:
--input_folder Input data folder. Defaults to ${params.input_folder}. This folder
must contain the proteomes (in a subfolder named 'proteome') and
a species tree file. Optionally, the folder may contain
- a sub-folder 'splice' containing splicing variant mappings
- a sub-folder 'hogmap_in' containing precomputed OMAmer
placement results for all proteomes
All sub-folders and files can also be placed in other
locations if you provide alternative values for them (see the
'Additional options' section below).
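The layout described for --input_folder can be checked before launching a run. The following is a minimal sketch and not part of FastOMA: the function name check_input_folder and the temporary demo directory are illustrative assumptions, and only the sub-folder names 'proteome', 'splice' and 'hogmap_in' are taken from the help text. The species tree is not checked here because its location can be set separately with --species_tree.

```shell
# Sketch: check that an input folder follows the layout described above.
# check_input_folder is a hypothetical helper, not a FastOMA command.
check_input_folder() {
    local dir="$1"
    # The 'proteome' sub-folder is required.
    if [ ! -d "$dir/proteome" ]; then
        echo "missing: $dir/proteome" >&2
        return 1
    fi
    # 'splice' and 'hogmap_in' are optional, so only report their presence.
    local sub
    for sub in splice hogmap_in; do
        if [ -d "$dir/$sub" ]; then
            echo "found optional sub-folder: $sub"
        fi
    done
    echo "layout ok: $dir"
}

# Demo on a throwaway directory (assumed example data, not a real dataset).
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/proteome" "$demo_dir/splice"
check_input_folder "$demo_dir"
```

If the proteomes live elsewhere, the same check would apply to whatever path is passed via --proteome_folder.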
--output_folder Path where all the output should be stored. Defaults to
${params.output_folder}
Profile selection:
-profile FastOMA can be run using several execution profiles. The default
set of available profiles is
- docker Run pipeline using docker containers. Docker needs
to be installed on your system. Containers will be
fetched automatically from dockerhub. See also
additional options '--container_version' and
'--container_name'.
- singularity Run pipeline using singularity. Singularity needs
to be installed on your system. On HPC clusters,
it often needs to be loaded as a separate module.
Containers will be fetched automatically from
dockerhub. See also additional options
'--container_version' and '--container_name'.
- conda Run pipeline in a conda environment. Conda needs
to be installed on your system. The environment
will be created automatically.
- standard Run pipeline on your local system. Mainly intended
for development purposes. All dependencies must be
installed in the calling environment.
- slurm_singularity
Run pipeline using SLURM job scheduler and
singularity containers. This profile can also be a
template for other HPC clusters that use different
schedulers.
- slurm_conda Run pipeline using SLURM job scheduler and conda
environment.
Profiles are defined in nextflow.config and can be extended or
adjusted according to your needs.
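As an illustration of profile selection, the sketch below composes a command line for one of the default profiles listed above and prints it instead of running it. The helper name build_fastoma_cmd and the paths my_input and my_output are hypothetical placeholders. The allow-list covers only the default profiles, so it would need extending if you add your own to nextflow.config.

```shell
# Sketch: compose a FastOMA invocation for a chosen profile and print it.
# build_fastoma_cmd is a hypothetical helper; nothing is executed here.
build_fastoma_cmd() {
    local profile="$1" in_dir="$2" out_dir="$3"
    # Accept only the default profiles named in the help text.
    case "$profile" in
        docker|singularity|conda|standard|slurm_singularity|slurm_conda) ;;
        *) echo "unknown profile: $profile" >&2; return 1 ;;
    esac
    printf 'nextflow run FastOMA.nf -profile %s --input_folder %s --output_folder %s\n' \
        "$profile" "$in_dir" "$out_dir"
}

# Placeholder paths; replace them with your own folders.
build_fastoma_cmd conda my_input my_output
```

Note that -profile takes a single dash because it is a Nextflow option, whereas pipeline parameters such as --input_folder take two.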
Additional options:
--proteome_folder Override the location of the proteomes (default ${params.proteome_folder})
--species_tree Override the location of the species tree file (Newick format).
Defaults to ${params.species_tree}
--splice_folder Override the location of the splice file folder. The splice files must be
named <proteome_file>.splice.
Defaults to ${params.splice_folder}
--omamer_db Path or URL to download the OMAmer database from.
Defaults to ${params.omamer_db}
--hogmap_in Optional path where precomputed omamer mapping files are located.
Defaults to ${params.hogmap_in}
--fasta_header_id_transformer
Transformer that maps the FASTA headers of the input proteomes
to the IDs reported in output files (e.g. orthoxml files).
Defaults to '${params.fasta_header_id_transformer}', and can be set to
- noop : no transformation (input header == output header)
- UniProt : extract the accession from a UniProt header
e.g. '>sp|P68250|1433B_BOVIN' --> 'P68250'
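The effect of the UniProt transformer on the example header can be reproduced with standard text tools. This is only a sketch of the documented mapping, not FastOMA's own implementation. The function name uniprot_accession is an assumption, and headers that do not follow the 'db|accession|name' pattern are not handled.

```shell
# Sketch: mimic the documented UniProt mapping for a single FASTA header.
# uniprot_accession is a hypothetical helper, not part of FastOMA.
uniprot_accession() {
    # Drop the leading '>' and keep the second '|'-separated field.
    printf '%s\n' "$1" | sed 's/^>//' | cut -d'|' -f2
}

uniprot_accession '>sp|P68250|1433B_BOVIN'   # prints P68250
```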
Flags:
--help Display this message
--debug_enabled Store additional information that might be helpful for debugging
in case of a problem with FastOMA.
--report Produce the nextflow report and timeline and store them in
$params.statdir
[9f/d2144c] check_input (1) | 1 of 1 ✔
[b1/bd3de4] omamer_run (WHEAT.fa) | 20 of 20 ✔
[1b/cdad85] infer_roothogs (1) | 1 of 1 ✔
[02/013835] batch_roothogs (1) | 1 of 1 ✔
[cd/612046] hog_big (26) | 31 of 31 ✔
[63/8a10e0] hog_rest (165) | 175 of 175 ✔
[0e/0410b3] collect_subhogs (1) | 1 of 1 ✔
[d5/ab0b4b] ext…ise_ortholog_relations (1) | 1 of 1 ✔
[e8/4b4628] fastoma_report (1) | 1 of 1 ✔