
FastOMA nextflow pipeline

Sina Majidian edited this page Oct 17, 2024 · 2 revisions
    ===========================================
      FastOMA -- PIPELINE
    ===========================================
    Usage:
    Run the pipeline with default parameters:
    nextflow run FastOMA.nf

    Run with user parameters:
    nextflow run FastOMA.nf --input_folder {input.dir}  --output_folder {results.dir}

    Mandatory arguments:
        --input_folder          Input data folder. Defaults to ${params.input_folder}. This folder
                                must contain the proteomes (in a subfolder named 'proteome') and
                                a species tree file. Optionally, the folder may contain
                                 - a sub-folder 'splice' containing splicing variant mappings
                                 - a sub-folder 'hogmap_in' containing precomputed OMAmer
                                   placement results for all proteomes

                                All sub-folders and files can also be placed in other
                                locations if you provide alternative values for them (see the
                                'Additional options' section below).

        --output_folder         Path where all the output should be stored. Defaults to
                                ${params.output_folder}


    Profile selection:
        -profile                FastOMA can be run using several execution profiles. The default
                                set of available profiles is
                                 - docker       Run pipeline using docker containers. Docker needs
                                                to be installed on your system. Containers will be
                                                fetched automatically from dockerhub. See also
                                                additional options '--container_version' and
                                                '--container_name'.

                                 - singularity  Run pipeline using singularity. Singularity needs
                                                to be installed on your system. On HPC clusters,
                                                it often needs to be loaded as a separate module.
                                                Containers will be fetched automatically from
                                                dockerhub. See also additional options
                                                '--container_version' and '--container_name'.

                                 - conda        Run pipeline in a conda environment. Conda needs
                                                to be installed on your system. The environment
                                                will be created automatically.

                                 - standard     Run pipeline on your local system. Mainly intended
                                                for development purposes. All dependencies must be
                                                installed in the calling environment.

                                 - slurm_singularity
                                                Run pipeline using SLURM job scheduler and
                                                singularity containers. This profile can also be a
                                                template for other HPC clusters that use different
                                                schedulers.

                                 - slurm_conda  Run pipeline using SLURM job scheduler and conda
                                                environment.

                                Profiles are defined in nextflow.config and can be extended or
                                adjusted according to your needs.


    Additional options:
        --proteome_folder       Override location of proteomes (default ${params.proteome_folder})
        --species_tree          Override location of species tree file (newick format).
                                Defaults to ${params.species_tree}
        --splice_folder         Override location of splice file folder. The splice files must be
                                named <proteome_file>.splice.
                                Defaults to ${params.splice_folder}
        --omamer_db             Path or URL to download the OMAmer database from.
                                Defaults to ${params.omamer_db}
        --hogmap_in             Optional path where precomputed omamer mapping files are located.
                                Defaults to ${params.hogmap_in}
        --fasta_header_id_transformer
                                Transformer that maps the input proteome FASTA headers to the
                                IDs reported in output files (e.g. orthoxml files).
                                Defaults to '${params.fasta_header_id_transformer}', and can be set to
                                  - noop         : no transformation (input header == output header)
                                  - UniProt      : extract accession from UniProt header
                                                   e.g. '>sp|P68250|1433B_BOVIN' --> 'P68250'

    Flags:
        --help                  Display this message
        --debug_enabled         Store additional information that might be helpful for debugging
                                in case of a problem with FastOMA.
        --report                Produce nextflow report and timeline and store them in
                                ${params.statdir}
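
Before launching a run, it can help to confirm that the input folder matches the layout described under `--input_folder`. The following is a minimal sketch and not part of FastOMA: the function name `check_fastoma_input` is hypothetical, and the species tree path is passed explicitly because the help text does not fix its file name.

```shell
#!/usr/bin/env bash
# Sketch only: check_fastoma_input is a hypothetical helper, not shipped with FastOMA.
# Usage: check_fastoma_input <input_folder> <species_tree_path>
check_fastoma_input() {
    local input_dir="$1" tree="$2" status=0

    # Mandatory: proteomes live in a sub-folder named 'proteome'.
    if [ ! -d "$input_dir/proteome" ]; then
        echo "missing: $input_dir/proteome" >&2
        status=1
    fi

    # Mandatory: a species tree file (newick format).
    if [ ! -f "$tree" ]; then
        echo "missing: species tree $tree" >&2
        status=1
    fi

    # Optional sub-folders: only report the ones that are present.
    local sub
    for sub in splice hogmap_in; do
        if [ -d "$input_dir/$sub" ]; then
            echo "found optional: $sub"
        fi
    done

    return "$status"
}
```

The function returns a non-zero status when a mandatory item is missing, so it can gate the pipeline call, for example `check_fastoma_input {input.dir} {tree.file} && nextflow run FastOMA.nf --input_folder {input.dir} --output_folder {results.dir}`, where `{tree.file}` is a placeholder for your own species tree path.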



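The `UniProt` choice of `--fasta_header_id_transformer` is described as mapping a header such as `>sp|P68250|1433B_BOVIN` to the accession `P68250`. As an illustration only, the same effect can be reproduced with `awk` by taking the second `|`-separated field of each header line. This is a sketch of the described behaviour, not FastOMA's actual implementation, and the function name `uniprot_accessions` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: mimics the effect described for the 'UniProt' transformer.
# uniprot_accessions is a hypothetical helper; it reads FASTA from stdin.
uniprot_accessions() {
    # Header lines start with '>'; the accession is the second '|'-separated field.
    # Headers without at least three fields are skipped.
    awk -F'|' '/^>/ && NF >= 3 { print $2 }'
}

# The header is the example from the help text; 'MKV' is a dummy sequence line.
printf '>sp|P68250|1433B_BOVIN\nMKV\n' | uniprot_accessions
# prints P68250
```

Running this over a proteome file before a real run gives a quick preview of the IDs that would appear in the output if the headers follow the UniProt convention.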
Example output


[9f/d2144c] check_input (1)                | 1 of 1 ✔
[b1/bd3de4] omamer_run (WHEAT.fa)          | 20 of 20 ✔
[1b/cdad85] infer_roothogs (1)             | 1 of 1 ✔
[02/013835] batch_roothogs (1)             | 1 of 1 ✔
[cd/612046] hog_big (26)                   | 31 of 31 ✔
[63/8a10e0] hog_rest (165)                 | 175 of 175 ✔
[0e/0410b3] collect_subhogs (1)            | 1 of 1 ✔
[d5/ab0b4b] ext…ise_ortholog_relations (1) | 1 of 1 ✔
[e8/4b4628] fastoma_report (1)             | 1 of 1 ✔

Dependency graph of FastOMA pipeline

[Image: DAG_fastOMA]