this repo contains multiple nextflow workflows for somatic NGS analysis.
we focus on modularity and simplicity.
the implementation is inspired by various nf-core modules and the nf-core/sarek pipeline.
there are multiple ways of running the workflow. the simplest way is to run (portable) from a single folder, but WARNING this pipeline requires a few terabytes of free storage.
for now we recommend running on a single host with conda.
Dependencies:
- nextflow
- conda
git clone --recursive https://github.com/feiloo/nextflow_workflows.git
mkdir reference nextflow_calldir nextflow_workdir nextflow_outputdir cache
export NEXTFLOW_MODULES="$(pwd)/nextflow_workflows/modules"
export NGS_REFERENCE_DIR="$(pwd)/reference"
export NEXTFLOW_CALLDIR="$(pwd)/nextflow_calldir"
export NEXTFLOW_WORKDIR_CUSTOM="$(pwd)/nextflow_workdir"
export NEXTFLOW_OUTPUTDIR_CUSTOM="$(pwd)/nextflow_outputdir"
export NEXTFLOW_STOREDIR="$(pwd)/cache"
# requires boto3
python scripts/download_data.py reference $NGS_REFERENCE_DIR
the pipeline has multiple variations, the main variation is "align_interpret"
it requires a samplesheet.csv like this:
sample_id,normal_read1,normal_read2,tumor_read1,tumor_read2
SAMPLENAME-24,/data/testdata/SAMPLENAME-24_N_1.fq.gz,/data/testdata/SAMPLENAME-24_N_2.fq.gz,/data/testdata/SAMPLENAME-24_T_1.fq.gz,/data/testdata/SAMPLENAME-24_T_2.fq.gz
now start the pipeline.
cd $NEXTFLOW_CALLDIR && nextflow run $NEXTFLOW_MODULES/ukb_main_workflow/main.nf \
-profile standard \
--workflow_variation align_interpret \
--samplesheet test_sequence_alignment_samplesheet.csv
see the environment variables and nextflow-configs like modules/ukb_main_workflow/user.config
for development, you can set the NEXTFLOW_MODULES variable to this repos modules directory:
export NEXTFLOW_MODULES="$(pwd)/modules"
development dependencies:
- meson
the meson buildsystem enables additional steps for building, testing and packaging.
meson uses out of source build directories, which has various advantages.
create one with:
meson setup --wipe -D test_datadir=path_to_datadir/ -D external_deps=path_to_external_deps/ path_to_builddir/
cd /path_to_builddir/
meson compile
test setup: todo
now configure the tests with them:
cd /path_to_builddir
meson configure -Dtest_datadir=/path_to_test_samplesheet_dir
run tests
meson test
instead of conda or containers, the required tools within the pipeline can be installed locally, see:
scripts/build_deps.sh
scripts/install_deps.sh
another way is to use a single container for all processes or for the whole pipeline, see:
scripts/gen_container.sh
export NXF_HOME='$HOME/.nextflow'
export NXF_ASSETS='$NXF_HOME/assets'
export NXF_OPTS='-Dlog=/path/nextflow_logs'
export NXF_TEMP=/path/nextflow_temp/
export NXF_LOG_FILE='/path/nextflow_logs'
export NXF_PLUGINS_DIR='/path/nextflow_plugins'
export NEXTFLOW_MODULES="$(pwd)/modules"