Hi, I am just having a look at your pipeline and it seems really interesting. I gave it a test run and it did produce some results. However, I am having an issue with BUSCO. I am running the Nextflow pipeline with Singularity and I would like to use a pre-installed BUSCO database, made available by our system administrators. This database location is of course read-only. I did specify the BUSCO database in the Nextflow config:
but I get the following error: OSError: [Errno 30] Read-only file system: 'busco/file_versions.tsv'.
I believe this issue is described here: https://gitlab.com/ezlab/busco/-/issues/560
So it could be solved by passing '--offline' to busco, but there is no place to put this flag in your Nextflow config file.
What would you suggest?
Here is the pipeline output for my command:
nextflow run plant-food-research-open/assemblyqc -revision 2.2.1 -profile mpcdf,raven --input assemblysheet.csv --outdir result --busco_lineage_datasets aves_odb10
...
executor > slurm (6)
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:GUNZIP_FASTA -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:GUNZIP_GFF3 -
[64/4ebaac] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTAVALIDATOR (bPacMac) [100%] 1 of 1 ✔
[34/995987] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:SEQKIT_RMDUP (bPacMac) [100%] 1 of 1 ✔
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:GFF3_GT_GFF3_GFF3VALIDATOR_STAT:GT_GFF3 -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:GFF3_GT_GFF3_GFF3VALIDATOR_STAT:GT_GFF3VALIDATOR -
[27/b6465f] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:GFF3_GT_GFF3_GFF3VALIDATOR_STAT:SAMTOOLS_FAIDX (yahs_scaffolds_final.fa) [100%] 1 of 1 ✔
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:GFF3_GT_GFF3_GFF3VALIDATOR_STAT:GT_STAT -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FCS_FCSADAPTOR -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:NCBI_FCS_GX:NCBI_FCS_GX_SETUP_SAMPLE -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:NCBI_FCS_GX:NCBI_FCS_GX_SCREEN_SAMPLES -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:NCBI_FCS_GX:NCBI_FCS_GX_KRONA_PLOT -
[75/bb01d8] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:TAG_ASSEMBLY (bPacMac) [100%] 1 of 1 ✔
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FETCHNGS:CUSTOM_SRATOOLSNCBISETTINGS -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FETCHNGS:SRATOOLS_PREFETCH -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FETCHNGS:SRATOOLS_FASTERQDUMP -
[ad/6b0ce6] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:ASSEMBLATHON_STATS (bPacMac) [100%] 1 of 1 ✔
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:GFASTATS -
[66/d51db8] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_GXF_BUSCO_PLOT:BUSCO_ASSEMBLY (bPacMac) [100%] 1 of 1, failed: 1 ✘
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_GXF_BUSCO_PLOT:PLOT_ASSEMBLY -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_GXF_BUSCO_PLOT:EXTRACT_PROTEINS -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_GXF_BUSCO_PLOT:BUSCO_ANNOTATION -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_GXF_BUSCO_PLOT:PLOT_ANNOTATION -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_EXPLORE_SEARCH_PLOT_TIDK:FILTER_BY_LENGTH -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_EXPLORE_SEARCH_PLOT_TIDK:SORT_BY_LENGTH -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_EXPLORE_SEARCH_PLOT_TIDK:TIDK_EXPLORE -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_EXPLORE_SEARCH_PLOT_TIDK:TIDK_SEARCH_APRIORI -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_EXPLORE_SEARCH_PLOT_TIDK:TIDK_SEARCH_APOSTERIORI -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_EXPLORE_SEARCH_PLOT_TIDK:TIDK_PLOT_APRIORI -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_EXPLORE_SEARCH_PLOT_TIDK:TIDK_PLOT_APOSTERIORI -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_LTRRETRIEVER_LAI:UNMASK_IF_ANY -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_LTRRETRIEVER_LAI:CUSTOM_SHORTENFASTAIDS -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_LTRRETRIEVER_LAI:LTRHARVEST -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_LTRRETRIEVER_LAI:LTRFINDER -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_LTRRETRIEVER_LAI:CAT_CAT -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_LTRRETRIEVER_LAI:LTRRETRIEVER_LTRRETRIEVER -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_LTRRETRIEVER_LAI:LTRRETRIEVER_LAI -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_LTRRETRIEVER_LAI:CUSTOM_RESTOREGFFIDS -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_KRAKEN2:UNTAR -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_KRAKEN2:KRAKEN2 -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_KRAKEN2:KRAKEN2_KRONA_PLOT -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FQ2HIC:FASTQ_FASTQC_UMITOOLS_FASTP:FASTQC_RAW -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FQ2HIC:FASTQ_FASTQC_UMITOOLS_FASTP:FASTP -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FQ2HIC:FASTQ_FASTQC_UMITOOLS_FASTP:FASTQC_TRIM -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FQ2HIC:SEQKIT_SORT -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FQ2HIC:FASTQ_BWA_MEM_SAMBLASTER:BWA_INDEX -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FQ2HIC:FASTQ_BWA_MEM_SAMBLASTER:BWA_MEM -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FQ2HIC:FASTQ_BWA_MEM_SAMBLASTER:SAMBLASTER -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FQ2HIC:HICQC -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FQ2HIC:MAKEAGPFROMFASTA -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FQ2HIC:AGP2ASSEMBLY -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FQ2HIC:ASSEMBLY2BEDPE -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FQ2HIC:MATLOCK_BAM2_JUICER -
[- ] process > PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FQ2HIC:JUICER_SORT -
Plus 30 more processes waiting for tasks…
Execution cancelled -- Finishing pending tasks before exit
-[plant-food-research-open/assemblyqc] Pipeline completed with errors-
ERROR ~ Error executing process > 'PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_GXF_BUSCO_PLOT:BUSCO_ASSEMBLY (bPacMac)'
Caused by:
Process `PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_GXF_BUSCO_PLOT:BUSCO_ASSEMBLY (bPacMac)` terminated with an error exit status (1)
Command executed:
# Nextflow changes the container --entrypoint to /bin/bash (container default entrypoint: /usr/local/env-execute)
# Check for container variable initialisation script and source it.
if [ -f "/usr/local/env-activate.sh" ]; then
set +u # Otherwise, errors out because of various unbound variables
. "/usr/local/env-activate.sh"
set -u
fi
# If the augustus config directory is not writable, then copy to writeable area
if [ ! -w "${AUGUSTUS_CONFIG_PATH}" ]; then
# Create writable tmp directory for augustus
AUG_CONF_DIR=$( mktemp -d -p $PWD )
cp -r $AUGUSTUS_CONFIG_PATH/* $AUG_CONF_DIR
export AUGUSTUS_CONFIG_PATH=$AUG_CONF_DIR
echo "New AUGUSTUS_CONFIG_PATH=${AUGUSTUS_CONFIG_PATH}"
fi
# Ensure the input is uncompressed
INPUT_SEQS=input_seqs
mkdir "$INPUT_SEQS"
cd "$INPUT_SEQS"
for FASTA in ../tmp_input/*; do
if [ "${FASTA##*.}" == 'gz' ]; then
gzip -cdf "$FASTA" > $( basename "$FASTA" .gz )
else
ln -s "$FASTA" .
fi
done
cd ..
busco \
--cpu 6 \
--in "$INPUT_SEQS" \
--out bPacMac-aves_odb10-busco \
--mode genome \
--lineage_dataset aves_odb10 \
--download_path busco \
\
--metaeuk
# clean up
rm -rf "$INPUT_SEQS"
# Move files to avoid staging/publishing issues
mv bPacMac-aves_odb10-busco/batch_summary.txt bPacMac-aves_odb10-busco.batch_summary.txt
mv bPacMac-aves_odb10-busco/*/short_summary.*.{json,txt} . || echo "Short summaries were not available: No genes were found."
cat <<-END_VERSIONS > versions.yml
"PLANTFOODRESEARCHOPEN_ASSEMBLYQC:ASSEMBLYQC:FASTA_GXF_BUSCO_PLOT:BUSCO_ASSEMBLY":
busco: $( busco --version 2>&1 | sed 's/^BUSCO //' )
END_VERSIONS
Command exit status:
1
Command output:
2025-01-31 11:49:32 INFO: ***** Start a BUSCO v5.7.1 analysis, current time: 01/31/2025 11:49:32 *****
2025-01-31 11:49:32 INFO: Configuring BUSCO with local environment
2025-01-31 11:49:32 INFO: Running genome mode
2025-01-31 11:49:32 INFO: Downloading information on latest versions of BUSCO data...
Command error:
INFO: Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
INFO: Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
INFO: Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
INFO: fuse2fs not found, will not be able to mount EXT3 filesystems
Traceback (most recent call last):
File "/usr/local/bin/busco", line 54, in <module>
run_BUSCO.main()
File "/usr/local/lib/python3.7/site-packages/busco/run_BUSCO.py", line 502, in main
busco_run.run()
File "/usr/local/lib/python3.7/site-packages/busco/run_BUSCO.py", line 68, in run
self.load_config()
File "/usr/local/lib/python3.7/site-packages/busco/run_BUSCO.py", line 60, in load_config
self.config_manager.load_busco_config_main()
File "/usr/local/lib/python3.7/site-packages/busco/BuscoLogger.py", line 62, in wrapped_func
self.retval = func(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/busco/ConfigManager.py", line 63, in load_busco_config_main
self.config_main.validate()
File "/usr/local/lib/python3.7/site-packages/busco/BuscoConfig.py", line 640, in validate
self._init_downloader()
File "/usr/local/lib/python3.7/site-packages/busco/BuscoConfig.py", line 440, in _init_downloader
self.downloader = BuscoDownloadManager(self)
File "/usr/local/lib/python3.7/site-packages/busco/BuscoDownloadManager.py", line 53, in __init__
self._obtain_versions_file()
File "/usr/local/lib/python3.7/site-packages/busco/BuscoLogger.py", line 62, in wrapped_func
self.retval = func(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/busco/BuscoDownloadManager.py", line 77, in _obtain_versions_file
urllib.request.urlretrieve(remote_filepath, local_filepath)
File "/usr/local/lib/python3.7/urllib/request.py", line 257, in urlretrieve
tfp = open(filename, 'wb')
OSError: [Errno 30] Read-only file system: 'busco/file_versions.tsv'
Work dir:
/raven/ptmp/luelze/nextflow/assemblyqc/work/66/d51db878fc06c903e20e457c9f0974
Container:
/u/luelze/sw/nextflow/cache/depot.galaxyproject.org-singularity-busco-5.7.1--pyhdfd78af_0.img
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
-- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting
-- Check '.nextflow.log' file for details
Thank you @AlcaArctica for the issue. You are right: BUSCO offline mode is not exposed through the pipeline parameters. Nonetheless, you can turn it on by creating a custom.config file with the following contents and passing it via the -c parameter.
custom.config
process {
    withName: 'BUSCO_BUSCO' {
        ext.args = '--metaeuk --offline'
    }
}
Updated command
nextflow run plant-food-research-open/assemblyqc -revision 2.2.1 -profile mpcdf,raven -c /path/to/custom.config --input assemblysheet.csv --outdir result --busco_lineage_datasets aves_odb10
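Since offline mode stops BUSCO from downloading anything, it may be worth confirming beforehand that the read-only database already contains the requested lineage. The following is only a minimal sketch: the helper name check_busco_db is hypothetical (it is not part of assemblyqc or BUSCO), and it assumes the usual BUSCO download layout of <download_path>/lineages/<lineage>.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of assemblyqc or BUSCO.
# Assumption: the database follows the usual BUSCO download layout,
# i.e. <download_path>/lineages/<lineage>.
check_busco_db() {
    local db_dir="$1" lineage="$2"
    local lineage_dir="${db_dir}/lineages/${lineage}"

    # The lineage directory must exist and be readable; it does not
    # need to be writable when BUSCO runs with --offline.
    if [ -d "$lineage_dir" ] && [ -r "$lineage_dir" ]; then
        echo "ok: ${lineage} found in ${db_dir}"
        return 0
    fi

    echo "missing: ${lineage_dir}" >&2
    return 1
}
```

For example, `check_busco_db /path/to/busco_downloads aves_odb10`, where the path is a placeholder for wherever your administrators installed the database.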