Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mapping issue #82

Open
francesco-ricci opened this issue Jul 13, 2024 · 7 comments
Open

mapping issue #82

francesco-ricci opened this issue Jul 13, 2024 · 7 comments

Comments

@francesco-ricci
Copy link

Hi VIni,

I keep getting this error message:

[Sat Jul 13 01:49:37 2024]
Error in rule concatenate_contigs:
jobid: 56
input: output/assembly/megahit/coassembly/coassembly.contigs_renamed.fa
output: output/mapping/cobinning/cobinning_contigs_catalogue.fna.gz
log: output/logs/mapping/concatenate_contigs/cobinning.log (check log file(s) for error details)
conda-env: /home/fricci/rp24/fra/software/miniconda3/envs/metaphor/lib/python3.11/site-packages/metaphor/config/conda/d2f1a35ff4b889a5c9b33f580fd56824_
shell:

concatenate.py -m 1000 output/mapping/cobinning/cobinning_contigs_catalogue.fna.gz output/assembly/megahit/coassembly/coassembly.contigs_renamed.fa &> output/logs/mapping/concatenate_contigs/cobinning.log

    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

MissingOutputException in rule assembly_report in file /home/fricci/rp24/fra/software/miniconda3/envs/metaphor/lib/python3.11/site-packages/metaphor/workflow/rules/assembly.smk, line 103:
Job 52 completed successfully, but some output files are missing. Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
output/assembly/assembly_report/avg_length.png (missing locally, parent dir contents: )
output/assembly/assembly_report/max_length.png (missing locally, parent dir contents: )
output/assembly/assembly_report/median_length.png (missing locally, parent dir contents: )
output/assembly/assembly_report/no_bp.png (missing locally, parent dir contents: )
output/assembly/assembly_report/no_contigs.png (missing locally, parent dir contents: )
output/assembly/assembly_report/n50.png (missing locally, parent dir contents: )
output/assembly/assembly_report/assembly_report.tsv (missing locally, parent dir contents: )
Removing output files of failed job concatenate_contigs since they might be corrupted:
output/mapping/cobinning/cobinning_contigs_catalogue.fna.gz
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-07-10T173408.447387.snakemake.log
WorkflowError:
At least one job did not complete successfully.
Traceback (most recent call last):
File "/home/fricci/rp24/fra/software/miniconda3/envs/metaphor/bin/metaphor", line 10, in
sys.exit(main())
^^^^^^
File "/home/fricci/rp24/fra/software/miniconda3/envs/metaphor/lib/python3.11/site-packages/metaphor/cli/cli.py", line 278, in main
args.func(args)
File "/home/fricci/rp24/fra/software/miniconda3/envs/metaphor/lib/python3.11/site-packages/metaphor/cli/execute.py", line 109, in main
retcode = run_cmd(cmd)
^^^^^^^^^^^^
File "/home/fricci/rp24/fra/software/miniconda3/envs/metaphor/lib/python3.11/site-packages/metaphor/utils.py", line 92, in run_cmd
retcode = check_call(cmd.split())
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fricci/rp24/fra/software/miniconda3/envs/metaphor/lib/python3.11/subprocess.py", line 413, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['snakemake', '--snakefile', '/home/fricci/rp24/fra/software/miniconda3/envs/metaphor/lib/python3.11/site-packages/metaphor/workflow/Snakefile', '--configfile', 'metaphor_set>

This is what the output of logs/mapping/cobinning.log looks like:

Traceback (most recent call last):
File "/home/fricci/rp24/fra/software/miniconda3/envs/checkm/bin/concatenate.py", line 42, in
vamb.vambtools.concatenate_fasta(filehandle, args.inpaths, minlength=args.minlength,
File "/home/fricci/rp24/fra/software/miniconda3/envs/checkm/lib/python3.10/site-packages/vamb/vambtools.py", line 555, in concatenate_fasta
for entry in entries:
File "/home/fricci/rp24/fra/software/miniconda3/envs/checkm/lib/python3.10/site-packages/vamb/vambtools.py", line 272, in byte_iterfasta
raise ValueError('Empty or outcommented file')
ValueError: Empty or outcommented file

Can you help me with this?
Thanks
Francesco

@francesco-ricci
Copy link
Author

Actually I found that for some reason the file output/assembly/megahit/coassembly/coassembly.contigs_renamed.fa was 0 bytes. I am running metaphor again, let's see if it fixes it.

I'll keep you posted!

@francesco-ricci
Copy link
Author

Unfortunately, it does not work. Metaphor keeps creating an empty assembly/megahit folder (ratherthan empty, actually files are there but are 0 bytes). Can you please help me troubleshooting this Vini?

@vinisalazar
Copy link
Owner

Hi @francesco-ricci, this looks like a problem with the assembly. Could you please post the log of the assembly/megahit process here?

Thank you!

@francesco-ricci
Copy link
Author

Hey Vini,

thanks for your help. I restarted the full pipeline, if it'll bump into this again I'll post it.

Thanks
Francesco

@francesco-ricci
Copy link
Author

Hi Vini,

the problem happened again. This is the assembly report:

2024-07-16 07:51:36 - MEGAHIT v1.2.9
2024-07-16 07:51:36 - Using megahit_core with POPCNT and BMI2 support
2024-07-16 07:51:36 - Convert reads to binary library
2024-07-16 07:51:36 - command /fs04/rp24/fra/software/miniconda3/envs/metaphor/lib/python3.11/site-packages/metaphor/config/conda/f98ddf6c1cb52f3d784c38b4fd62b9fd_/b>
2024-07-16 07:51:36 - b'INFO sequence/io/sequence_lib.cpp : 75 - Lib 0 (/fs04/rp24/fra/analyses/coral_chemosynthesis/tandon2023/p_lutea/output/qc/filtered/SRR208>
2024-07-16 07:51:36 - b'INFO sequence/io/sequence_lib.cpp : 75 - Lib 1 (/fs04/rp24/fra/analyses/coral_chemosynthesis/tandon2023/p_lutea/output/qc/filtered/SRR208>
2024-07-16 07:51:36 - b'INFO sequence/io/sequence_lib.cpp : 75 - Lib 2 (/fs04/rp24/fra/analyses/coral_chemosynthesis/tandon2023/p_lutea/output/qc/filtered/SRR208>
2024-07-16 07:51:36 - b'INFO sequence/io/sequence_lib.cpp : 75 - Lib 3 (/fs04/rp24/fra/analyses/coral_chemosynthesis/tandon2023/p_lutea/output/qc/filtered/SRR208>
2024-07-16 07:51:36 - b'INFO sequence/io/sequence_lib.cpp : 75 - Lib 4 (/fs04/rp24/fra/analyses/coral_chemosynthesis/tandon2023/p_lutea/output/qc/filtered/SRR208>
2024-07-16 07:51:36 - b'INFO sequence/io/sequence_lib.cpp : 75 - Lib 5 (/fs04/rp24/fra/analyses/coral_chemosynthesis/tandon2023/p_lutea/output/qc/filtered/SRR208>
2024-07-16 07:51:36 - b'INFO utils/utils.h : 152 - Real: 0.0160\tuser: 0.0030\tsys: 0.0030\tmaxrss: 13076'
2024-07-16 07:51:36 - k-max reset to: 127
2024-07-16 07:51:36 - Start assembly. Number of CPU threads 128
2024-07-16 07:51:36 - k list: 27,37,47,57,67,77,87,97,107,117,127
2024-07-16 07:51:36 - Memory used: 1947211993497
2024-07-16 07:51:36 - Extracting solid (k+1)-mers and building sdbg for k = 27
2024-07-16 07:51:36 - command /fs04/rp24/fra/software/miniconda3/envs/metaphor/lib/python3.11/site-packages/metaphor/config/conda/f98ddf6c1cb52f3d784c38b4fd62b9fd_/b>
2024-07-16 07:51:36 - b'INFO sequence/io/sequence_lib.cpp : 115 - Before reading, sizeof seq_package: 372'
2024-07-16 07:51:36 - b'INFO sequence/io/sequence_lib.cpp : 117 - After reading, sizeof seq_package: 372'
2024-07-16 07:51:36 - b'INFO sorting/read_to_sdbg_s1.cpp : 103 - 8 reads, 150 max read length, 1200 total bases'
2024-07-16 07:51:36 - b'INFO sorting/read_to_sdbg_s1.cpp : 109 - 2 words per substring'
2024-07-16 07:51:36 - b'INFO sorting/read_to_sdbg_s1.cpp : 123 - Number of files for mercy candidate reads: 1'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 148 - Preparing data...'
2024-07-16 07:51:36 - b'INFO sorting/read_to_sdbg_s2.cpp : 102 - 2 words per substring, words per dummy node ($v): 2'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 153 - Preparing data... Done. Time elapsed: 0.0358'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 156 - Preparing partitions and calculating bucket sizes...'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 80 - Minimum memory required: 496 bytes'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 138 - Lv1 items: 266, Lv2 items: 384'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 139 - Memory of derived class: 412, Memory for Lv1+Lv2: 4136'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 165 - Preparing partitions and calculating bucket sizes... Done. Time elapsed: 0.0396'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 172 - Start main loop...'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 185 - Lv1 scanning from bucket 0 to 517'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 192 - Lv1 scanning done. Large diff: 0. Time elapsed: 0.0016'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 198 - Lv1 fetching & sorting done. Time elapsed: 0.0018'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 185 - Lv1 scanning from bucket 517 to 1295'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 192 - Lv1 scanning done. Large diff: 0. Time elapsed: 0.0004'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 198 - Lv1 fetching & sorting done. Time elapsed: 0.0009'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 185 - Lv1 scanning from bucket 1295 to 65536'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 192 - Lv1 scanning done. Large diff: 0. Time elapsed: 0.0135'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 198 - Lv1 fetching & sorting done. Time elapsed: 0.0026'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 204 - Main loop done. Time elapsed: 0.0215'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 207 - Postprocessing...'
2024-07-16 07:51:36 - b'INFO sorting/read_to_sdbg_s2.cpp : 618 - Number of $ A C G T A- C- G- T-:'
2024-07-16 07:51:36 - b'INFO sorting/read_to_sdbg_s2.cpp : 619 - 16 498 507 491 488 0 0 0 0'
2024-07-16 07:51:36 - b'INFO sorting/read_to_sdbg_s2.cpp : 624 - Total number of edges: 2000'
2024-07-16 07:51:36 - b'INFO sorting/read_to_sdbg_s2.cpp : 625 - Total number of ONEs: 1984'
2024-07-16 07:51:36 - b'INFO sorting/read_to_sdbg_s2.cpp : 626 - Total number of $v edges: 16'
2024-07-16 07:51:36 - b'INFO sorting/base_engine.cpp : 210 - Postprocess done. Time elapsed: 0.0781'
2024-07-16 07:51:36 - b'INFO utils/utils.h : 152 - Real: 0.1886\tuser: 3.8188\tsys: 1.2315\tmaxrss: 495820'
2024-07-16 07:51:36 - Assemble contigs from SdBG for k = 27
2024-07-16 07:51:36 - command /fs04/rp24/fra/software/miniconda3/envs/metaphor/lib/python3.11/site-packages/metaphor/config/conda/f98ddf6c1cb52f3d784c38b4fd62b9fd_/b>
2024-07-16 07:51:36 - b'INFO main_assemble.cpp : 129 - Loading succinct de Bruijn graph: /fs04/rp24/fra/analyses/coral_chemosynthesis/tandon2023/p_lute>
2024-07-16 07:51:36 - b'INFO main_assemble.cpp : 133 - Number of Edges: 2000; K value: 27'
2024-07-16 07:51:36 - b'INFO main_assemble.cpp : 140 - Number of CPU threads: 128'
2024-07-16 07:51:36 - b'INFO assembly/sdbg_pruning.cpp : 160 - Removing tips with length less than 2; Accumulated tips removed: 0; time elapsed: 0.0011'
2024-07-16 07:51:36 - b'INFO assembly/sdbg_pruning.cpp : 160 - Removing tips with length less than 4; Accumulated tips removed: 0; time elapsed: 0.0001'
2024-07-16 07:51:36 - b'INFO assembly/sdbg_pruning.cpp : 160 - Removing tips with length less than 8; Accumulated tips removed: 0; time elapsed: 0.0001'
2024-07-16 07:51:36 - b'INFO assembly/sdbg_pruning.cpp : 160 - Removing tips with length less than 16; Accumulated tips removed: 0; time elapsed: 0.0001'
2024-07-16 07:51:36 - b'INFO assembly/sdbg_pruning.cpp : 160 - Removing tips with length less than 32; Accumulated tips removed: 0; time elapsed: 0.0001'
2024-07-16 07:51:36 - b'INFO assembly/sdbg_pruning.cpp : 169 - Removing tips with length less than 54; Accumulated tips removed: 0; time elapsed: 0.0001'
2024-07-16 07:51:36 - b'INFO main_assemble.cpp : 158 - Tips removal done! Time elapsed(sec): 0.008'
2024-07-16 07:51:36 - b'INFO assembly/unitig_graph.cpp : 83 - Graph size without loops: 8, palindrome: 0'
2024-07-16 07:51:36 - b'INFO main_assemble.cpp : 166 - unitig graph size: 8, time for building: 0.003'
2024-07-16 07:51:36 - b'INFO assembly/contig_stat.h : 40 - Max: 150, Min: 150, N50: 150, number contigs: 8, number isolated: 8, number looped: 0, total siz>
2024-07-16 07:51:36 - b'INFO main_assemble.cpp : 184 - Graph cleaning round 1'
2024-07-16 07:51:36 - b'INFO main_assemble.cpp : 200 - Number of bubbles removed: 0, Time elapsed(sec): 0.002'
2024-07-16 07:51:36 - b'INFO main_assemble.cpp : 210 - Number of complex bubbles removed: 0, Time elapsed(sec): 0.000210'
2024-07-16 07:51:36 - b'INFO main_assemble.cpp : 221 - Number unitigs disconnected: 0, time: 0.000'
2024-07-16 07:51:36 - b'INFO main_assemble.cpp : 245 - Unitigs removed in excessive pruning: 0, time: 0.000'
2024-07-16 07:51:36 - b'INFO assembly/contig_stat.h : 40 - Max: 150, Min: 150, N50: 150, number contigs: 8, number isolated: 8, number looped: 0, total siz>
2024-07-16 07:51:36 - b'INFO main_assemble.cpp : 267 - Time to output: 0.001873'
2024-07-16 07:51:36 - b'INFO main_assemble.cpp : 286 - Number of local low depth unitigs removed: 0, complex bubbles removed: 0, time: 0.000456'
2024-07-16 07:51:36 - b'INFO assembly/contig_stat.h : 40 - Max: 150, Min: 150, N50: 150, number contigs: 8, number isolated: 8, number looped: 0, total siz>
2024-07-16 07:51:36 - b'INFO utils/utils.h : 152 - Real: 0.1420\tuser: 2.0403\tsys: 0.5743\tmaxrss: 21784'
2024-07-16 07:51:37 - Local assembly for k = 27
2024-07-16 07:51:37 - command /fs04/rp24/fra/software/miniconda3/envs/metaphor/lib/python3.11/site-packages/metaphor/config/conda/f98ddf6c1cb52f3d784c38b4fd62b9fd_/b>
2024-07-16 07:51:37 - b'INFO localasm/hash_mapper.cpp : 99 - Number of contigs: 0, index size: 0'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 315 - Hash mapper construction time elapsed: 0.004956'
2024-07-16 07:51:37 - b'INFO sequence/io/sequence_lib.cpp : 115 - Before reading, sizeof seq_package: 372'
2024-07-16 07:51:37 - b'INFO sequence/io/sequence_lib.cpp : 117 - After reading, sizeof seq_package: 372'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 324 - Read lib time elapsed: 0.000929'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 133 - Lib 0, insert size: 0.00 sd: -nan'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 133 - Lib 1, insert size: 0.00 sd: -nan'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 133 - Lib 2, insert size: 0.00 sd: -nan'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 133 - Lib 3, insert size: 0.00 sd: -nan'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 133 - Lib 4, insert size: 0.00 sd: -nan'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 133 - Lib 5, insert size: 0.00 sd: -nan'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 330 - Insert size estimation time elapsed: 0.001655'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 218 - Lib 0: total 0 reads, aligned 0, added 0 reads to local assembly'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 218 - Lib 1: total 0 reads, aligned 0, added 0 reads to local assembly'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 218 - Lib 2: total 2 reads, aligned 0, added 0 reads to local assembly'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 218 - Lib 3: total 2 reads, aligned 0, added 0 reads to local assembly'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 218 - Lib 4: total 2 reads, aligned 0, added 0 reads to local assembly'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 218 - Lib 5: total 2 reads, aligned 0, added 0 reads to local assembly'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 338 - Mapping time elapsed: 0.002625'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 232 - Minimum number of reads to do local assembly: 0'
2024-07-16 07:51:37 - b'INFO localasm/local_assemble.cpp : 346 - Local assembly time elapsed: 0.005406'
2024-07-16 07:51:37 - b'INFO utils/utils.h : 152 - Real: 0.0159\tuser: 1.1029\tsys: 0.2242\tmaxrss: 13120'
2024-07-16 07:51:37 - Extract iterative edges from k = 27 to 37
2024-07-16 07:51:37 - command /fs04/rp24/fra/software/miniconda3/envs/metaphor/lib/python3.11/site-packages/metaphor/config/conda/f98ddf6c1cb52f3d784c38b4fd62b9fd_/b>
2024-07-16 07:51:37 - b'INFO main_iterate.cpp : 190 - Selected kmer type size for k: 8'
2024-07-16 07:51:37 - b'INFO main_iterate.cpp : 123 - Selected kmer type size for next k: 12'
2024-07-16 07:51:37 - b'INFO main_iterate.cpp : 138 - Processed: 8, aligned: 0. Iterative edges: 0'
2024-07-16 07:51:37 - b'INFO main_iterate.cpp : 142 - Total: 8, aligned: 0. Iterative edges: 0'
2024-07-16 07:51:37 - b'INFO utils/utils.h : 152 - Real: 0.0124\tuser: 0.7277\tsys: 0.2075\tmaxrss: 13120'
2024-07-16 07:51:37 - Merging to output final contigs
2024-07-16 07:51:37 - 0 contigs, total 0 bp, min 0 bp, max 0 bp, avg 0 bp, N50 0 bp
2024-07-16 07:51:37 - ALL DONE. Time elapsed: 0.615160 seconds

@francesco-ricci
Copy link
Author

I am running this analysis on a coral dataset using host step removal. This is the only reason why I could think the low number of aligned reads.

@vinisalazar
Copy link
Owner

@francesco-ricci that could very well be the cause. What are the sizes of the input files going into the assembler?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants