appropriate basecaller_cfg not found #115

Open
rowi2024 opened this issue Nov 7, 2024 · 2 comments
Labels
question Further information is requested

Comments

@rowi2024

rowi2024 commented Nov 7, 2024

Ask away!

Hello,
I'm trying to run some older data through the wf-artic pipeline. I used Guppy to basecall this data with model
dna_r9.4.1_450bps_sup.cfg.

image

Epi2me version:
image

When I run my Nextflow command, I get an error saying that it cannot automatically determine the basecalling model used. Why is that, and where should this information be found?
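One way to check whether the reads record a basecalling model at all is to look at the `key=value` tags in the FASTQ header lines. The snippet below is only a rough sketch: it assumes the model, if present, is stored as a header tag whose key contains the word "model". The example header and the key name `basecall_model_version_id` are illustrative assumptions, not taken from this data set or confirmed against the workflow.

```python
from typing import Dict


def parse_fastq_header_tags(header: str) -> Dict[str, str]:
    """Return the key=value tags that follow the read ID in a FASTQ header line."""
    tags: Dict[str, str] = {}
    # Drop the leading '@', skip the read ID, and keep only key=value tokens.
    for token in header.lstrip("@").split()[1:]:
        key, sep, value = token.partition("=")
        if sep:
            tags[key] = value
    return tags


# Hypothetical header, for illustration only (not from the data in this issue).
example = "@read-0001 runid=abc123 ch=12 basecall_model_version_id=example_model"
tags = parse_fastq_header_tags(example)

# Report any tag whose key mentions a model.
model_tags = {k: v for k, v in tags.items() if "model" in k}
print(model_tags or "no model tag found in this header")
```

If no such tag shows up in the first few headers of the real FASTQ files, that would be consistent with the workflow being unable to detect the model on its own.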

I found a suggestion in the GitHub issues to use --override_basecaller_cfg. However, dna_r9.4.1_450bps_sup is not an option.
Can you please let me know which model I should use?

Here is the nextflow command that I am trying to use:

image

I tried to run this with the hac model instead. It seemed to be working, but ultimately, it failed and no output files were written. Is this because I used the wrong model?

Here is the start of the error messages:

image

By the way, I was able to somewhat successfully run the test data (without the override command). Even though this failed to output any artic, nextclade or pangolin analyses, I really only want the Read QC and the primertrimmed and sorted bam files, which I could see in the test output.

Thanks!

rowi2024 added the "question" (Further information is requested) label Nov 7, 2024
@mattdmem
Contributor

mattdmem commented Nov 8, 2024

Hello - I'd be tempted to use "[email protected]"

Thanks

Matt
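For anyone scripting this, here is a minimal sketch of assembling the command so that the override is passed only when a model name is supplied. The original command in this thread was posted as a screenshot, so the directory names and the `<model_name>` placeholder below are hypothetical. Only `--override_basecaller_cfg` comes from the discussion above; `--fastq` and `--out_dir` are assumed to be the workflow's input and output options.

```python
import shlex
from typing import Optional


def build_wf_artic_command(
    fastq_dir: str,
    out_dir: str,
    basecaller_cfg: Optional[str] = None,
) -> str:
    """Build a shell-quoted nextflow command line for wf-artic."""
    args = [
        "nextflow", "run", "epi2me-labs/wf-artic",
        "--fastq", fastq_dir,   # assumed input option
        "--out_dir", out_dir,   # assumed output option
    ]
    # Add the override only when a model name is given explicitly.
    if basecaller_cfg:
        args += ["--override_basecaller_cfg", basecaller_cfg]
    return shlex.join(args)


# Placeholder values; substitute the real paths and one of the model names
# that the workflow lists as valid for this option.
print(build_wf_artic_command("fastq_pass", "output", "<model_name>"))
```

The function only builds and prints the command string; it does not run the workflow.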

@rowi2024
Author

rowi2024 commented Nov 8, 2024

Hi Matt,
Thanks! That (almost) worked;-) I do see the trimmed and sorted bam files now, but there are still some issues:

  1. The run ended without producing a report. With the demo data, I could produce the report showing coverage, etc (although it failed at variant steps). I was using --update-data false, since I had issues when trying to run the default with pangolin update. This doesn't seem to really solve the problem.

  2. The trimmed and sorted bam files are quite sparse (compared to results I got when trimming with iVar). Can you share the parameters that are used to filter and trim the reads?

  3. Although the pipeline ran further than before, I noticed in one of the sample log files that it still is complaining about the basecaller model:
    ValueError: Model [email protected]:consensus is not a known model or existant file.


Hi Matt,
Sorry that I keep editing this response and adding more stuff. I just saw your nice post about the Artic pipeline: https://labs.epi2me.io/sarscov2-midnight-analysis/

From this I realized that we are only keeping 200 reads, which is probably what accounts for the small size of the bam files!
I'm actually working on wastewater samples, so would like to keep all the data. Is there any way to do this?
Just for some background: we typically use Illumina sequencing and Freyja analysis for this, but we were interested in capturing some longer amplicons for phasing. I started out trimming with iVar, which feeds into Freyja, but I was concerned about the primer trimming because of the tagmentation approach in the Midnight kit. It wasn't clear to me whether iVar could properly interpret reads that may have had primers cut off by tagmentation, so I thought it would be safer to use an ONT tool for this. But it would be helpful if I could keep all the reads. (By the way, I usually don't do much of the data processing, so I'm something of a novice at this. Thanks for your patience.)

Thanks for your help!

2 participants