svimmer and graphtyper for forced genotyping of UNION of Manta and SVIM-ASM discovered SVs #7

Open
WimSpee opened this issue Jan 26, 2022 · 0 comments

WimSpee commented Jan 26, 2022

Dear @hannespetur

Thank you and colleagues for the very nice svimmer and graphtyper software.

I would like to use svimmer and graphtyper for forced genotyping, in many WGS samples, of the UNION of SVs discovered by Manta (many WGS samples) and SVIM-ASM (a few assemblies).

SVIM-ASM GitHub:
https://github.com/eldariont/svim-asm

The versions that I am using are svimmer/20211209 and graphtyper/2.7.3.

When I try to get the (merged) UNION of SVs via svimmer, I get this error:

Traceback (most recent call last):
  File "/tools/eb/software/Python/3.8.6-GCCcore-10.2.0/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/tools/eb/software/Python/3.8.6-GCCcore-10.2.0/lib/python3.8/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/tools/eb/software/svimmer/20211209-GCC-10.2.0/svimmer", line 82, in append_svs_from_vcf
    svs.append(SV(record, check_type=not args.ignore_types, join_mode=args.join_mode, output_ids=args.ids))
  File "/tools/eb/software/svimmer/20211209-GCC-10.2.0/sv.py", line 75, in __init__
    assert False
AssertionError

svimmer/sv.py

Line 75 in f2d78b2

assert False

This is caused by svimmer not recognizing the DUP:TANDEM and DUP:INT types that SVIM-ASM outputs.

svimmer/sv.py

Line 41 in f2d78b2

# Join related SV types
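
To confirm which SV types an input VCF actually contains before merging, the SVTYPE values can be tallied. This is a minimal sketch, not part of svimmer or graphtyper: the function name `count_svtypes` is hypothetical, and it assumes an uncompressed, tab-separated VCF whose INFO column (column 8) carries an `SVTYPE=` entry.

```python
from collections import Counter


def count_svtypes(lines):
    """Tally SVTYPE values found in the INFO column of VCF record lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        for entry in fields[7].split(";"):
            if entry.startswith("SVTYPE="):
                counts[entry[len("SVTYPE="):]] += 1
                break
    return counts
```

Calling `count_svtypes(open("calls.vcf"))` on the SVIM-ASM output (the file name is a placeholder) should list DUP:TANDEM and DUP:INT alongside the types svimmer already knows.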

I can use the svimmer argument --ignore-types to get svimmer to work.
But then graphtyper warns about an unknown SV type, and I guess it also drops the SVs of unknown type?

<warning> constructor.cpp:106 Unknown SV type DUP:TANDEM
<warning> constructor.cpp:106 Unknown SV type DUP:TANDEM

Would it be possible to add a mapping for DUP:TANDEM and DUP:INT in the main branch of the svimmer code here?

svimmer/sv.py

Line 41 in f2d78b2

# Join related SV types

Then the combination of SVIM-ASM and svimmer/graphtyper would work for me and for others with the same use case and combination of tools.
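
Until such a mapping exists, one possible interim workaround is to collapse the subtypes to plain DUP in the VCF before running svimmer. This is a minimal sketch under stated assumptions, not svimmer's own logic: the mapping table and the names `normalize_svtype` and `rewrite_vcf` are hypothetical, only the `SVTYPE` entry in the INFO column is rewritten (the ALT column is left untouched), and whether treating DUP:INT as a plain DUP is appropriate for genotyping is an open question.

```python
import re

# Assumed mapping, not taken from svimmer: collapse SVIM-ASM duplication
# subtypes to the plain DUP type.
SUBTYPE_TO_TYPE = {"DUP:TANDEM": "DUP", "DUP:INT": "DUP"}

# Match an SVTYPE entry at the start of INFO or directly after a ';'.
_SVTYPE_RE = re.compile(r"(?<![^;])SVTYPE=([^;]+)")


def _replace_svtype(match):
    svtype = match.group(1)
    return "SVTYPE=" + SUBTYPE_TO_TYPE.get(svtype, svtype)


def normalize_svtype(line):
    """Return a VCF line with SVTYPE subtypes in INFO collapsed."""
    if line.startswith("#"):
        return line  # header lines pass through unchanged
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return line  # not a complete VCF record
    fields[7] = _SVTYPE_RE.sub(_replace_svtype, fields[7])
    return "\t".join(fields) + newline


def rewrite_vcf(in_stream, out_stream):
    """Copy a VCF from in_stream to out_stream, normalizing each record."""
    for line in in_stream:
        out_stream.write(normalize_svtype(line))
```

For example, `rewrite_vcf(sys.stdin, sys.stdout)` (after `import sys`) would let the script sit in a pipe between SVIM-ASM's uncompressed output and svimmer's input.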

I also don't understand why SVs of type DUP, CNV and INV are mapped to type INS here:

svimmer/sv.py

Line 45 in f2d78b2

elif info_dict["SVTYPE"] == "ALU" or info_dict["SVTYPE"] == "LINE1" or info_dict["SVTYPE"] == "SVA" or \

That does not make sense to me. INS is a novel sequence, whereas DUP, CNV and INV are sequences already present in the reference genome, so shouldn't they also be genotyped differently in graphtyper?

What I also find strange is that both svimmer and graphtyper do output SVs of type DUP.
I can't square that with the mapping of DUP, CNV and INV to INS. Or is the SV type perhaps recalculated somewhere else in svimmer/graphtyper?

Thank you for your thoughts and help on this.
