Increase of initial identity threshold #1691

Open
snurk opened this issue Apr 24, 2020 · 4 comments


snurk commented Apr 24, 2020

Currently we consider overlaps with an error rate of up to 1%, which might be overkill given the identity distribution of the reads post-compression.
Decreasing this threshold (i.e., increasing the initial identity threshold) might be important for bubble/confusion detection, because we cannot easily impose stricter thresholds within individual procedures (for overlaps that are not expected to originate from the very same location/haplotype).
Indeed, if we apply a 0.5% threshold and ignore a 7k overlap with a 0.7% error rate, we might miss the fact that it actually contained a 5k 'suboverlap' with a 0.3% error rate.
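
To make the arithmetic of that example concrete, here is a minimal sketch. It is not code from the project: the function names `decompose_overlap` and `passes` are hypothetical, and error counts are simply rounded from length times rate.

```python
def decompose_overlap(full_len, full_rate, sub_len, sub_rate):
    """Split a full overlap into a clean suboverlap and the remaining flank.

    Returns (flank_len, flank_errors, flank_rate). Error counts are
    approximated as round(length * rate); this is an illustrative assumption.
    """
    full_errors = round(full_len * full_rate)
    sub_errors = round(sub_len * sub_rate)
    flank_len = full_len - sub_len
    flank_errors = full_errors - sub_errors
    return flank_len, flank_errors, flank_errors / flank_len


def passes(rate, max_error_rate):
    """Hypothetical post-hoc filter: keep an overlap if its rate is within the limit."""
    return rate <= max_error_rate


# Numbers from the example: a 7k overlap at 0.7% that contains a 5k suboverlap at 0.3%.
flank_len, flank_errors, flank_rate = decompose_overlap(7000, 0.007, 5000, 0.003)

print(flank_len, flank_errors, flank_rate)   # the 2k flank carries most of the errors
print(passes(0.007, 0.01))    # full overlap is reported under the current 1% limit
print(passes(0.007, 0.005))   # a 0.5% filter applied afterwards discards it entirely
print(passes(0.003, 0.005))   # the 5k suboverlap alone would have been kept
```

Under these assumptions the 7k overlap has about 49 errors, of which only 15 fall in the 5k suboverlap, so the remaining 2k flank runs at roughly 1.7%. A per-procedure filter only sees the aggregate 0.7% and drops the whole overlap, whereas an overlapper run with the stricter threshold from the start could report the 5k part.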

snurk added this to the v2.1 milestone Apr 24, 2020

snurk commented Apr 24, 2020

In particular, imposing a stricter initial threshold can help filter out potential placements in bubble contig analysis.


snurk commented Apr 24, 2020

This task includes reconsidering (and likely removing) all the 'stricter' thresholds that we might have experimentally introduced in individual procedures.


snurk commented May 8, 2020

Related: a possible increase of minOvlLength to 1K.


snurk commented Jun 19, 2020

Some overlaps do indeed seem to be low quality (due to microsatellite repeats).
Options seem to be:

  • introduce microsatellite masking in the overlapper
  • store multiple overlaps so that we can ignore poor quality ones after OEA
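
As a rough illustration of the first option, here is a minimal sketch of soft-masking microsatellites with a regular expression. It is not the overlapper's code: `mask_microsatellites` and its parameters `max_unit` and `min_copies` are hypothetical, and a real implementation would likely need to tolerate imperfect repeats.

```python
import re


def mask_microsatellites(seq, max_unit=6, min_copies=4):
    """Lowercase perfect tandem repeats of a short unit (soft-masking).

    A repeat is a unit of 1..max_unit bases followed by at least
    (min_copies - 1) further exact copies. Both limits are illustrative.
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    return pattern.sub(lambda m: m.group(0).lower(), seq)


# A dinucleotide repeat flanked by unique sequence.
read = "GATTACA" + "AG" * 8 + "TTGC"
print(mask_microsatellites(read))
```

Masked (lowercase) bases could then be excluded from seeding or down-weighted when scoring an overlap; how that would plug into the overlapper is left open here.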
