Capture uploaded allele correctly for VCF input #1744
Multi-allelic variants are not getting minimised for the default input format. For example:
1 961320 961324 GCAGG/GCA/GCAG +
But the output still contains MINIMISED=1 (without the PR they are also not minimised, but there is no MINIMISED=1).

Hi @nakib103, can you please test this example with the latest commit? The allele is expected to be similar to the one you get when running with --minimal.

Hi @likhitha-surapaneni,
Yes, the alleles are the same when running with or without --minimal. But the original problem remains: the output says it is minimised when it is not.

Hi @nakib103, the output is still minimised if you look at the 3rd column (correct me if I got it wrong). The allele is "-" because the minimised representation of the first alternative allele is GG/- and that of the second is G/-. The problem, however, is that there is no way to differentiate between the alternative alleles, as both show "-". Ideally the minimal representation should be GG/-/G. This is also an existing problem with the --minimal option and probably needs to be addressed in a future ticket. Please let me know if this makes sense.

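To make the difference between the two representations concrete, here is a minimal Python sketch, not VEP's actual code. The helper names trim_pair and trim_joint are hypothetical, and the sketch assumes minimisation means trimming shared leading bases first and then shared trailing bases, with an empty allele written as "-". Coordinates are ignored.

```python
import os


def trim_pair(ref, alt):
    """Minimise a single REF/ALT pair (hypothetical helper, not VEP's API)."""
    # Drop the bases shared at the start of both alleles.
    n = len(os.path.commonprefix([ref, alt]))
    ref, alt = ref[n:], alt[n:]
    # Then drop the bases shared at the end of what remains.
    m = len(os.path.commonprefix([ref[::-1], alt[::-1]]))
    if m:
        ref, alt = ref[:-m], alt[:-m]
    # An allele trimmed to nothing is written as "-".
    return ref or "-", alt or "-"


def trim_joint(alleles):
    """Minimise REF and all ALTs together so every allele stays distinct."""
    # Only trim bases shared by every allele, not just by one REF/ALT pair.
    n = len(os.path.commonprefix(alleles))
    trimmed = [a[n:] for a in alleles]
    m = len(os.path.commonprefix([a[::-1] for a in trimmed]))
    if m:
        trimmed = [a[:-m] for a in trimmed]
    return [a or "-" for a in trimmed]


# Per-pair trimming: both alternative alleles collapse to "-".
print(trim_pair("GCAGG", "GCA"))   # first ALT
print(trim_pair("GCAGG", "GCAG"))  # second ALT

# Joint trimming keeps the two alternative alleles distinguishable.
print("/".join(trim_joint(["GCAGG", "GCA", "GCAG"])))
```

Under these assumptions, the per-pair calls give GG/- and G/-, while the joint call gives GG/-/G, which is the representation suggested above.
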
I was actually looking at the Uploaded_variation column. It does not seem to be minimised as it is for bi-allelic variants (see 1 961320 961324 GCAGG/GCA +). But the Allele column shows the alleles are minimised. It seems the Uploaded_variation string follows different logic. Should we address this?

Yes, I agree @nakib103. The uploaded variation was not expected to minimise alleles, but it was minimising them in some cases because of the way we populate the column. This was difficult to correct because it is also the default option, so the approach taken was to capture the original allele with a different flag (--uploaded_allele). However, I agree it may also be worth investigating why the uploaded variation minimises the allele in some cases and not in others.