invalid genotypes #1
Comments
Thanks for providing this example. I'm not seeing the same output as you. What operating system are you running on (and what operating system did you compile hds-util on)? I'd like to reproduce your environment.
@jonathonl Thank you so much for getting back to me so fast. I compiled and ran it independently on the csg and armis clusters and see the issue in both environments. I am not sure what details would be most helpful for you, but here are a few: csg:
armis:
Not sure if it matters but I believe in both cases the version of sav that was available at time of compiling was:
If you need any other info, please let me know.
This should now be fixed with 763bb2d. Please rebuild with the latest from the master branch.
@jonathonl I really appreciate the time you put into fixing that, especially so fast. It looks better on my end now. Another quick question, and sorry if I am missing it somewhere. We certainly want MAF and Rsq recomputed in our merged data, but what is the point of recomputing DS, GT, GP from HDS? Is it just so that these numbers can be recapitulated from the HDS that appears in the VCF? Regardless, is it always recommended to update DS, GT, GP with
It's recomputed for the sake of simpler code. There is a plan for future versions of the imputation server to export only HDS in the output files in order to reduce compute and storage costs. Most people don't need all four FORMAT fields, so hds-util lets you generate only the fields you need for downstream analysis. In the latest version of Minimac4, the Rsq is computed after the precision loss. However, the Imputation Server is still using the older version, so that issue would still apply. In any case, the median difference is quite small, and I suspect it would have negligible effects on Rsq filtering strategies.
@jonathonl That makes a lot of sense, thanks.
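To illustrate why HDS alone is enough to regenerate the other fields, here is a minimal Python sketch. It is not hds-util's code. It assumes the usual Minimac-style definitions for a diploid sample with haploid dosages h1 and h2: DS is their sum, GP treats the two haplotypes as independent, and GT calls the alternate allele when a haploid dosage is at least 0.5 (the exact tie-breaking rule is an assumption). The Rsq shown is the commonly described estimate, the variance of the haploid dosages divided by p(1-p). The function names are hypothetical.

```python
from typing import List, Tuple


def fields_from_hds(h1: float, h2: float) -> Tuple[str, float, Tuple[float, float, float]]:
    """Derive GT, DS and GP for one diploid sample from its two haploid dosages.

    Assumption: these are the usual Minimac-style definitions, not
    necessarily the exact rules hds-util applies.
    """
    for h in (h1, h2):
        if not 0.0 <= h <= 1.0:
            raise ValueError(f"haploid dosage out of range: {h}")
    # GT: best-guess allele per haplotype (threshold at 0.5 is an assumption).
    gt = f"{int(h1 >= 0.5)}|{int(h2 >= 0.5)}"
    # DS: expected alternate-allele count.
    ds = h1 + h2
    # GP: probabilities of 0, 1 and 2 alternate alleles, haplotypes independent.
    gp = ((1 - h1) * (1 - h2), h1 * (1 - h2) + h2 * (1 - h1), h1 * h2)
    return gt, ds, gp


def maf_and_rsq(haploid_dosages: List[float]) -> Tuple[float, float]:
    """Estimate MAF and Rsq for one site from all haploid dosages.

    Assumption: Rsq = var(h) / (p * (1 - p)), with p the mean haploid dosage.
    """
    n = len(haploid_dosages)
    if n == 0:
        raise ValueError("no dosages given")
    p = sum(haploid_dosages) / n
    maf = min(p, 1 - p)
    denom = p * (1 - p)
    if denom == 0:
        return maf, 0.0  # monomorphic site: Rsq is undefined, report 0
    var = sum((h - p) ** 2 for h in haploid_dosages) / n
    return maf, var / denom
```

Because MAF and Rsq depend on every sample at a site, they have to be recomputed after a merge, whereas GT, DS and GP are per-sample functions of HDS and can be regenerated on demand.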
Thank you for developing this tool, it will be quite handy for us.
In my merges I have been getting invalid genotypes (e.g. 0/-44) in addition to a mixture of phased and unphased sites. I imputed some publicly available HGDP samples on MIS to demonstrate this issue here
Do you have any advice on how to proceed? Hopefully I am not doing something silly. Thanks
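For anyone wanting to check their own merged output for the same symptoms, the following is a rough Python sketch that classifies GT strings as invalid, phased, unphased or haploid. It is not part of hds-util. The helper names are hypothetical, and it assumes the GT values have already been extracted from the VCF as plain strings.

```python
import re
from collections import Counter
from typing import Iterable

# A valid GT is a non-negative allele index or "." per allele,
# joined by "/" (unphased) or "|" (phased).
_GT_RE = re.compile(r"(\d+|\.)([/|](\d+|\.))*")


def classify_gt(gt: str) -> str:
    """Return 'invalid', 'unphased', 'phased' or 'haploid' for one GT string."""
    if not _GT_RE.fullmatch(gt):
        return "invalid"  # e.g. a negative allele index such as 0/-44
    if "/" in gt:
        return "unphased"
    if "|" in gt:
        return "phased"
    return "haploid"


def summarize(gts: Iterable[str]) -> Counter:
    """Count how many GT strings fall into each class."""
    return Counter(classify_gt(gt) for gt in gts)
```

A site whose summary contains both phased and unphased counts, or any invalid count, shows the behaviour described above.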