VCFFixIndels
##Motivation
Fix samtools indels (for @SolenaLS)
##Compilation
- java 1.8 http://www.oracle.com/technetwork/java/index.html (NOT the old java 1.7 or 1.6). Please check that this java is in the ${PATH}. Setting JAVA_HOME is not enough (e.g. https://github.com/lindenb/jvarkit/issues/23 ).
- GNU Make > 3.81
- curl/wget
- git
- apache ant is only required to compile htsjdk
- xsltproc http://xmlsoft.org/XSLT/xsltproc2.html
$ git clone "https://github.com/lindenb/jvarkit.git"
$ cd jvarkit
$ make vcffixindels
By default, the libraries are not included in the jar file, so you shouldn't move them (https://github.com/lindenb/jvarkit/issues/15#issuecomment-140099011 ). You can create a bigger but standalone executable jar by adding standalone=yes on the command line:
$ git clone "https://github.com/lindenb/jvarkit.git"
$ cd jvarkit
$ make vcffixindels standalone=yes
The required libraries will be downloaded and installed in the dist directory.
A local.mk file can be created or edited to override or add some paths.
For example, it can be used to set the HTTP proxy:
http.proxy.host=your.host.com
http.proxy.port=124567
##Synopsis
$ java -jar dist/vcffixindels.jar [options] (stdin|file.vcf|file.vcf.gz)
- -o|--output (OUTPUT-FILE) Output file. Default:stdout
- -h|--help print help
- -version|--version show version and exit
##Source Code
Main code is: https://github.com/lindenb/jvarkit/blob/master/src/main/java/com/github/lindenb/jvarkit/tools/vcffixindels/VCFFixIndels.java
##See also
- https://github.com/lindenb/jvarkit/wiki/VCFFixIndels
- "Unified Representation of Genetic Variants" http://bioinformatics.oxfordjournals.org/content/early/2015/02/19/bioinformatics.btv112.abstract (hey ! it was published after I wrote this tool !)
- https://github.com/quinlan-lab/vcftidy/blob/master/vcftidy.py
- http://www.cureffi.org/2014/04/24/converting-genetic-variants-to-their-minimal-representation/
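The "minimal representation" idea discussed in the links above can be illustrated with a small shell/awk sketch. It is not the code of VCFFixIndels: the function name `trim_alleles`, its argument layout, and the choice to trim the right side before the left side are assumptions made for illustration. It only handles plain base strings (no symbolic alleles such as `<DEL>`) and never looks at the reference genome.

```shell
# Hypothetical helper (not part of jvarkit): trim_alleles POS REF ALT[,ALT...]
# Prints "POS REF ALT[,ALT...]" after removing bases shared by all alleles.
trim_alleles() {
  awk -v pos="$1" -v ref="$2" -v alts="$3" '
    # True when every allele keeps more than one base and all alleles
    # carry the same base at the chosen end (left=1: first base, left=0: last base).
    function shared(left,    i, c, c1) {
      for (i = 1; i <= n; i++) {
        if (length(a[i]) < 2) return 0
        c = left ? substr(a[i], 1, 1) : substr(a[i], length(a[i]), 1)
        if (i == 1) c1 = c
        else if (c != c1) return 0
      }
      return 1
    }
    BEGIN {
      n = split(ref "," alts, a, ",")
      # Assumption: trim the right side first, then the left side.
      while (shared(0)) {
        for (i = 1; i <= n; i++) a[i] = substr(a[i], 1, length(a[i]) - 1)
      }
      # Each base removed on the left moves POS one base to the right.
      while (shared(1)) {
        for (i = 1; i <= n; i++) a[i] = substr(a[i], 2)
        pos++
      }
      out = a[2]
      for (i = 3; i <= n; i++) out = out "," a[i]
      print pos, a[1], out
    }'
}

trim_alleles 3046429 TC TCCCT,TCCC   # prints: 3046430 C CCCT,CCC
```

The input in the last line is taken from one of the INDELFIXED tags in the example output (original position 3046429, alleles TC, TCCCT, TCCC); the result agrees with the position and alleles of that record. The sketch keeps the ALT alleles in their input order, which may differ from the order the tool writes.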
##Example
$ curl -s "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/input_callsets/si/ALL.wgs.samtools_pass_filter.20130502.snps_indels.low_coverage.sites.vcf.gz" |\
gunzip -c | java -jar dist/vcfstripannot.jar -k '*' 2> /dev/null |\
java -jar dist/vcffixindels.jar 2> /dev/null | grep FIX | head -n 15
##INFO=<ID=INDELFIXED,Number=1,Type=String,Description="Fix Indels for @SolenaLS (position|alleles...)">
1 2030133 . T TTTTGT,TTTTG 999 PASS INDELFIXED=2030101|CGTTTTGTTTTGTTTTGTTTTGTTTTGTTTTGT*|CGTTTTGTTTTGTTTTGTTTTGTTTTGTTTTGTTTTGT|CGTTTTGTTTTGTTTTGTTTTGTTTTGTTTTGTTTTG
1 3046430 . C CCCT,CCC 999 PASS INDELFIXED=3046429|TC*|TCCCT|TCCC
1 4258325 rs137902679;rs61115653 A AAT,AA 999 PASS INDELFIXED=4258316|CAAAAAAAAA*|CAAAAAAAAAA|CAAAAAAAAAAT
1 5374885 rs59294415 C CCCC,CCCCA 999 PASS INDELFIXED=5374881|TCCCC*|TCCCCCCC|TCCCCCCCA
1 5669438 rs143435517 C CACAT,CAC 999 PASS INDELFIXED=5669414|TACACACACACACACACACACACAC*|TACACACACACACACACACACACACAC|TACACACACACACACACACACACACACAT
1 5702062 . A AA,AAC 999 PASS INDELFIXED=5702060|TAA*|TAAAC|TAAA
1 5713682 rs70977965 A AAAAA,AAAAAC 999 PASS INDELFIXED=5713678|CAAAA*|CAAAAAAAA|CAAAAAAAAC
1 5911136 . T TGCCATT,TGCCATTCCAAAGAGGCACTCA 999 PASS INDELFIXED=5911135|CT*|CTGCCATTCCAAAGAGGCACTCA|CTGCCATT
1 6067269 rs34064079;rs59468731 G GG,GGC 999 PASS INDELFIXED=6067261|TGGGGGGGG*|TGGGGGGGGG|TGGGGGGGGGC
1 6069948 . TC T,TTC 999 PASS INDELFIXED=6069933|CTTTTTTTTTTTTTTTC*|CTTTTTTTTTTTTTTTTC|CTTTTTTTTTTTTTTT
1 6480784 . C CGGGCCCCAGGCTGCCCGCC,CGGGCCCCAGGCTGCCCGCCT 999 PASS INDELFIXED=6480783|GC*|GCGGGCCCCAGGCTGCCCGCCT|GCGGGCCCCAGGCTGCCCGCC
1 6829081 rs34184977;rs5772255 A AAC,AA 999 PASS INDELFIXED=6829070|TAAAAAAAAAAA*|TAAAAAAAAAAAA|TAAAAAAAAAAAAC
1 7086193 . AG A,AAG 999 PASS INDELFIXED=7086179|TAAAAAAAAAAAAAAG*|TAAAAAAAAAAAAAAAG|TAAAAAAAAAAAAAA
1 8096161 . T TATATATATAC,TAT 999 PASS INDELFIXED=8096143|CATATATATATATATATAT*|CATATATATATATATATATAT|CATATATATATATATATATATATATATAC
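To see how far each record moved, the INDELFIXED tag can be compared with the new POS column. The helper below is a sketch, not part of jvarkit: the name `indelfixed_shift` is made up, and it assumes that INFO is the 8th whitespace-separated column and that the first `|`-separated item of INDELFIXED is the original position, as the header line above suggests.

```shell
# Hypothetical helper (not part of jvarkit): for each record carrying INDELFIXED,
# print CHROM, new POS, original POS and the distance between them.
indelfixed_shift() {
  awk '
    /^#/ { next }                       # skip header lines
    {
      n = split($8, kv, ";")            # INFO entries (assumed to be the 8th column)
      for (i = 1; i <= n; i++) {
        if (index(kv[i], "INDELFIXED=") == 1) {
          # first item: original position; remaining items: original alleles
          split(substr(kv[i], length("INDELFIXED=") + 1), part, "[|]")
          print $1, $2, part[1], $2 - part[1]
        }
      }
    }'
}

printf '1 3046430 . C CCCT,CCC 999 PASS INDELFIXED=3046429|TC*|TCCCT|TCCC\n' | indelfixed_shift
# prints: 1 3046430 3046429 1
```

Piping the output of vcffixindels into this function instead of the `printf` line would give one such line per fixed record.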
##Contribute
- Issue Tracker: http://github.com/lindenb/jvarkit/issues
- Source Code: http://github.com/lindenb/jvarkit
##License
The project is licensed under the MIT license.
##Citing
Lindenbaum, Pierre (2015): JVarkit: java-based utilities for Bioinformatics. figshare. http://dx.doi.org/10.6084/m9.figshare.1425030