VCFCombineTwoSnvs
Pierre Lindenbaum edited this page Feb 16, 2016
##Motivation
Idea from @SolenaLS and then @AntoineRimbert
##Compilation
See also Compilation.
$ make vcfcombinetwosnvs
##Synopsis
$ java -jar dist/vcfcombinetwosnvs.jar [options] (stdin|file)
- -k KnownGene data URI/File. Beware: chromosome names must be formatted the same way as in your REFERENCE.
- -B Optional indexed BAM file used to get phasing information.
- -o,--output output file.
- -h,--help print help
- -version,--version show version and exit
##Source Code
Main code is: https://github.com/lindenb/jvarkit/blob/master/src/main/java/com/github/lindenb/jvarkit/tools/vcfannot/VCFCombineTwoSnvs.java
##Example
##fileformat=VCFv4.2
##FILTER=<ID=TwoStrands,Description="(number of reads carrying both mutation) < (reads carrying variant 1 + reads carrying variant 2)">
##INFO=<ID=CodonVariant,Number=.,Type=String,Description="Variant affected by two distinct mutation. Format is defined in the INFO column. INFO_AC:Allele count in genotypes, for each ALT allele, in the same order as listed.INFO_AF:Allele Frequency, for each ALT allele, in the same order as listed.INFO_MLEAC:Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed.INFO_MLEAF:Maximum likelihood expectation (MLE) for the allele frequency (not necessarily the same as the AF), for each ALT allele, in the same order as listed.">
##VCFCombineTwoSnvsCmdLine=-k jeter.knownGene.txt -tmpdir tmp/ -R /commun/data/pubdb/broadinstitute.org/bundle/1.5/b37/human_g1k_v37.fasta -B /commun/data/projects/plateforme/NTS-017_HAL_Schott_mitral/20141106/align20141106/Samples/CD13314/BAM/Haloplex20141106_CD13314_final.bam
##VCFCombineTwoSnvsHtsJdkHome=/commun/data/packages/htsjdk/htsjdk-2.0.1
##VCFCombineTwoSnvsHtsJdkVersion=2.0.1
##VCFCombineTwoSnvsVersion=c5af7d1bd367562b3578d427d24ec62856835d38
#CHROM POS ID REF ALT QUAL FILTER INFO
1 120612013 rs200646249 G A . . CodonVariant=CHROM|1|REF|G|TRANSCRIPT|uc001eil.3|cDdnaPos|8|CodonPos|7|CodonWild|GCC|AAPos|3|AAWild|A|POS1|120612013|ID1|rs200646249|PosInCodon1|2|Alt1|A|Codon1|GTC|AA1|V|INFO_MLEAC_1|1|INFO_AC_1|1|INFO_MLEAF_1|0.500|INFO_AF_1|0.500|POS2|120612014|ID2|.|PosInCodon2|1|Alt2|A|Codon2|TCC|AA2|S|INFO_MLEAC_2|1|INFO_AC_2|1|INFO_MLEAF_2|0.500|INFO_AF_2|0.500|CombinedCodon|TTC|CombinedAA|F|CombinedSO|nonsynonymous_variant|CombinedType|combined_is_new|N_READS_BOTH_VARIANTS|168|N_READS_NO_VARIANTS|1045|N_READS_TOTAL|1213|N_READS_ONLY_1|0|N_READS_ONLY_2|0,CHROM|1|REF|G|TRANSCRIPT|uc001eik.3|cDdnaPos|8|CodonPos|7|CodonWild|GCC|AAPos|3|AAWild|A|POS1|120612013|ID1|rs200646249|PosInCodon1|2|Alt1|A|Codon1|GTC|AA1|V|INFO_MLEAC_1|1|INFO_AC_1|1|INFO_MLEAF_1|0.500|INFO_AF_1|0.500|POS2|120612014|ID2|.|PosInCodon2|1|Alt2|A|Codon2|TCC|AA2|S|INFO_MLEAC_2|1|INFO_AC_2|1|INFO_MLEAF_2|0.500|INFO_AF_2|0.500|CombinedCodon|TTC|CombinedAA|F|CombinedSO|nonsynonymous_variant|CombinedType|combined_is_new|N_READS_BOTH_VARIANTS|168|N_READS_NO_VARIANTS|1045|N_READS_TOTAL|1213|N_READS_ONLY_1|0|N_READS_ONLY_2|0;EXAC03_AC_NFE=641;EXAC03_AN_NFE=48948
1 120612014 . C A . . CodonVariant=CHROM|1|REF|C|TRANSCRIPT|uc001eik.3|cDdnaPos|7|CodonPos|7|CodonWild|GCC|AAPos|3|AAWild|A|POS1|120612014|ID1|.|PosInCodon1|1|Alt1|A|Codon1|TCC|AA1|S|INFO_MLEAC_1|1|INFO_AC_1|1|INFO_MLEAF_1|0.500|INFO_AF_1|0.500|POS2|120612013|ID2|rs200646249|PosInCodon2|2|Alt2|A|Codon2|GTC|AA2|V|INFO_MLEAC_2|1|INFO_AC_2|1|INFO_MLEAF_2|0.500|INFO_AF_2|0.500|CombinedCodon|TTC|CombinedAA|F|CombinedSO|nonsynonymous_variant|CombinedType|combined_is_new|N_READS_BOTH_VARIANTS|168|N_READS_NO_VARIANTS|1045|N_READS_TOTAL|1213|N_READS_ONLY_1|0|N_READS_ONLY_2|0,CHROM|1|REF|C|TRANSCRIPT|uc001eil.3|cDdnaPos|7|CodonPos|7|CodonWild|GCC|AAPos|3|AAWild|A|POS1|120612014|ID1|.|PosInCodon1|1|Alt1|A|Codon1|TCC|AA1|S|INFO_MLEAC_1|1|INFO_AC_1|1|INFO_MLEAF_1|0.500|INFO_AF_1|0.500|POS2|120612013|ID2|rs200646249|PosInCodon2|2|Alt2|A|Codon2|GTC|AA2|V|INFO_MLEAC_2|1|INFO_AC_2|1|INFO_MLEAF_2|0.500|INFO_AF_2|0.500|CombinedCodon|TTC|CombinedAA|F|CombinedSO|nonsynonymous_variant|CombinedType|combined_is_new|N_READS_BOTH_VARIANTS|168|N_READS_NO_VARIANTS|1045|N_READS_TOTAL|1213|N_READS_ONLY_1|0|N_READS_ONLY_2|0;EXAC03_AC_NFE=640;EXAC03_AN_NFE=48228
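Each `CodonVariant` value in the records above is a comma-separated list with one entry per transcript, and each entry is a flat run of `KEY|VALUE` pairs separated by `|`. A minimal Python sketch for unpacking it, assuming that layout holds for every entry and that no value contains `|` or `,` itself (the function name `parse_codon_variant` is hypothetical and not part of jvarkit):

```python
def parse_codon_variant(value):
    """Split a CodonVariant INFO value into one dict per transcript entry."""
    entries = []
    for entry in value.split(","):
        tokens = entry.split("|")
        if len(tokens) % 2 != 0:
            raise ValueError("expected alternating KEY|VALUE tokens: " + entry)
        # Even-indexed tokens are keys, odd-indexed tokens are their values.
        entries.append(dict(zip(tokens[0::2], tokens[1::2])))
    return entries


# Shortened example built only from keys that appear in the records above.
example = (
    "CHROM|1|REF|G|TRANSCRIPT|uc001eil.3|CombinedCodon|TTC|CombinedAA|F,"
    "CHROM|1|REF|G|TRANSCRIPT|uc001eik.3|CombinedCodon|TTC|CombinedAA|F"
)
for entry in parse_codon_variant(example):
    print(entry["TRANSCRIPT"], entry["CombinedCodon"], entry["CombinedAA"])
```

All values come back as strings, so counts such as `N_READS_TOTAL` would still need an explicit `int(...)` conversion.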
KEY | EXAMPLE | DESC |
---|---|---|
CHROM | 1 | Chromosome for current variant. |
REF | C | Reference Allele for current variant |
TRANSCRIPT | uc001eik.3 | UCSC knownGene Transcript |
cDdnaPos | 7 | 1-based position in cDNA |
CodonPos | 7 | 1-based position of the codon in cDNA |
CodonWild | GCC | Wild codon |
AAPos | 3 | 1-based position of amino acid |
AAWild | A | Wild amino acid |
POS1 | 120612014 | 1-based position of variant 1 |
ID1 | . | RS ID of variant 1 |
PosInCodon1 | 1 | Position in codon (1,2,3) of variant 1 |
Alt1 | A | Alternate allele of variant 1 |
Codon1 | TCC | Codon with variant 1 alone |
AA1 | S | Amino acid prediction for variant 1 |
INFO_*_1 | 1 | Data about alternate allele 1 taken out of original VCF |
POS2 | 120612013 | 1-based position of variant 2 |
ID2 | rs200646249 | RS ID of variant 2 |
PosInCodon2 | 2 | Position in codon (1,2,3) of variant 2 |
Alt2 | A | Alternate allele of variant 2 |
Codon2 | GTC | Codon with variant 2 alone |
AA2 | V | Amino acid prediction for variant 2 |
INFO_*_2 | 1 | Data about alternate allele 2 taken out of original VCF |
CombinedCodon | TTC | Combined codon with ALT1 and ALT2 |
CombinedAA | F | Combined amino acid with ALT1 and ALT2 |
CombinedSO | nonsynonymous_variant | Sequence Ontology term |
CombinedType | combined_is_new | type of new mutation |
N_READS_BOTH_VARIANTS | 168 | Number of reads carrying both variants |
N_READS_NO_VARIANTS | 1045 | Number of reads carrying no variants |
N_READS_TOTAL | 1213 | Total Number of reads |
N_READS_ONLY_1 | 0 | Number of reads only carrying variant 1 |
N_READS_ONLY_2 | 0 | Number of reads only carrying variant 2 |
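
The relationship between `CodonWild`, `PosInCodon1`/`PosInCodon2`, `Codon1`/`Codon2` and `CombinedCodon` can be checked by hand. The sketch below is an illustration under assumptions, not the tool's actual code: it uses the standard genetic code, and it substitutes bases as they read on the transcript strand (`T` at both positions, the complement of the genomic `A` alleles, since the example codons are consistent with a reverse-strand transcript). The helper names are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def substitute(codon, pos_in_codon, base):
    """Return codon with the 1-based position pos_in_codon replaced by base."""
    index = pos_in_codon - 1
    return codon[:index] + base + codon[index + 1:]


wild = "GCC"                           # CodonWild
codon1 = substitute(wild, 1, "T")      # variant 1 alone (PosInCodon1 = 1)
codon2 = substitute(wild, 2, "T")      # variant 2 alone (PosInCodon2 = 2)
combined = substitute(codon1, 2, "T")  # both variants on the same codon

for name, codon in [("wild", wild), ("variant 1", codon1),
                    ("variant 2", codon2), ("combined", combined)]:
    print(name, codon, CODON_TABLE[codon])
```

With the example values, neither single-variant codon encodes the combined amino acid, which appears to be the situation that `CombinedType=combined_is_new` reports.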
- Issue Tracker: http://github.com/lindenb/jvarkit/issues
- Source Code: http://github.com/lindenb/jvarkit
##History
- 2016 : Creation
The project is licensed under the MIT license.