diff --git a/module_notebooks/05-variant-calling-with-vg.ipynb b/module_notebooks/05-variant-calling-with-vg.ipynb
index 10af416..f32c464 100644
--- a/module_notebooks/05-variant-calling-with-vg.ipynb
+++ b/module_notebooks/05-variant-calling-with-vg.ipynb
@@ -225,6 +225,34 @@
     "!vg call SK1xyprp.chrVIII.pggb.aug.xg -k SK1xyprp.chrVIII.pggb.mapped.aug.pack -t 4 > SK1xyprp.chrVIII.pggb.aug_calls.vcf"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Generate summary statistics for this VCF file with `bcftools stats`. We use `grep` to keep only the summary-number rows, which start with `SN`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!bcftools stats SK1xyprp.chrVIII.pggb.aug_calls.vcf | grep \"^SN\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "SNPs = single nucleotide polymorphisms (a single nucleotide change; reference and alternate alleles are all of length 1)  \n",
+    "MNPs = multi-nucleotide polymorphisms (reference and alternate alleles are all of the same length and that length is >1)  \n",
+    "indels = insertions/deletions (reference and alternate alleles are of different lengths)  \n",
+    "others = more complex variants  \n",
+    "multiallelic sites = more than one alternate allele  \n",
+    "multiallelic SNP sites = more than one alternate allele at a SNP site"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -238,6 +266,7 @@ "