Commit

Worked through novel variant section.
joannmudge committed Feb 7, 2025
1 parent ffcf5b4 commit 7eea9b8
Showing 1 changed file with 63 additions and 44 deletions.
107 changes: 63 additions & 44 deletions module_notebooks/05-variant-calling-with-vg.ipynb
Expand Up @@ -51,7 +51,7 @@
"\n",
"We will look for variants that are supported by the graph as well as for variants that are novel (not in the graph but supported by the reads aligned to the graph).\n",
"\n",
"We will call variants against the graph, though you could also call variants against the S288C reference using the surjected BAM file and traditional variant calling methods."
"We will call variants against the graph, though you could also call variants using the surjected BAM file and traditional variant calling methods."
]
},
{
Expand Down Expand Up @@ -123,7 +123,7 @@
"<br>\n",
"!vg pack -x yprp.fullgenome.pggb.giraffe.gbz -g SK1xyprp.fullgenome.pggb.mapped.gam -Q 5 -s 5 -o yprp.fullgenome.pggb.mapped.pack -t 4 \n",
"\n",
" <br>\n",
"\n",
"!vg call -k yprp.fullgenome.pggb.mapped.pack -t 4 yprp.fullgenome.pggb.giraffe.gbz > yprp.fullgenome.pggb.graph_calls.vcf\n",
"</details>"
]
Expand All @@ -134,18 +134,7 @@
"source": [
"## Calling Novel Variants\n",
"\n",
"Augment the graph with the mapped reads using `vg augment`.\n",
"\n",
"The Parameters: XXX fix these params or delete them (We alread did this in another chapter?)\n",
"\n",
"-A\n",
"+ The read alignment\n",
"\n",
"-t 4\n",
"+ Use 4 threads\n",
"\n",
"\n",
"NOTE: This only supports VG files. Indexes used for mapping must be built from the same VG file being augmented (i.e. indexes built from GFA files that were then converted to VG won’t work.)\n"
"To call novel variants (variants that are not in the graph but are supported by the aligned reads), we need to embed the variation from the aligned reads back into the graph. To do this, we first need to convert the graph into a format that can be modified. We will use `vg convert` to convert the .gbz file to a .vg file."
]
},
{
Expand All @@ -154,16 +143,21 @@
"metadata": {},
"outputs": [],
"source": [
"!vg augment S288C.vg S288C.SK1.illumina.gam -A S288C.SK1.illumina.aug.gam -t 4 > S288C.SK1.illumina.aug.vg"
"!vg convert yprp.chrVIII.pggb.giraffe.gbz > yprp.chrVIII.pggb.giraffe.vg"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Index the augmented graph.\n",
"Now, we can augment the graph with the mapped reads using `vg augment`. This will embed the variation from the alignments back into the graph.\n",
"\n",
"XXX Do we need params here?"
"The Parameters:\n",
"\n",
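"-A the output file for the augmented alignments (GAM), i.e. the reads rewritten against the augmented graph\n",
"\n",
"-t the number of threads to use\n",
"\n",
"The graph (.vg) to augment\n",
"\n",
"The input alignment (GAM) file\n",
"\n",
"The augmented graph is written to standard output, which we redirect to a new .vg file.\n"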
]
},
{
Expand All @@ -172,14 +166,20 @@
"metadata": {},
"outputs": [],
"source": [
"!vg index -x S288C.SK1.illumina.aug.xg S288C.SK1.illumina.aug.vg -t 4"
"!vg augment yprp.chrVIII.pggb.giraffe.vg SK1xyprp.chrVIII.pggb.mapped.gam -A SK1xyprp.chrVIII.pggb.mapped.aug.gam -t 4 > SK1xyprp.chrVIII.pggb.aug.vg "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Compute read support for novel variation."
"Index the augmented graph using `vg index`. We will make a .xg index.\n",
"\n",
"The parameters:\n",
"\n",
"-x the output file for the .xg index\n",
"\n",
"-t the number of threads to use\n",
"\n",
"The input graph (the augmented .vg file)"
]
},
{
Expand All @@ -188,14 +188,16 @@
"metadata": {},
"outputs": [],
"source": [
"!vg pack -x S288C.SK1.illumina.aug.xg -g S288C.SK1.illumina.aug.gam -Q 5 -s 5 -o S288C.SK1.illumina.aug.pack -t 4"
"!vg index -t 4 -x SK1xyprp.chrVIII.pggb.aug.xg SK1xyprp.chrVIII.pggb.aug.vg"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Generate a VCF from the support."
"Now that the variation from the reads is embedded into the graph, we can proceed to call variants as we did above.\n",
"\n",
"Compute read support."
]
},
{
Expand All @@ -204,29 +206,14 @@
"metadata": {},
"outputs": [],
"source": [
"!vg call S288C.SK1.illumina.aug.xg -k S288C.SK1.illumina.aug.pack -t 4 > S288C.SK1.illumina.aug_calls.vcf"
"!vg pack -x SK1xyprp.chrVIII.pggb.aug.xg -g SK1xyprp.chrVIII.pggb.mapped.aug.gam -Q 5 -s 5 -o SK1xyprp.chrVIII.pggb.mapped.aug.pack -t 4"
]
},
{
"cell_type": "code",
"execution_count": null,
"cell_type": "markdown",
"metadata": {},
"outputs": [],
"source": [
"## Calling Variants Already in the Graph using read support XXX\n",
"\n",
"Output variants used to construct graph\n",
"\n",
"\n",
"**vg deconstruct**\n",
"\n",
"The parameters:\n",
"\n",
"-P S288C\n",
" + report variants relative to paths with names that start with S288C (XXX expand)\n",
"\n",
"NOTE: S288C.deconstruct.vcf might not be identical to S288C.vcf because VG takes liberties with variants when constructing the graph.\n",
" XXX Remind people the differnce in how the 2 were made.\n"
"Generate a VCF from the support."
]
},
{
Expand All @@ -235,18 +222,44 @@
"metadata": {},
"outputs": [],
"source": [
"!vg deconstruct S288C.xg -P S288C -t 4 > S288C.deconstruct.vcf"
"!vg call SK1xyprp.chrVIII.pggb.aug.xg -k SK1xyprp.chrVIII.pggb.mapped.aug.pack -t 4 > SK1xyprp.chrVIII.pggb.aug_calls.vcf"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"alert alert-block alert-info\"> <b>Try this:</b> \n",
" <ul>\n",
" <li>Create a blank code cell below.</li>\n",
" <li>Call novel variants for the yprp.fullgenome.pggb.giraffe.gbz graph.</li>\n",
" <li>+ Convert the graph to vg format.</li>\n",
" <li>+ Augment the graph to embed the read alignments into it.</li>\n",
" <li>+ Create an index (xg).</li>\n",
" <li>+ Compute read support.</li>\n",
" <li>+ Generate a VCF.</li>\n",
" </ul>\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exercises\n",
"<details>\n",
"<summary>Click for help</summary>\n",
"<br>\n",
"!vg convert yprp.fullgenome.pggb.giraffe.gbz > yprp.fullgenome.pggb.giraffe.vg\n",
"\n",
"!vg augment yprp.fullgenome.pggb.giraffe.vg SK1xyprp.fullgenome.pggb.mapped.gam -A SK1xyprp.fullgenome.pggb.mapped.aug.gam -t 4 > SK1xyprp.fullgenome.pggb.aug.vg \n",
"\n",
"!vg index -t 4 -x SK1xyprp.fullgenome.pggb.aug.xg SK1xyprp.fullgenome.pggb.aug.vg\n",
"\n",
"1. Use vg to index the chromosome VIII graph\n",
"2. Use vg to map SK1 reads to the chromosome VIII graph\n",
"3. Use vg to call variants on chromosome VIII read mapping GAMS\n"
"!vg pack -x SK1xyprp.fullgenome.pggb.aug.xg -g SK1xyprp.fullgenome.pggb.mapped.aug.gam -Q 5 -s 5 -o SK1xyprp.fullgenome.pggb.mapped.aug.pack -t 4\n",
"\n",
"!vg call SK1xyprp.fullgenome.pggb.aug.xg -k SK1xyprp.fullgenome.pggb.mapped.aug.pack -t 4 > SK1xyprp.fullgenome.pggb.aug_calls.vcf\n",
"\n",
"\n",
"</details>"
]
},
{
Expand All @@ -268,6 +281,12 @@
}
],
"metadata": {
"environment": {
"kernel": "conda-env-nigms-pangenomics-nigms-pangenomics",
"name": "workbench-notebooks.m127",
"type": "gcloud",
"uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-notebooks:m127"
},
"kernelspec": {
"display_name": "nigms-pangenomics",
"language": "python",
Expand Down
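
The novel-variant cells in this commit (convert, augment, index, pack, call) follow one file-naming pattern for both the chromosome VIII graph and the full-genome graph, so the sequence can be sketched as a single dry-run helper. This is only a sketch and not part of the commit: the function name `novel_variant_commands` and its three arguments are hypothetical, and it prints the commands instead of running `vg`, so the file names can be checked before anything is executed. The flags are copied from the notebook cells as they appear in the diff.

```shell
set -eu

# Hypothetical helper (not in the notebook): print the five vg commands
# used to call novel variants, following the notebook's naming pattern.
#   $1 = graph prefix, e.g. yprp.chrVIII.pggb
#   $2 = sample name,  e.g. SK1
#   $3 = threads (optional, defaults to 4 as in the notebook)
novel_variant_commands() {
  graph_prefix="$1"
  sample="$2"
  threads="${3:-4}"
  mapped="${sample}x${graph_prefix}"   # e.g. SK1xyprp.chrVIII.pggb

  # Each line below mirrors one notebook cell; nothing is executed here.
  cat <<EOF
vg convert ${graph_prefix}.giraffe.gbz > ${graph_prefix}.giraffe.vg
vg augment ${graph_prefix}.giraffe.vg ${mapped}.mapped.gam -A ${mapped}.mapped.aug.gam -t ${threads} > ${mapped}.aug.vg
vg index -t ${threads} -x ${mapped}.aug.xg ${mapped}.aug.vg
vg pack -x ${mapped}.aug.xg -g ${mapped}.mapped.aug.gam -Q 5 -s 5 -o ${mapped}.mapped.aug.pack -t ${threads}
vg call ${mapped}.aug.xg -k ${mapped}.mapped.aug.pack -t ${threads} > ${mapped}.aug_calls.vcf
EOF
}

# Dry run for the chromosome VIII graph used in the walkthrough cells.
novel_variant_commands yprp.chrVIII.pggb SK1 4
```

Calling the helper with `yprp.fullgenome.pggb` as the prefix prints the commands for the "Try this" exercise. Printing instead of executing is a deliberate choice here: every step writes a new file whose name the next step depends on, so a mistyped prefix is easier to catch in the printed commands than after a long run.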