Reduced threads and increased consistency.

ncgr · Feb 6, 2025 · 5ed5b0b
1 parent 750f67e
commit 5ed5b0b

Showing 3 changed files with 46 additions and 38 deletions.
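
Since the stated goal of this commit is consistent thread counts, one way to check the result is to scan each notebook's code cells for `-t`/`--threads` flags. The following is a minimal sketch, not part of the commit: the function name `thread_counts` and the sample notebook are hypothetical, and it assumes the standard `.ipynb` JSON layout (`cells`, `cell_type`, `source`). It only inspects code cells, so commands shown inside markdown cells (such as the "Click for help" block) are not covered.

```python
import json
import re

# Matches "-t 4" or "--threads 4" as a standalone flag and captures the number.
THREAD_FLAG = re.compile(r"(?<!\S)(?:-t|--threads)\s+(\d+)\b")


def thread_counts(notebook_json: str) -> list[int]:
    """Return every thread count passed via -t/--threads in code cells."""
    nb = json.loads(notebook_json)
    counts = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown cells are skipped on purpose
        source = cell.get("source", [])
        # "source" may be a list of lines or a single string
        text = "".join(source) if isinstance(source, list) else source
        counts.extend(int(n) for n in THREAD_FLAG.findall(text))
    return counts


# Hypothetical in-memory notebook standing in for a real .ipynb file.
sample = json.dumps({
    "cells": [
        {"cell_type": "code",
         "source": ["!vg call S288C.xg -k S288C.SK1.illumina.pack -t 4 > out.vcf"]},
        {"cell_type": "code",
         "source": ["!fasterq-dump --outdir . --threads 4 --progress x.sra"]},
        {"cell_type": "markdown", "source": ["-t 20\n"]},
    ]
})
print(thread_counts(sample))
```

For a real notebook, the JSON string would come from reading the `.ipynb` file; a result containing more than one distinct value would indicate that a command was missed.
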
diff --git a/module_notebooks/02-building-graphs-with-pggb.ipynb b/module_notebooks/02-building-graphs-with-pggb.ipynb
@@ -322,22 +322,19 @@
    "source": [
     "## Running pggb on Chromosome VIII\n",
     "\n",
-    "Build a graph containing all the yprp assemblies using the following parameters:\n",
-    "\n",
-    "+ **-i yprp.chrVIII.fa**\n",
-    "    + an input FASTA containing all sequences\n",
-    "+ **-o output_chrVIIII**\n",
-    "    + the directory where all output files should be placed\n",
-    "+ **-n 3**\n",
-    "    + the number of haplotypes (assemblies) in the input file\n",
-    "+ **-t 20**\n",
-    "    + the number of threads to use\n",
-    "+ **-p 95**\n",
-    "    + minimum sequence identity of alignment segments\n",
-    "+ **-s 5000**\n",
-    "    + nucleotide segment length when scaffolding the graph\n",
+    "Build a graph containing all the yprp assemblies using `pggb`.\n",
+    "\n",
+    "The parameters:\n",
+    "\n",
+    "-i  input FASTA containing all sequences  \n",
+    "-o  the directory where all output files should be placed  \n",
+    "-n  the number of haplotypes (assemblies) in the input file  \n",
+    "-t  the number of threads to use  \n",
+    "-p  minimum sequence identity of alignment segments  \n",
+    "-s  nucleotide segment length when scaffolding the graph  \n",
     "    \n",
     "NOTE: These arguments were taken from the [pggb paper](https://github.com/pangenome/pggb-paper/blob/main/workflows/AllSpecies.md).\n",
+    "\n",
     "Refer to the paper for parameter suggestions for other species.\n",
     "\n"
    ]
@@ -348,7 +345,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "!pggb build -i yprp.chrVIII.fa.gz -o output_chrVIII -n 3 -t 20 -p 95"
+    "!pggb build -i yprp.chrVIII.fa.gz -o output_chrVIII -n 3 -t 4 -p 95"
    ]
   },
   {
@@ -398,7 +395,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "!partition-before-pggb -i yprp.all.fa.gz -o output_allchrs -n 3 -t 20 -p 95 -s 5000"
+    "!partition-before-pggb -i yprp.all.fa.gz -o output_allchrs -n 3 -t 4 -p 95 -s 5000"
    ]
   },
   {
@@ -473,7 +470,7 @@
     "<details>\n",
     "<summary>Click for help</summary>\n",
     "<br>\n",
-    "!pggb build -i yprp.all.fa.gz -o output_full_genome -n 3 -t 20 -p 95\n",
+    "!pggb build -i yprp.all.fa.gz -o output_full_genome -n 3 -t 4 -p 95\n",
     "\n",
     "!cp output_full_genome/yprp.all.fa.gz.*.smooth.final.gfa yprp.fullgenome.pggb.gfa\n",
     "</details>"
@@ -538,6 +535,12 @@
   }
  ],
  "metadata": {
+  "environment": {
+   "kernel": "conda-env-nigms-pangenomics-nigms-pangenomics",
+   "name": "workbench-notebooks.m127",
+   "type": "gcloud",
+   "uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-notebooks:m127"
+  },
   "kernelspec": {
    "display_name": "nigms-pangenomics",
    "language": "python",

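The commands above hard-code `-t 4`. As an alternative, the thread count could be derived once per notebook and reused, so the value stays consistent and never exceeds what the machine offers. This is only a sketch under assumptions: `pick_threads` and the default cap of 4 are illustrative choices, not something the commit defines, and the command string simply mirrors the one shown in the diff.

```python
import os


def pick_threads(cap: int = 4) -> int:
    """Return a thread count no larger than `cap` or the visible CPU count."""
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(cap, available))


threads = pick_threads()

# Same command as in the notebook cell, with the thread count filled in.
cmd = f"pggb build -i yprp.chrVIII.fa.gz -o output_chrVIII -n 3 -t {threads} -p 95"
print(cmd)
```

In a Jupyter cell, IPython expands `{threads}` inside `!` commands, so the variable could be used directly in lines such as `!pggb build ... -t {threads}`.
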
diff --git a/module_notebooks/04-read-mapping-with-vg.ipynb b/module_notebooks/04-read-mapping-with-vg.ipynb
@@ -102,7 +102,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "!fasterq-dump --outdir . --outfile SK1.illumina.fastq --threads 40 --progress ./SRR4074258/SRR4074258.sra"
+    "!fasterq-dump --outdir . --outfile SK1.illumina.fastq --threads 4 --progress ./SRR4074258/SRR4074258.sra"
    ]
   },
   {
@@ -369,6 +369,12 @@
   }
  ],
  "metadata": {
+  "environment": {
+   "kernel": "conda-env-nigms-pangenomics-nigms-pangenomics",
+   "name": "workbench-notebooks.m127",
+   "type": "gcloud",
+   "uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-notebooks:m127"
+  },
   "kernelspec": {
    "display_name": "nigms-pangenomics",
    "language": "python",

diff --git a/module_notebooks/05-variant-calling-with-vg.ipynb b/module_notebooks/05-variant-calling-with-vg.ipynb
@@ -78,24 +78,19 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "!vg pack -x yprp.chrVIII.pggb.giraffe.gbz -g SK1xyprp.chrVIII.pggb.mapped.gam -Q 5 -s 5 -o yprp.chrVIII.pggb.mapped.pack -t 20"
+    "!vg pack -x yprp.chrVIII.pggb.giraffe.gbz -g SK1xyprp.chrVIII.pggb.mapped.gam -Q 5 -s 5 -o yprp.chrVIII.pggb.mapped.pack -t 4"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Generate a VCF from the read support.\n",
-    "\n",
-    "**vg call**\n",
+    "Generate a VCF from the read support using `vg call`.\n",
     "\n",
     "The parameters:\n",
     "\n",
-    "-k S288C.SK1.illumina.pack\n",
-    "+ The read support file to read in\n",
-    "\n",
-    "-t 20\n",
-    "+ Use 20 threads"
+    "-k  The read support file to read in  \n",
+    "-t  The number of threads"
    ]
   },
   {
@@ -104,7 +99,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "!vg call S288C.xg -k S288C.SK1.illumina.pack -t 20 > S288C.SK1.illumina.graph_calls.vcf"
+    "!vg call S288C.xg -k S288C.SK1.illumina.pack -t 4 > S288C.SK1.illumina.graph_calls.vcf"
    ]
   },
   {
@@ -125,17 +120,15 @@
    "source": [
     "## Calling Novel Variants\n",
     "\n",
-    "Augment the graph with the mapped reads.\n",
-    "\n",
-    "**vg augment**\n",
+    "Augment the graph with the mapped reads using `vg augment`.\n",
     "\n",
     "The Parameters: XXX fix these params or delete them (We alread did this in another chapter?)\n",
     "\n",
     "-A\n",
     "+ The read alignment\n",
     "\n",
-    "-t 20\n",
-    "+ Use 20 threads\n",
+    "-t 4\n",
+    "+ Use 4 threads\n",
     "\n",
     "\n",
     "NOTE: This only supports VG files. Indexes used for mapping must be built from the same VG file being augmented (i.e. indexes built from GFA files that were then converted to VG won’t work.)\n"
@@ -147,7 +140,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "!vg augment S288C.vg S288C.SK1.illumina.gam -A S288C.SK1.illumina.aug.gam -t 20 > S288C.SK1.illumina.aug.vg"
+    "!vg augment S288C.vg S288C.SK1.illumina.gam -A S288C.SK1.illumina.aug.gam -t 4 > S288C.SK1.illumina.aug.vg"
    ]
   },
   {
@@ -165,7 +158,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "!vg index -x S288C.SK1.illumina.aug.xg S288C.SK1.illumina.aug.vg -t 20"
+    "!vg index -x S288C.SK1.illumina.aug.xg S288C.SK1.illumina.aug.vg -t 4"
    ]
   },
   {
@@ -181,7 +174,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "!vg pack -x S288C.SK1.illumina.aug.xg -g S288C.SK1.illumina.aug.gam -Q 5 -s 5 -o S288C.SK1.illumina.aug.pack -t 20"
+    "!vg pack -x S288C.SK1.illumina.aug.xg -g S288C.SK1.illumina.aug.gam -Q 5 -s 5 -o S288C.SK1.illumina.aug.pack -t 4"
    ]
   },
   {
@@ -197,7 +190,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "!vg call S288C.SK1.illumina.aug.xg -k S288C.SK1.illumina.aug.pack -t 20 > S288C.SK1.illumina.aug_calls.vcf"
+    "!vg call S288C.SK1.illumina.aug.xg -k S288C.SK1.illumina.aug.pack -t 4 > S288C.SK1.illumina.aug_calls.vcf"
    ]
   },
   {
@@ -228,7 +221,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "!vg deconstruct S288C.xg -P S288C -t 20 > S288C.deconstruct.vcf"
+    "!vg deconstruct S288C.xg -P S288C -t 4 > S288C.deconstruct.vcf"
    ]
   },
   {
@@ -261,6 +254,12 @@
   }
  ],
  "metadata": {
+  "environment": {
+   "kernel": "conda-env-nigms-pangenomics-nigms-pangenomics",
+   "name": "workbench-notebooks.m127",
+   "type": "gcloud",
+   "uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-notebooks:m127"
+  },
   "kernelspec": {
    "display_name": "nigms-pangenomics",
    "language": "python",