Reduced threads and increased consistency.
joannmudge committed Feb 6, 2025
1 parent 750f67e commit 5ed5b0b
Showing 3 changed files with 46 additions and 38 deletions.
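
The commit replaces the hard-coded thread counts (20, and 40 for `fasterq-dump`) with 4 throughout. As an alternative to a fixed value, a notebook cell could derive the count from the machine it runs on. The sketch below is not part of the commit; the helper name `pick_threads` and its default cap of 4 are hypothetical choices for illustration.

```python
import os


def pick_threads(cap=4, available=None):
    """Return a thread count between 1 and `cap`, bounded by the CPUs available.

    `cap` and `available` are illustrative parameters, not part of the notebooks.
    """
    if available is None:
        # os.cpu_count() may return None when the count cannot be determined.
        available = os.cpu_count() or 1
    return max(1, min(cap, available))


threads = pick_threads()
print(f"-t {threads}")
```

In a Jupyter code cell, IPython expands Python variables inside `!` commands, so a line such as `!vg call S288C.xg -k S288C.SK1.illumina.pack -t {threads} > S288C.SK1.illumina.graph_calls.vcf` would pick up the computed value.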
37 changes: 20 additions & 17 deletions module_notebooks/02-building-graphs-with-pggb.ipynb
@@ -322,22 +322,19 @@
  "source": [
  "## Running pggb on Chromosome VIII\n",
  "\n",
- "Build a graph containing all the yprp assemblies using the following parameters:\n",
- "\n",
- "+ **-i yprp.chrVIII.fa**\n",
- " + an input FASTA containing all sequences\n",
- "+ **-o output_chrVIIII**\n",
- " + the directory where all output files should be placed\n",
- "+ **-n 3**\n",
- " + the number of haplotypes (assemblies) in the input file\n",
- "+ **-t 20**\n",
- " + the number of threads to use\n",
- "+ **-p 95**\n",
- " + minimum sequence identity of alignment segments\n",
- "+ **-s 5000**\n",
- " + nucleotide segment length when scaffolding the graph\n",
+ "Build a graph containing all the yprp assemblies using `pggb`.\n",
+ "\n",
+ "The parameters:\n",
+ "\n",
+ "-i input FASTA containing all sequences \n",
+ "-o the directory where all output files should be placed \n",
+ "-n the number of haplotypes (assemblies) in the input file \n",
+ "-t the number of threads to use \n",
+ "-p minimum sequence identity of alignment segments \n",
+ "-s 5000nucleotide segment length when scaffolding the graph \n",
+ " \n",
  "NOTE: These arguments were taken from the [pggb paper](https://github.com/pangenome/pggb-paper/blob/main/workflows/AllSpecies.md).\n",
  "\n",
  "Refer to the paper for parameter suggestions for other species.\n",
  "\n"
  ]
@@ -348,7 +345,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "!pggb build -i yprp.chrVIII.fa.gz -o output_chrVIII -n 3 -t 20 -p 95"
+ "!pggb build -i yprp.chrVIII.fa.gz -o output_chrVIII -n 3 -t 4 -p 95"
  ]
  },
  {
@@ -398,7 +395,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "!partition-before-pggb -i yprp.all.fa.gz -o output_allchrs -n 3 -t 20 -p 95 -s 5000"
+ "!partition-before-pggb -i yprp.all.fa.gz -o output_allchrs -n 3 -t 4 -p 95 -s 5000"
  ]
  },
  {
@@ -473,7 +470,7 @@
  "<details>\n",
  "<summary>Click for help</summary>\n",
  "<br>\n",
- "!pggb build -i yprp.all.fa.gz -o output_full_genome -n 3 -t 20 -p 95\n",
+ "!pggb build -i yprp.all.fa.gz -o output_full_genome -n 3 -t 4 -p 95\n",
  "\n",
  "!cp output_full_genome/yprp.all.fa.gz.*.smooth.final.gfa yprp.fullgenome.pggb.gfa\n",
  "</details>"
@@ -538,6 +535,12 @@
  }
  ],
  "metadata": {
+ "environment": {
+ "kernel": "conda-env-nigms-pangenomics-nigms-pangenomics",
+ "name": "workbench-notebooks.m127",
+ "type": "gcloud",
+ "uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-notebooks:m127"
+ },
  "kernelspec": {
  "display_name": "nigms-pangenomics",
  "language": "python",
8 changes: 7 additions & 1 deletion module_notebooks/04-read-mapping-with-vg.ipynb
@@ -102,7 +102,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "!fasterq-dump --outdir . --outfile SK1.illumina.fastq --threads 40 --progress ./SRR4074258/SRR4074258.sra"
+ "!fasterq-dump --outdir . --outfile SK1.illumina.fastq --threads 4 --progress ./SRR4074258/SRR4074258.sra"
  ]
  },
  {
@@ -369,6 +369,12 @@
  }
  ],
  "metadata": {
+ "environment": {
+ "kernel": "conda-env-nigms-pangenomics-nigms-pangenomics",
+ "name": "workbench-notebooks.m127",
+ "type": "gcloud",
+ "uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-notebooks:m127"
+ },
  "kernelspec": {
  "display_name": "nigms-pangenomics",
  "language": "python",
39 changes: 19 additions & 20 deletions module_notebooks/05-variant-calling-with-vg.ipynb
@@ -78,24 +78,19 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "!vg pack -x yprp.chrVIII.pggb.giraffe.gbz -g SK1xyprp.chrVIII.pggb.mapped.gam -Q 5 -s 5 -o yprp.chrVIII.pggb.mapped.pack -t 20"
+ "!vg pack -x yprp.chrVIII.pggb.giraffe.gbz -g SK1xyprp.chrVIII.pggb.mapped.gam -Q 5 -s 5 -o yprp.chrVIII.pggb.mapped.pack -t 4"
  ]
  },
  {
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "Generate a VCF from the read support.\n",
- "\n",
- "**vg call**\n",
+ "Generate a VCF from the read support using `vg call`.\n",
  "\n",
  "The parameters:\n",
  "\n",
- "-k S288C.SK1.illumina.pack\n",
- "+ The read support file to read in\n",
- "\n",
- "-t 20\n",
- "+ Use 20 threads"
+ "-k The read support file to read in \n",
+ "-t The number of threads"
  ]
  },
  {
@@ -104,7 +99,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "!vg call S288C.xg -k S288C.SK1.illumina.pack -t 20 > S288C.SK1.illumina.graph_calls.vcf"
+ "!vg call S288C.xg -k S288C.SK1.illumina.pack -t 4 > S288C.SK1.illumina.graph_calls.vcf"
  ]
  },
  {
@@ -125,17 +120,15 @@
  "source": [
  "## Calling Novel Variants\n",
  "\n",
- "Augment the graph with the mapped reads.\n",
- "\n",
- "**vg augment**\n",
+ "Augment the graph with the mapped reads using `vg augment`.\n",
  "\n",
  "The Parameters: XXX fix these params or delete them (We alread did this in another chapter?)\n",
  "\n",
  "-A\n",
  "+ The read alignment\n",
  "\n",
- "-t 20\n",
- "+ Use 20 threads\n",
+ "-t 4\n",
+ "+ Use 4 threads\n",
  "\n",
  "\n",
  "NOTE: This only supports VG files. Indexes used for mapping must be built from the same VG file being augmented (i.e. indexes built from GFA files that were then converted to VG won’t work.)\n"
@@ -147,7 +140,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "!vg augment S288C.vg S288C.SK1.illumina.gam -A S288C.SK1.illumina.aug.gam -t 20 > S288C.SK1.illumina.aug.vg"
+ "!vg augment S288C.vg S288C.SK1.illumina.gam -A S288C.SK1.illumina.aug.gam -t 4 > S288C.SK1.illumina.aug.vg"
  ]
  },
  {
@@ -165,7 +158,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "!vg index -x S288C.SK1.illumina.aug.xg S288C.SK1.illumina.aug.vg -t 20"
+ "!vg index -x S288C.SK1.illumina.aug.xg S288C.SK1.illumina.aug.vg -t 4"
  ]
  },
  {
@@ -181,7 +174,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "!vg pack -x S288C.SK1.illumina.aug.xg -g S288C.SK1.illumina.aug.gam -Q 5 -s 5 -o S288C.SK1.illumina.aug.pack -t 20"
+ "!vg pack -x S288C.SK1.illumina.aug.xg -g S288C.SK1.illumina.aug.gam -Q 5 -s 5 -o S288C.SK1.illumina.aug.pack -t 4"
  ]
  },
  {
@@ -197,7 +190,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "!vg call S288C.SK1.illumina.aug.xg -k S288C.SK1.illumina.aug.pack -t 20 > S288C.SK1.illumina.aug_calls.vcf"
+ "!vg call S288C.SK1.illumina.aug.xg -k S288C.SK1.illumina.aug.pack -t 4 > S288C.SK1.illumina.aug_calls.vcf"
  ]
  },
  {
@@ -228,7 +221,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "!vg deconstruct S288C.xg -P S288C -t 20 > S288C.deconstruct.vcf"
+ "!vg deconstruct S288C.xg -P S288C -t 4 > S288C.deconstruct.vcf"
  ]
  },
  {
@@ -261,6 +254,12 @@
  }
  ],
  "metadata": {
+ "environment": {
+ "kernel": "conda-env-nigms-pangenomics-nigms-pangenomics",
+ "name": "workbench-notebooks.m127",
+ "type": "gcloud",
+ "uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-notebooks:m127"
+ },
  "kernelspec": {
  "display_name": "nigms-pangenomics",
  "language": "python",
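
Since the stated goal of the commit is consistency, one way to check that no stray thread count remains is to scan each notebook's code cells for `-t N` or `--threads N`. The following is a rough sketch, not something included in the commit: the function name `thread_counts` and the in-memory `example` notebook are made up for illustration, and the regular expression assumes the flag and its value are separated by whitespace.

```python
import json
import re

# Matches "-t 4" or "--threads 4" when preceded by whitespace or line start,
# and captures the number.
THREAD_FLAG = re.compile(r"(?:^|\s)(?:-t|--threads)\s+(\d+)\b")


def thread_counts(notebook):
    """Yield (cell_index, count) for each thread flag found in code cells."""
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # Notebook JSON stores source either as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        for match in THREAD_FLAG.finditer(source):
            yield index, int(match.group(1))


# Hypothetical in-memory notebook; a real one would be read with json.load().
example = {
    "cells": [
        {"cell_type": "code",
         "source": ["!vg call S288C.xg -k S288C.SK1.illumina.pack -t 4 > out.vcf"]},
        {"cell_type": "markdown", "source": ["-t 20\n"]},
        {"cell_type": "code",
         "source": ["!fasterq-dump --threads 40 --progress x.sra"]},
    ]
}

found = list(thread_counts(example))
print(found)  # [(0, 4), (2, 40)]
```

Markdown cells are skipped here, so commands that only appear in prose, such as the `!pggb build` line inside the "Click for help" block, would need a separate pass.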
