Study by Bernadeta Dadonaite and Jesse Bloom. See Haddox et al (2023) for citation.
For documentation of the analysis, see https://dms-vep.github.io/SARS-CoV-2_Omicron_BA.2_spike_DMS/.
Most of the analysis is done by the dms-vep-pipeline, which was added as a git submodule to this repository via:
git submodule add https://github.com/dms-vep/dms-vep-pipeline
This added the file .gitmodules and the submodule dms-vep-pipeline, which were then committed to the repo. If you want a specific commit or tag of dms-vep-pipeline, or want to update to a new commit, the basic steps are:
cd dms-vep-pipeline
git checkout <commit>
and then cd ../ back to the top-level directory, and add and commit the updated dms-vep-pipeline submodule.
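The update steps above can be wrapped in a small shell function. This is only a sketch: the name pin_submodule is hypothetical and not part of this repository, it assumes you run it from the top-level directory, and it assumes the submodule's remote is named origin (the git default).

```shell
# Hypothetical helper (not part of this repo): pin the dms-vep-pipeline
# submodule to a given commit or tag and record that in the top-level repo.
# Assumes it is called from the top-level directory of the repository.
pin_submodule() {
    local ref="$1"
    # Fetch new history and tags inside the submodule (assumes remote "origin"),
    # then check out the requested commit or tag there.
    git -C dms-vep-pipeline fetch --tags origin &&
    git -C dms-vep-pipeline checkout "$ref" &&
    # Stage the updated submodule pointer in the top-level repo and commit it.
    git add dms-vep-pipeline &&
    git commit -m "Update dms-vep-pipeline to $ref"
}
```

Call it as pin_submodule followed by the commit hash or tag you want. On a fresh clone the submodule directory may be empty until you run git submodule update --init.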
You can also make changes to the dms-vep-pipeline that you commit back to that repo.
The snakemake pipeline itself is run by the Snakefile, which includes dms-vep-pipeline and reads its configuration from config.yaml.
The conda environment used by the pipeline is the one specified in the environment.yml file in dms-vep-pipeline.
Additional scripts and notebooks that are specific to this analysis and not part of dms-vep-pipeline are in ./scripts/ and ./notebooks/.
Input data for the pipeline are in ./data/.
The results of running the pipeline are placed in ./results/. Only some of these results are tracked to save space (see .gitignore).
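To see whether a given file under ./results/ is tracked or excluded by .gitignore, something like the following can help. It is a sketch using standard git commands; the function name results_status is hypothetical and not part of this repository.

```shell
# Hypothetical helper (not part of this repo): report whether a path is
# tracked by git, excluded by an ignore rule, or simply untracked.
results_status() {
    local path="$1"
    if git ls-files --error-unmatch -- "$path" >/dev/null 2>&1; then
        # The path is in the index, so git tracks it.
        echo "tracked"
    elif git check-ignore -q -- "$path"; then
        # The path matches a pattern in .gitignore (or another exclude file).
        echo "ignored"
    else
        echo "untracked"
    fi
}
```

Call it with a path, for example a file inside ./results/. Running git check-ignore -v on the same path also prints the rule that matched.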
The pipeline builds its HTML documentation in ./docs/, which is rendered via GitHub Pages at https://dms-vep.github.io/SARS-CoV-2_Omicron_BA.2_spike_DMS/.
The design of the mutant library is contained in ./library_design/. That design is not part of the pipeline but contains code that must be run separately with its own conda environment.
To run the pipeline, build the conda environment dms-vep-pipeline specified in the environment.yml file of dms-vep-pipeline, activate it, and run snakemake, such as:
conda activate dms-vep-pipeline
snakemake -j 32 --use-conda --rerun-incomplete
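Before building the environment or starting a run, it can be worth checking that the files described above are in place. The function below is a sketch with a hypothetical name (check_pipeline_inputs is not part of this repository). The commented commands after it are standard conda and snakemake invocations, shown as one plausible sequence rather than the required one.

```shell
# Hypothetical helper (not part of this repo): verify that the files the
# pipeline depends on exist. Run from the top-level directory.
check_pipeline_inputs() {
    local missing=0
    local f
    for f in Snakefile config.yaml dms-vep-pipeline/environment.yml; do
        if [ ! -e "$f" ]; then
            # A missing environment.yml usually means the submodule has not
            # been initialized (git submodule update --init).
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# One possible sequence once the check passes (not executed here):
#   conda env create -f dms-vep-pipeline/environment.yml
#   conda activate dms-vep-pipeline
#   snakemake -n    # dry run: list the jobs without executing them
```

The function returns a nonzero status if any file is missing, so it can be chained with && in front of the build or run command.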
To run on the Hutch cluster via slurm, you can run the file run_Hutch_cluster.bash:
sbatch -c 32 run_Hutch_cluster.bash