Mutant library design

This subdirectory contains the scripts and notebooks that generate the primer pools used for deep mutational scanning of the Delta variant spike. The design is run by the Snakefile, which writes its output to ./results/. The code generates four separate primer pools:

- ./results/gisaid_primer_oPool.csv contains specific forward and reverse primers for every mutation present in GISAID spike sequences.
- ./results/usher_primer_oPool.csv contains specific forward and reverse primers for each independently reoccurring mutation on the SARS-CoV-2 phylogenetic tree.
- ./results/positiveSel_primer_oPool.csv contains random forward and reverse primers for each spike site that is undergoing positive selection.
- ./results/paired_positiveSel_primer_oPool.csv contains forward and reverse primers for generating spike mutants with paired mutations at closely spaced spike residues that are undergoing positive selection.
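
As an illustration of what "closely spaced" could mean for the paired pool, the sketch below lists every pair of positively selected sites that lie within a fixed number of residues of each other. The function name `nearby_site_pairs` and the `max_gap` cutoff are hypothetical; the actual pairing rule is defined in the notebooks and may differ.

```python
from itertools import combinations


def nearby_site_pairs(sites, max_gap=3):
    """Return all pairs of sites that are at most `max_gap` residues apart.

    `max_gap` is an assumed cutoff chosen for illustration only.
    """
    ordered = sorted(set(sites))  # drop duplicates and sort by position
    return [(a, b) for a, b in combinations(ordered, 2) if b - a <= max_gap]


if __name__ == "__main__":
    # Made-up site list, used only to show the call.
    print(nearby_site_pairs([484, 417, 486, 501, 490], max_gap=4))
```

Only sites close enough to share one primer can be mutated together, so a small `max_gap` keeps every pair within a single oligo.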

The final primer sequences, used as ordering sheets for oPools on IDTdna, are the .csv files in ./results/ whose names end with oPool.

The file results/aggregated_mutations.csv lists all mutations designed in each category.

Input data

- GISAID_data/spikeprot0724.fasta contains an alignment of all spike proteins, as downloaded from the Download tab of the EpiCov section of GISAID on July-26-2021. The download yields a zipped .tar file, which was then un-tarred and unzipped. Due to GISAID data sharing terms, this file is not included in the repo.
- The ./reference_sequences subdirectory contains SARS-CoV-2 spike reference sequences and the lookup tables required to renumber positions between variants.
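
Because the GISAID file has to be recreated locally, a minimal sketch of the un-tar and unzip step is given here. It assumes the download is a compressed tar archive that Python's `tarfile` module can read; the function name `extract_spike_archive` and the default output directory are illustrative, not part of this repo.

```python
import tarfile
from pathlib import Path


def extract_spike_archive(archive_path, out_dir="GISAID_data"):
    """Unpack a tar archive and return the FASTA files found in `out_dir`.

    Mode "r:*" lets tarfile detect gzip, bz2, or xz compression on its own.
    Only use this on archives from a trusted source.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:*") as tar:
        tar.extractall(out)
    return sorted(out.rglob("*.fasta"))
```

The returned list should contain the spike FASTA file that the scripts expect under GISAID_data/.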

Scripts

The scripts subdirectory contains the scripts that generate the data. They work as follows:
- spike_positive_selection_sites.py uses SARS-CoV-2 spike protein selection data (as described in this paper) to filter for positions in spike that are undergoing positive selection.
- filter_and_align_gisaid.py uses the sequences in GISAID_data/spikeprot0724.fasta to align all SARS-CoV-2 spike sequences deposited in GISAID as of July-26-2021.
- spike_alignment_counts.py extracts all mutations in the GISAID spike alignments relative to the Wuhan-1 sequence.
- spike_mutcounts.py counts the number of independently reoccurring mutations on the SARS-CoV-2 phylogenetic tree available from UShER.
- 2021Jan_create_primers.py and create_primers_del.py create primers for random or specific amino acid changes, respectively.
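
The idea of extracting mutations relative to a reference can be sketched as follows. This is not the code in spike_alignment_counts.py; it is a simplified illustration that assumes protein sequences already aligned to an ungapped reference of the same length, reports deletions as `-`, and skips ambiguous `X` residues. The function names are hypothetical.

```python
from collections import Counter


def call_mutations(reference, aligned):
    """List differences from the reference, e.g. 'D614G' or 'Y144-'.

    Assumes `aligned` has the same length as `reference` and that the
    reference contains no gaps, so sites follow reference numbering.
    """
    if len(reference) != len(aligned):
        raise ValueError("sequence must be aligned to the reference length")
    mutations = []
    for site, (wt, mut) in enumerate(zip(reference, aligned), start=1):
        if mut != wt and mut != "X":  # 'X' marks an ambiguous residue
            mutations.append(f"{wt}{site}{mut}")
    return mutations


def count_mutations(reference, alignment):
    """Count how many aligned sequences carry each mutation."""
    counts = Counter()
    for seq in alignment:
        counts.update(call_mutations(reference, seq))
    return counts


if __name__ == "__main__":
    # Toy sequences, not real spike data.
    print(count_mutations("MFVFL", ["MFVFL", "MFAFL", "MFAF-", "MXVFL"]))
```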

Notebooks

The ./notebooks subdirectory contains the notebooks used to generate the primer pools found in ./results/primers.
- gisaid_variant_primers.py.ipynb generates specific amino acid primers for each mutation present in the GISAID data.
- usher_primers.py.ipynb generates specific amino acid primers for each independently reoccurring mutation on the SARS-CoV-2 phylogenetic tree.
- positive_selection_primers.py.ipynb generates NNG/NNC primer pools for each position in spike that is undergoing positive selection.
- paired_positive_selection_primers.py.ipynb generates pools of NNG/NNC primers that introduce paired mutations at closely located sites that are undergoing positive selection.
- oPool_primer_sheets.py.ipynb takes the primer pools generated by the notebooks above and formats them as spreadsheets that follow the IDTdna oPool order input format.
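
To make the NNG/NNC design concrete, here is a minimal sketch of building degenerate forward and reverse primers for one codon. It is not the logic of the repo's primer scripts: the fixed `flank` length and the function names are assumptions for illustration, and a real design would likely also tune primer length or melting temperature.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement a DNA sequence; the degenerate base N stays N."""
    return seq.translate(COMPLEMENT)[::-1]


def degenerate_primers(coding_seq, site, flank=15):
    """Return (forward, reverse) primer pairs with NNG and NNC at codon `site`.

    `site` is a 1-based codon number; `flank` is the number of wild-type
    nucleotides kept on each side of the codon (an assumed fixed length).
    """
    start = 3 * (site - 1)
    if start - flank < 0 or start + 3 + flank > len(coding_seq):
        raise ValueError("not enough flanking sequence around this codon")
    upstream = coding_seq[start - flank:start]
    downstream = coding_seq[start + 3:start + 3 + flank]
    primers = []
    for codon in ("NNG", "NNC"):
        forward = upstream + codon + downstream
        primers.append((forward, reverse_complement(forward)))
    return primers


if __name__ == "__main__":
    # Toy coding sequence, not spike.
    for fwd, rev in degenerate_primers("ATG" * 12, site=6, flank=6):
        print(fwd, rev)
```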

Lab notebook

Bernadeta's lab notebook, which covers all experiments done on this project, can be found here.
