This subdirectory contains scripts and notebooks that generate the primer pools used for Omicron_BA.2 variant spike deep mutational scanning experiments. The design is run by the Snakefile, which writes its results to ./results/. The code generates four separate primer pools:
- ./results/gisaid_primer_oPool.csv spreadsheet contains specific forward and reverse primer pools for all mutations present in GISAID spike sequences.
- ./results/usher_primer_oPool.csv spreadsheet contains specific forward and reverse primers for each independently reoccurring mutation on the SARS-CoV-2 phylogenetic tree.
- ./results/positiveSel_primer_oPool.csv spreadsheet contains random forward and reverse primers for each spike site that is undergoing positive selection.
- ./results/paired_positiveSel_primer_oPool.csv spreadsheet contains forward and reverse primers for generating spike mutants with paired mutations at closely located spike residues that are undergoing positive selection.

The final primer sequences used as ordering sheets for oPools on IDTdna are the .csv files ending with oPool in ./results/.
The file results/aggregated_mutations.csv indicates all mutations that are designed in each category.
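
As a quick check after a run, the ordering sheets can be listed by their file name suffix. This is a minimal sketch, assuming only the naming convention stated above (.csv files ending with oPool in ./results/); the helper name `find_opool_sheets` is hypothetical and is not part of this repository.

```python
from pathlib import Path


def find_opool_sheets(results_dir="results"):
    """Return the sorted paths of all *oPool.csv files in results_dir.

    Hypothetical helper: it relies only on the file naming convention
    described in this README, not on the contents of the sheets.
    """
    return sorted(Path(results_dir).glob("*oPool.csv"))
```

Calling `find_opool_sheets("results")` from this subdirectory should return the four pool spreadsheets listed above once the design has been run.
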
- ./reference_sequences subdirectory contains SARS-CoV-2 spike reference sequences and lookup tables required to renumber positions between variants.
- scripts subdirectory contains scripts for generating data. Scripts work as follows:
  - spike_positive_selection_sites.py script uses SARS-CoV-2 spike protein selection data (as described in this paper) to filter for positions in spike that are undergoing positive selection.
  - spike_alignment_counts.py extracts all mutations in GISAID spike alignments relative to the Wuhan-1 sequence using data from CoVsurver.
  - spike_mutcounts.py counts the number of independently reoccurring mutations on the SARS-CoV-2 phylogenetic tree available from UShER.
  - 2021Jan_create_primers.py and create_primers_del.py are scripts that create random or specific amino acid change primers, respectively.
- ./notebooks subdirectory contains notebooks used to generate primer pools found in ./results/primers.
  - gisaid_variant_primers.py.ipynb notebook generates specific amino acid primers for each mutation present in GISAID data.
  - usher_primers.py.ipynb notebook generates specific amino acid primers for each independently reoccurring mutation on the SARS-CoV-2 phylogenetic tree.
  - positive_selection_primers.py.ipynb notebook generates NNG/NNC primer pools for each position on spike that is undergoing positive selection.
  - paired_positive_selection_primers.py.ipynb notebook generates pools of NNG/NNC primers that introduce paired mutations for closely located sites that are undergoing positive selection.
  - oPool_primer_sheets.py.ipynb takes primer pools generated by the notebooks above and formats spreadsheets in accordance with the IDTdna oPool order input format.
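
The difference between random (NNG/NNC) and specific amino acid primers can be illustrated with a short sketch. This is not the code in the scripts or notebooks above. It assumes a fixed flank length on each side of the codon (the real scripts may choose primer lengths differently, for example by melting temperature), writes NNG/NNC as the IUPAC degenerate codon NNS (S = G or C), and assumes a two-column sheet layout (Pool name, Sequence) that should be checked against the current IDTdna oPool template. All function names and parameters are hypothetical.

```python
import csv
import io

# Complement table covering A/C/G/T plus the degenerate bases N and S
# (N and S are their own complements), in both cases.
COMPLEMENT = str.maketrans("ACGTNSacgtns", "TGCANStgcans")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def codon_primers(gene, site, codon="NNS", flank=15):
    """Return (forward, reverse) primers that replace one codon.

    gene  : coding sequence, starting at the first codon
    site  : 1-based codon number to mutate
    codon : "NNS" for a random (NNG/NNC) primer, or a specific
            codon such as "GCT" for a specific amino acid change
    flank : nucleotides kept on each side (assumed fixed here)
    """
    if len(codon) != 3:
        raise ValueError("codon must be three nucleotides long")
    start = 3 * (site - 1)
    end = start + 3
    if start - flank < 0 or end + flank > len(gene):
        raise ValueError("not enough flanking sequence around this site")
    # Flanks in lowercase, mutated codon in uppercase for readability.
    forward = (
        gene[start - flank:start].lower()
        + codon.upper()
        + gene[end:end + flank].lower()
    )
    return forward, reverse_complement(forward)


def opool_rows_to_csv(rows):
    """Format (pool_name, sequence) pairs as CSV text.

    The header names are an assumption about the ordering template.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Pool name", "Sequence"])
    writer.writerows(rows)
    return buffer.getvalue()
```

For example, `codon_primers(gene, site=5, codon="NNS", flank=6)` gives a random primer pair for codon 5 of `gene`, and passing `codon="GCT"` instead gives primers that introduce that specific codon. The reverse primer is the reverse complement of the forward primer, so both anneal at the same site.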