Mutant library design

This subdirectory contains the scripts and notebooks that generate the primer pools used for Omicron_BA.2 variant spike deep mutational scanning experiments. The design is run by the Snakefile, which writes its results to ./results/. The code generates four separate primer pools.

The final primer sequences, formatted as ordering sheets for oPools on IDTdna, are the .csv files in ./results/ whose names end with oPool.
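As a rough illustration of what such an ordering sheet could look like, the sketch below writes one row per primer with a pool name and a sequence. The function name `write_opool_sheet`, the pool name, the example sequences, and the column headers `Pool name` and `Sequence` are all assumptions made for this example; check them against the current IDT oPool template and the actual files in ./results/ before ordering.

```python
import csv
import io


def write_opool_sheet(pool_name, primers, handle, header=("Pool name", "Sequence")):
    """Write one CSV row per primer: pool name, then primer sequence.

    The header names are an assumption, not taken from this repository.
    """
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(header)
    for seq in primers:
        writer.writerow([pool_name, seq])


# Hypothetical pool name and primer sequences, for illustration only.
buffer = io.StringIO()
write_opool_sheet("example_oPool", ["ATGNNGTTT", "ATGNNCTTT"], buffer)
sheet_text = buffer.getvalue()
print(sheet_text)
```

Writing to a file handle rather than a path keeps the sketch easy to test; passing `open("some_file.csv", "w", newline="")` would produce a file on disk instead.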

The file results/aggregated_mutations.csv indicates all mutations that are designed in each category.

Input data

  • ./reference_sequences subdirectory contains SARS-CoV-2 spike reference sequences and lookup tables required to renumber positions between variants.

Scripts

  • scripts subdirectory contains scripts for generating data. Scripts work as follows:
    • spike_positive_selection_sites.py script uses SARS-CoV-2 spike protein selection data (as described in this paper) to filter for positions in spike that are undergoing positive selection.
    • spike_alignment_counts.py extracts all mutations in GISAID spike alignments relative to the Wuhan-1 sequence, using data from CoVsurver.
    • spike_mutcounts.py counts the number of independently recurring mutations on the SARS-CoV-2 phylogenetic tree available from UShER.
    • 2021Jan_create_primers.py and create_primers_del.py are scripts that create random or specific amino acid change primers, respectively.
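
The idea behind extracting mutations relative to a reference can be sketched in a few lines. This is a minimal illustration, not the code in spike_alignment_counts.py: it assumes pairwise-aligned protein sequences of equal length with `-` for gaps, numbers sites by the reference, and the function names and toy sequences are hypothetical.

```python
from collections import Counter


def mutations_relative_to_reference(ref_aligned, query_aligned):
    """List differences such as 'D4G', numbered by reference position.

    Deletions are reported with '-' as the mutant state; insertions
    relative to the reference (gaps in the reference) are skipped.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    mutations = []
    pos = 0  # 1-based position in the ungapped reference
    for ref_aa, query_aa in zip(ref_aligned, query_aligned):
        if ref_aa == "-":
            continue  # insertion in the query; no reference position
        pos += 1
        if query_aa != ref_aa and query_aa != "X":  # ignore ambiguous residues
            mutations.append(f"{ref_aa}{pos}{query_aa}")
    return mutations


def count_mutations(ref_aligned, queries):
    """Count how many query sequences carry each mutation."""
    counts = Counter()
    for query in queries:
        counts.update(mutations_relative_to_reference(ref_aligned, query))
    return counts


# Toy sequences, not real spike data.
counts = count_mutations("MFVD", ["MFLD", "M-VG", "MFLG"])
print(counts)
```

Note that this counts sequences carrying a mutation, which is different from counting independent occurrences on a phylogenetic tree as spike_mutcounts.py is described as doing.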

Notebooks

  • ./notebooks subdirectory contains notebooks used to generate primer pools found in ./results/primers.
    • gisaid_variant_primers.py.ipynb notebook generates specific amino acid primers for each mutation present in GISAID data.
    • usher_primers.py.ipynb notebook generates specific amino acid primers for each independently recurring mutation on the SARS-CoV-2 phylogenetic tree.
    • positive_selection_primers.py.ipynb notebook generates NNG/NNC primer pools for each position on spike that is undergoing positive selection.
    • paired_positive_selection_primers.py.ipynb notebook generates pools of NNG/NNC primers that introduce paired mutations at closely located sites that are undergoing positive selection.
    • oPool_primer_sheets.py.ipynb takes the primer pools generated by the notebooks above and formats spreadsheets in accordance with the IDTdna oPool order input format.
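
To make the NNG/NNC idea concrete, the sketch below builds forward primers in which one codon of a coding sequence is replaced by the degenerate codons NNG and NNC, flanked by the unchanged sequence on either side. It is an illustration under stated assumptions, not the repository's primer-design code: the function name, the fixed flank length, and the toy gene are hypothetical, and real designs typically also tune primer length for melting temperature and produce reverse-complement primers.

```python
def degenerate_codon_primers(gene, site, flank=15, codons=("NNG", "NNC")):
    """Return one forward primer per degenerate codon at a 1-based codon site.

    gene:  coding DNA sequence whose length is a multiple of 3
    site:  1-based codon number to randomize
    flank: number of unchanged nucleotides kept on each side (assumed fixed)
    """
    if len(gene) % 3 != 0:
        raise ValueError("gene length must be a multiple of 3")
    if not 1 <= site <= len(gene) // 3:
        raise ValueError("site is outside the gene")
    start = 3 * (site - 1)  # 0-based index of the first base of the codon
    if start < flank or start + 3 + flank > len(gene):
        raise ValueError("not enough flanking sequence at this site")
    upstream = gene[start - flank:start]
    downstream = gene[start + 3:start + 3 + flank]
    return [upstream + codon + downstream for codon in codons]


# Toy gene of 12 codons, for illustration only.
toy_gene = "ATG" * 12
primers = degenerate_codon_primers(toy_gene, site=6, flank=6)
print(primers)
```

Together, NNG and NNC cover the same codons as the single degenerate codon NNS (S = G or C), which encodes all 20 amino acids while excluding two of the three stop codons.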