espA Mathematical Modeling

Introduction

This repository contains the code for Jiazheng Miao's (MBI Class of 2025) Capstone Project, a mathematical model of espA expression levels in Mycobacterium tuberculosis.

Pipeline

Run the scripts in the following order:

sbatch_download.sh: Download the data listed in dna_accession.txt and rna_accession.txt
sbatch_dnaseq_pe.sh: Process paired-end DNA-seq data
sbatch_dnaseq_se.sh: Process single-end DNA-seq data
sbatch_rnaseq.sh: Process RNA-seq data
organize_data.py: Aggregate the VCF files into a single CSV file, convert RNA read counts to log FPKM, and screen for RD8/RD236a deletions
merge_replicates.R: Combine identified variants and average the expression level across technical replicates
pca.R: Decompose the variant matrix (espA regulatory region excluded) into principal components (PCs)
modeling.Rmd: Perform the modeling
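
The read-count conversion mentioned for organize_data.py can be illustrated with a minimal sketch. This is not the code from the repository: the function name log_fpkm, the log base (2), the pseudocount of 1, and the example counts and gene lengths are all assumptions made for illustration. It uses the standard FPKM definition, count * 1e9 / (gene length in bp * total mapped reads).

```python
import math


def log_fpkm(counts, lengths_bp, pseudocount=1.0):
    """Convert raw read counts to log2(FPKM + pseudocount).

    counts:      dict mapping gene name -> raw read count for one sample
    lengths_bp:  dict mapping gene name -> gene length in base pairs
    pseudocount: added before the log so zero counts stay finite
                 (assumed value; the project may use a different one)
    """
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("sample has no mapped reads")

    result = {}
    for gene, count in counts.items():
        # FPKM = reads per kilobase of gene per million mapped reads
        fpkm = count * 1e9 / (lengths_bp[gene] * total_reads)
        result[gene] = math.log2(fpkm + pseudocount)
    return result


# Hypothetical counts and lengths, for illustration only
example_counts = {"espA": 500, "geneB": 1500}
example_lengths = {"espA": 1000, "geneB": 2000}
example_values = log_fpkm(example_counts, example_lengths)
```

Here total mapped reads are taken as the sum of the per-gene counts in the sample; a pipeline that counts reads outside annotated genes would pass that total in separately.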
