Skip to content

Code for 4th year honours thesis project at the University of Waterloo.

License

Notifications You must be signed in to change notification settings

alex137911/BME499

Repository files navigation

Investigating the prevalence of antibiotic-resistant Clostridiodes difficile in human gut metagenomes

Metagenome study to assess the global prevalence of antibiotic-resistant Clostridioides difficile – a major contributor to hospital-acquired infections – in human gut metagenomes. Developed a single nucleotide polymorphism-based detection pipeline for antibiotic resistance in C. difficile.

Pipeline Overview

  1. download_sra.sh: Downloads and processes SRA files containing hits for both "C. difficile and "human gut metagenome" (i.e., NCBI Taxonomy ID 408170), using accession IDs from a .CSV file. Converts SRA data to FASTQ format using fasterq-dump for each ID listed.

  2. fastp_qc.sh: Performs quality control and trimming on FASTQ files using fastp for each SRA accession ID. Generates HTML and JSON reports for each ID’s quality control.

  3. bowtie2_alignment.sh: Aligns trimmed FASTQ files to a reference genome (FN545816.1) using Bowtie2 and generates BAM files for each SRA accession ID.

  4. extracted_alignment.sh: Processes BAM files to analyze SNPs in specific genes for each SRA accession ID. Verifies the existence and completeness of BAM files, skipping those with valid output. Sorts and indexes BAM files, then calls variants using bcftools. Checks coverage at key mutation sites (within genes gyrA, rpoB, nimB, vanR) and extracts variants if covered. Writes mutation and coverage data to a CSV file, logging any missing or corrupted BAM files.

  5. variantProcessing.py: Processes variant data from C. difficile samples.

About

Code for 4th year honours thesis project at the University of Waterloo.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published