This repo contains scripts and notebooks documenting the ablation studies we performed on CLIPNET. We ran these studies to investigate the extent to which training on matched genomic sequences and molecular profiles improves molecular QTL prediction, as described in this preprint.
The models and data generated by these analyses are deposited on Zenodo.
contains many utility scripts used to process data for training the ablated CLIPNET models.
contains metadata and config files used by some of the data processing scripts.
describes the conda environment used for the data processing pipelines.
contains Snakemake pipelines that download and process data for training the ablated models.
contains example scripts for training the ablated models.
contains notebooks (& instructions) for plotting the accuracy of the ablated models at predicting PRO-cap signal across genomic loci.
contains notebooks (& instructions) for plotting the accuracy of the ablated models at predicting initiation QTL effects.
contains notebooks used for plotting predictions and attributions at selected example loci. These are preliminary and were not used in the paper.