nn-interpretability

This repository contains the code for a university project on interpretability measures in text classification. It compares interpretability measures for a Support Vector Machine (SVM) and a transformer-based model (DistilBERT) on the 20 Newsgroups corpus.

Usage

Dataset

Dataset preprocessing (removing headers, footers, and quotes, and truncating the texts) is done in dataset/dataset.py.
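
The preprocessing described above can be sketched as follows. This is a minimal illustration, not the code in dataset/dataset.py: it assumes the corpus is loaded through scikit-learn's `fetch_20newsgroups`, and the helper names `truncate_text` and `load_split` as well as the 256-word limit are hypothetical.

```python
def truncate_text(text, max_words=256):
    """Keep only the first `max_words` whitespace-separated words.

    The limit of 256 is an assumed value, not taken from the repository.
    Note that splitting and re-joining also collapses runs of whitespace.
    """
    words = text.split()
    return " ".join(words[:max_words])


def load_split(subset="train", max_words=256):
    """Load one split of 20 Newsgroups without headers, footers, and quotes."""
    # Imported lazily so that truncate_text works without scikit-learn.
    from sklearn.datasets import fetch_20newsgroups

    data = fetch_20newsgroups(
        subset=subset,  # "train" or "test"
        remove=("headers", "footers", "quotes"),
    )
    texts = [truncate_text(doc, max_words) for doc in data.data]
    return texts, list(data.target), list(data.target_names)
```

Calling `load_split("train")` downloads the corpus on first use and returns the truncated texts, their integer labels, and the newsgroup names.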

SVM

tune model and vectorizer parameters: svm-parameter-tuning.py
train/test model and generate coefficient outputs: svm.py
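
As a rough sketch of the two steps above, the following assumes a TF-IDF vectorizer feeding a linear SVM, tuned with a grid search. The pipeline layout, the parameter grid, and the helper names `build_search` and `top_features` are assumptions for illustration; the actual choices are in svm-parameter-tuning.py and svm.py.

```python
def build_search(param_grid=None, cv=3):
    """Build a grid search over vectorizer and SVM parameters."""
    # Imported lazily so that top_features works without scikit-learn.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.svm import LinearSVC

    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("svm", LinearSVC()),
    ])
    if param_grid is None:
        # Assumed example grid, not the one used in the project.
        param_grid = {
            "tfidf__ngram_range": [(1, 1), (1, 2)],
            "tfidf__min_df": [1, 5],
            "svm__C": [0.1, 1.0, 10.0],
        }
    return GridSearchCV(pipeline, param_grid, cv=cv, n_jobs=-1)


def top_features(coef_row, feature_names, k=10):
    """Return the k (feature, weight) pairs with the largest weights."""
    pairs = sorted(zip(feature_names, coef_row),
                   key=lambda pair: pair[1], reverse=True)
    return [(name, float(weight)) for name, weight in pairs[:k]]
```

After `search = build_search()` and `search.fit(texts, labels)`, the fitted steps are available through `search.best_estimator_.named_steps`. With more than two classes, the `coef_` attribute of the `"svm"` step holds one row of weights per class, and `get_feature_names_out()` on the `"tfidf"` step gives the matching vocabulary, so `top_features(coef_[i], feature_names)` lists the words that push most strongly towards class `i`.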

DistilBERT

You can either fine-tune the model yourself or download our fine-tuned model, and then compute the attributions.

fine-tune the model yourself: models/model.py
alternatively, download the fine-tuned model from Google Drive and save it to the models/ directory
compute the attributions with captum-explain.py, which runs the Captum pipeline in explainer.py; you may need to adjust the model file path in captum-explain.py first
analyse the attributions in outputs/distilbert/
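
For orientation, token-level attributions for a DistilBERT classifier can be computed with Captum roughly as below. This sketch assumes Layer Integrated Gradients over the embedding layer with a padding-token baseline; the method actually used in explainer.py may differ. The function names `token_attributions` and `merge_wordpieces` are hypothetical, and merging word pieces by summing their scores is an assumed convention.

```python
def token_attributions(model, tokenizer, text, target, n_steps=50):
    """Attribute the logit of class `target` to the input tokens.

    `model` is assumed to be a Hugging Face
    DistilBertForSequenceClassification and `tokenizer` its tokenizer.
    """
    # Imported lazily so that merge_wordpieces works without these packages.
    import torch
    from captum.attr import LayerIntegratedGradients

    model.eval()
    enc = tokenizer(text, return_tensors="pt", truncation=True)
    input_ids = enc["input_ids"]
    attention_mask = enc["attention_mask"]

    # Baseline: all [PAD], keeping the special tokens at both ends.
    baseline = torch.full_like(input_ids, tokenizer.pad_token_id)
    baseline[0, 0] = input_ids[0, 0]
    baseline[0, -1] = input_ids[0, -1]

    def forward(ids, mask):
        return model(input_ids=ids, attention_mask=mask).logits

    lig = LayerIntegratedGradients(forward, model.distilbert.embeddings)
    attr = lig.attribute(
        inputs=input_ids,
        baselines=baseline,
        additional_forward_args=(attention_mask,),
        target=target,
        n_steps=n_steps,
    )
    # Sum over the embedding dimension to get one score per token.
    scores = attr.sum(dim=-1).squeeze(0)
    tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
    return list(zip(tokens, scores.tolist()))


def merge_wordpieces(token_scores):
    """Sum the scores of '##' continuation pieces into the preceding word."""
    merged = []
    for token, score in token_scores:
        if token.startswith("##") and merged:
            prev_token, prev_score = merged[-1]
            merged[-1] = (prev_token + token[2:], prev_score + score)
        else:
            merged.append((token, score))
    return merged
```

Merging word pieces yields one score per word, which makes the DistilBERT attributions easier to set side by side with per-word SVM coefficients.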

Analyses

The analyses are conducted in three notebooks:

analysis-distilbert.ipynb contains the analyses of the DistilBERT attributions
analysis-predictions.ipynb compares the scores for the test set instances (SVM coefficients in outputs/coefs_test.csv and DistilBERT attributions in outputs/distilbert_attributions.csv) and creates visualizations of specific instances in outputs/viz
analysis-vocabs.ipynb compares the scores over the general vocabulary (SVM coefficients in outputs/vocab_coef_svm.csv and DistilBERT attributions in outputs/vocab_attr_dist4_gold.csv and outputs/vocab_attr_dist4_pred.csv)

Authors

Lydia Körber and Lisanne Rüh
