This repository contains the code for a university project on interpretability measures in text classification. It investigates and compares interpretability measures for Support Vector Machine (SVM) and transformer-based (DistilBERT) text classifiers on the 20 Newsgroups corpus.
- dataset preprocessing (removing headers, footers, and quotes, and truncating the texts) is done in `dataset/dataset.py`
- tune model and vectorizer parameters: `svm-parameter-tuning.py`
- train/test the model and generate coefficient outputs: `svm.py`
You can either fine-tune the DistilBERT model yourself or download our fine-tuned model and compute attributions.
- fine-tune the model yourself: `models/model.py`
- download the fine-tuned model from Google Drive and save it to the `models/` directory; you may need to adjust the file path in `captum-explain.py` to run the attribution computations with the Captum pipeline in `explainer.py`
- analyse the attributions in `outputs/distilbert/`
Analyses are conducted in three different notebooks:
- `analysis-distilbert.ipynb` contains the analyses of the DistilBERT attributions
- `analysis-predictions.ipynb` compares the scores for the test-set instances (SVM coefficients in `outputs/coefs_test.csv` and DistilBERT attributions in `outputs/distilbert_attributions.csv`) and creates some visualizations of specific instances in `outputs/viz`
- `analysis-vocabs.ipynb` compares the scores over the general vocabulary (SVM coefficients in `outputs/vocab_coef_svm.csv` and DistilBERT attributions in `outputs/vocab_attr_dist4_gold.csv` and `outputs/vocab_attr_dist4_pred.csv`)
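The vocabulary-level comparison can be sketched as follows. The column names (`token`, `score`) and the toy data are assumptions for illustration; the actual notebook reads the CSV files listed above.

```python
# Minimal sketch of the comparison in analysis-vocabs.ipynb: align SVM
# coefficients and DistilBERT attributions on the shared vocabulary and
# compute a rank correlation. Toy data; column names are assumptions.
import pandas as pd

svm = pd.DataFrame({"token": ["puck", "gpu", "render"], "score": [1.2, -0.8, -0.5]})
dist = pd.DataFrame({"token": ["gpu", "puck", "render"], "score": [-0.6, 0.9, -0.4]})

# Inner join on the token column keeps only the shared vocabulary.
merged = svm.merge(dist, on="token", suffixes=("_svm", "_dist"))
rho = merged["score_svm"].corr(merged["score_dist"], method="spearman")
print(f"Spearman rank correlation: {rho:.2f}")
```

Spearman's rank correlation is a natural choice here because the two score scales (SVM weights vs. attribution values) are not directly comparable, while their orderings are.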
Lydia Körber and Lisanne Rüh