Add ModernBERT config (#119)
Adds a ModernBERT config for the original toxic comment classification challenge, using the ModernBERT-base model.

ModernBERT is a BERT-like encoder architecture that adopts more recent techniques, such as RoPE (rotary position embeddings) for long context and flash attention for faster inference. It was trained on a variety of mainly English sources, including scientific articles and code.

Training currently requires installing transformers from git.
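
One way to check this requirement before training is to test whether the installed transformers build exposes the model class named in the config. This is only a minimal sketch: the helper name `has_modernbert` is hypothetical and not part of this repository, and it only checks that the class name is present, not that training will succeed.

```python
import importlib


def has_modernbert(module_name: str = "transformers") -> bool:
    """Return True if the module exposes ModernBertForSequenceClassification.

    `has_modernbert` is a hypothetical helper for illustration only.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not installed at all.
        return False
    # Older builds that predate ModernBERT will not have this attribute.
    return hasattr(module, "ModernBertForSequenceClassification")
```

Calling `has_modernbert()` with no arguments inspects the installed transformers package; a `False` result suggests the build predates ModernBERT support.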
jamt9000 authored Jan 4, 2025
1 parent c77352b commit 03ace8a
Showing 1 changed file with 40 additions and 0 deletions.
40 changes: 40 additions & 0 deletions configs/Toxic_comment_classification_ModernBERT.json
@@ -0,0 +1,40 @@
{
    "name": "Jigsaw_ModernBERT",
    "n_gpu": 1,
    "batch_size": 10,
    "accumulate_grad_batches": 3,
    "loss": "binary_cross_entropy",
    "arch": {
        "type": "ModernBERT",
        "args": {
            "num_classes": 6,
            "model_type": "answerdotai/ModernBERT-base",
            "model_name": "ModernBertForSequenceClassification",
            "tokenizer_name": "AutoTokenizer"
        }
    },
    "dataset": {
        "type": "JigsawDataOriginal",
        "args": {
            "train_csv_file": "jigsaw_data/jigsaw-toxic-comment-classification-challenge/train.csv",
            "test_csv_file": "jigsaw_data/jigsaw-toxic-comment-classification-challenge/val.csv",
            "add_test_labels": false,
            "classes": [
                "toxicity",
                "severe_toxicity",
                "obscene",
                "threat",
                "insult",
                "identity_attack"
            ]
        }
    },
    "optimizer": {
        "type": "Adam",
        "args": {
            "lr": 3e-5,
            "weight_decay": 3e-6,
            "amsgrad": true
        }
    }
}
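
The config above implies two quantities worth sanity-checking: the number of samples contributing to each optimizer step, and agreement between `num_classes` and the `classes` list. The sketch below derives both from a trimmed copy of the config. It assumes `accumulate_grad_batches` has its usual gradient-accumulation meaning (one optimizer step every 3 batches) and that training runs on the single GPU requested by `n_gpu`; the `summarize` helper is hypothetical and not part of this repository.

```python
import json

# Trimmed copy of the config above; only the fields used below are kept.
CONFIG_TEXT = """
{
    "batch_size": 10,
    "accumulate_grad_batches": 3,
    "arch": {"args": {"num_classes": 6}},
    "dataset": {"args": {"classes": [
        "toxicity", "severe_toxicity", "obscene",
        "threat", "insult", "identity_attack"
    ]}}
}
"""


def summarize(config: dict) -> dict:
    """Derive the effective batch size and check label consistency.

    Hypothetical helper for illustration only.
    """
    classes = config["dataset"]["args"]["classes"]
    num_classes = config["arch"]["args"]["num_classes"]
    if num_classes != len(classes):
        raise ValueError(
            f"num_classes={num_classes} but {len(classes)} classes are listed"
        )
    # Assumption: gradients are accumulated over this many batches per step.
    effective = config["batch_size"] * config["accumulate_grad_batches"]
    return {"effective_batch_size": effective, "num_classes": num_classes}


config = json.loads(CONFIG_TEXT)
summary = summarize(config)
print(summary)
```

Under these assumptions, each optimizer step sees 10 × 3 = 30 samples, and the six listed labels match `num_classes`.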
