Training script for the Bert-based model on the NLI dataset #85

Open
Hominnn opened this issue Jun 28, 2024 · 2 comments

Hominnn commented Jun 28, 2024

Dear author, I want to train the bert-base-uncased model on the NLI dataset using your method for some research. Could you provide the relevant training scripts so that I can better reproduce your experimental results? Below is my training script, which uses the same data as your training; with it I cannot reproduce the evaluation results of your angle-bert-base-uncased-nli-en-v1 model.

CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 --master_port=1234 train_nli.py \
--task NLI-STS \
--output_dir ckpts/NLI-STS-bert-cls \
--model_name_or_path ../models/bert-base-uncased \
--learning_rate 5e-5 \
--maxlen 50 \
--epochs 1 \
--batch_size 10 \
--logging_steps 500 \
--warmup_steps 0 \
--save_steps 1000 \
--seed 42 \
--do_eval 0 \
--gradient_accumulation_steps 4 \
--fp16 1 \
--torch_dtype 'float32' \
--pooling_strategy 'cls'

This is my evaluation result on STS:
[image: screenshot of STS evaluation scores]

SeanLee97 (Owner) commented

Hello @Hominnn, the training code train_nli.py is quite old; it is recommended to use angle-trainer now.

I've updated the NLI document: https://github.com/SeanLee97/AnglE/blob/main/examples/NLI/README.md#41-bert
You can find the new training script in the document.
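
For reference, here is a minimal Python sketch of the same workflow using the angle_emb library directly. This is only a sketch: the argument names follow the angle-emb README and may differ across versions, and the dataset path and hyperparameters below are placeholders, not the exact recipe from the document.

from datasets import load_dataset
from angle_emb import AnglE, AngleDataTokenizer

# Backbone and pooling as in the BERT example: CLS pooling on bert-base-uncased.
angle = AnglE.from_pretrained('bert-base-uncased',
                              max_length=50,
                              pooling_strategy='cls').cuda()

# Placeholder dataset path -- prepare NLI data with text1/text2/label columns
# as described in the linked README.
ds = load_dataset('path/to/your-nli-dataset')
train_ds = ds['train'].shuffle().map(
    AngleDataTokenizer(angle.tokenizer, angle.max_length), num_proc=8)

angle.fit(
    train_ds=train_ds,
    output_dir='ckpts/NLI-STS-bert-cls',
    batch_size=32,
    epochs=10,
    learning_rate=5e-5,
    warmup_steps=0,
    gradient_accumulation_steps=4,
    logging_steps=100,
    save_steps=1000,
    fp16=True,
)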

To run it successfully:

  1. please upgrade angle-emb to the latest version via python -m pip install -U angle-emb

  2. please use the latest evaluation code: https://github.com/SeanLee97/AnglE/blob/main/examples/NLI/eval_nli.py (a quick sanity-check sketch follows this list)

  3. if you want to push your model to Hugging Face, please set --push_to_hub 1 and specify a model id under your namespace via --hub_model_id xxx; otherwise, set --push_to_hub 0.
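
As a quick sanity check before running the full evaluation script, you can load the trained checkpoint back and encode a sentence pair. A minimal sketch, assuming the checkpoint path is the output_dir used during training:

from angle_emb import AnglE

# Load the trained checkpoint (hypothetical path = the training output_dir).
angle = AnglE.from_pretrained('ckpts/NLI-STS-bert-cls',
                              pooling_strategy='cls').cuda()

# Encode two paraphrases; a well-trained model should give them similar vectors.
vecs = angle.encode(['A man is playing a guitar.',
                     'A person plays the guitar.'])
print(vecs.shape)  # (2, 768) for bert-base-uncased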

Here are the intermediate results (after about 9 epochs) of my run:

+-------+-------+-------+-------+-------+--------------+-----------------+-------+
| STS12 | STS13 | STS14 | STS15 | STS16 | STSBenchmark | SICKRelatedness |  Avg. |
+-------+-------+-------+-------+-------+--------------+-----------------+-------+
| 75.59 | 84.83 | 80.37 | 86.26 | 81.96 |    85.12     |      80.70      | 82.12 |
+-------+-------+-------+-------+-------+--------------+-----------------+-------+

You can try increasing epochs, ibn_w, or gradient_accumulation_steps for better results.
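
With the Python API, for instance, these knobs map onto fit arguments and the loss weights. The loss_kwargs key names below follow recent angle-emb versions and should be verified against your installed version; treat this as a sketch rather than the exact recipe:

# Hypothetical tweak of the run above: more epochs, a larger effective batch,
# and a higher in-batch-negative (ibn) loss weight.
angle.fit(
    train_ds=train_ds,
    output_dir='ckpts/NLI-STS-bert-cls-v2',
    epochs=20,
    gradient_accumulation_steps=8,
    loss_kwargs={'cosine_w': 1.0, 'ibn_w': 2.0, 'angle_w': 1.0},
)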

I am still training several models with different hyperparameters; I will let you know the better hyperparameters when they are done.

Hominnn (Author) commented Jun 28, 2024

Thank you for your thorough reply. Looking forward to more of your meaningful work!
