After setting up the safe-rlhf directory structure, run the following from inside safe-rlhf:
bash scripts/sft-my.sh
git clone https://github.com/GAIR-NLP/abel.git
Then, after setting up the abel environment:
cd abel
bash evaluation/eval.sh
cd ..
python score.py --base_line abel/outputs/ra_outputs/math/70b.jsonl --pred_file abel/outputs/onlynum_outputs/math/70b.jsonl
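Before running score.py, it can help to confirm that the evaluation step actually produced both output files. The following is a minimal sketch, not part of either repository: `check_outputs` is a hypothetical helper name, and the only paths used are the two passed to score.py above. It assumes that an empty file means the evaluation did not finish.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of safe-rlhf or abel):
# report any argument that is missing or an empty file.
check_outputs() {
    local missing=0
    local f
    for f in "$@"; do
        # -s is true only if the file exists and has a size greater than zero
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Paths taken from the score.py command above.
if check_outputs \
    abel/outputs/ra_outputs/math/70b.jsonl \
    abel/outputs/onlynum_outputs/math/70b.jsonl; then
    echo "both output files found"
else
    echo "run evaluation/eval.sh first" >&2
fi
```

Run it from the directory that contains abel, the same place score.py is invoked from.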