This example quantizes microsoft/codebert-base fine-tuned on the code defect detection task.
pip install neural-compressor
pip install -r requirements.txt
Note: this example is validated against specific ONNX Runtime versions.
Run the prepare_data.sh script to download the dataset to the dataset folder and pre-process it:
bash prepare_data.sh
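The pre-processed splits (e.g. ./dataset/train.jsonl, ./dataset/valid.jsonl) are JSON Lines files. A minimal sketch of inspecting them, assuming the upstream defect-detection fields are named "func" (source code) and "target" (defect label) — a tiny mock file is created here so the sketch is self-contained:

```python
# Hedged sketch: inspect a pre-processed defect-detection split.
# Field names ("func", "target") are assumed from the upstream dataset.
import json

# Mock rows standing in for ./dataset/valid.jsonl.
rows = [
    {"func": "int f(char *s) { return strlen(s); }", "target": 1},
    {"func": "int g(void) { return 0; }", "target": 0},
]
with open("valid_sample.jsonl", "w") as fh:
    for r in rows:
        fh.write(json.dumps(r) + "\n")

# Reading mirrors what a dataloader over the real split would do.
with open("valid_sample.jsonl") as fh:
    examples = [json.loads(line) for line in fh]
defective = sum(ex["target"] for ex in examples)
print(f"{len(examples)} examples, {defective} labeled defective")
```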
Fine-tune the model on the code defect detection task:
bash run_fine_tuning.sh --train_dataset_location=./dataset/train.jsonl --dataset_location=./dataset/valid.jsonl --fine_tune
Export model to ONNX format.
# By default, the input model path is `checkpoint-best-acc/`.
python prepare_model.py --input_model=./checkpoint-best-acc --output_model=./codebert-exported-onnx
Static quantization with QOperator format:
# input model path as *.onnx
bash run_quant.sh --input_model=/path/to/model \
                  --output_model=/path/to/model_tune \
                  --dataset_location=./dataset/valid.jsonl
Run benchmark:
# input model path as *.onnx
bash run_benchmark.sh --input_model=/path/to/model \
                      --dataset_location=./dataset/valid.jsonl \
                      --batch_size=batch_size \
                      --mode=performance # or accuracy