Model created by Demid Efremov as part of the PMLDL course.
The model aims to reduce the toxicity of texts by paraphrasing sentences with a fine-tuned t5-small language model.
This project uses the HuggingFace Transformers library to run text2text models.
Requirements can be installed with:
$ pip install -r requirements.txt
- Install the requirements
- Download the model here, and unzip it into the models folder.
- Run the inference script with
$ python ./src/models/predict_model.py
Command-line parameters can be listed with:
$ python ./src/models/predict_model.py -h
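For orientation, here is a minimal sketch of what seq2seq inference with the HuggingFace Transformers library typically looks like. It is not the contents of `predict_model.py`: the helper names `load_model` and `paraphrase`, the folder `models/t5-small-detox`, and the `max_new_tokens` default are assumptions made for illustration, and the actual script may add a task prefix or use different generation settings.

```python
from typing import List


def load_model(model_dir: str):
    """Load a seq2seq checkpoint and its tokenizer from a local folder."""
    # Imported lazily so that paraphrase() can be used (or tested with
    # stand-in objects) without transformers being installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)
    model.eval()  # switch off dropout for inference
    return tokenizer, model


def paraphrase(texts: List[str], tokenizer, model, max_new_tokens: int = 64) -> List[str]:
    """Generate one paraphrase per input sentence."""
    # Tokenize the whole batch, padding to the longest sentence.
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    # generate() runs the decoder and returns a batch of token ids.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Convert token ids back to text, dropping padding and end-of-sequence tokens.
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)


def main() -> None:
    # "models/t5-small-detox" is a placeholder, not the repository's real folder name.
    tokenizer, model = load_model("models/t5-small-detox")
    for line in paraphrase(["An example input sentence."], tokenizer, model):
        print(line)
```

Calling `main()` after unzipping a checkpoint into the placeholder folder would print one paraphrase per input sentence.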
- Install the requirements
- Run the dataset downloader with
$ python ./src/data/download_dataset.py
- Run the dataset splitter with
$ python ./src/data/split_dataset.py
- Run the training with
$ python ./src/models/train_model.py
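The dataset splitting step can be pictured with the sketch below. It shows a generic seeded shuffle-and-split, not the logic of `split_dataset.py` itself: the function name, the `val_fraction` and `seed` parameters, and the pair layout in the usage lines are all assumptions.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(
    rows: Sequence[T], val_fraction: float = 0.1, seed: int = 42
) -> Tuple[List[T], List[T]]:
    """Shuffle rows reproducibly and split them into (train, validation)."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    shuffled = list(rows)
    # A dedicated Random instance keeps the split reproducible without
    # touching the global random state.
    random.Random(seed).shuffle(shuffled)
    # Keep at least one validation example when the data allows it.
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical usage with (toxic, neutral) sentence pairs.
pairs = [(f"toxic {i}", f"neutral {i}") for i in range(10)]
train, val = split_dataset(pairs, val_fraction=0.2, seed=0)
```

Fixing the seed means repeated runs produce the same split, which keeps training and evaluation comparable.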
Use the -h flag with any script to see its configurable parameters. See the notebooks for additional details.
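For readers unfamiliar with how a `-h` flag usually comes about in Python scripts, the sketch below uses the standard `argparse` module, which generates the help text automatically. The option names (`--model-dir`, `--max-new-tokens`, `text`) are hypothetical and are not taken from this repository's scripts; run the scripts with `-h` to see their real options.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative command-line parser; -h/--help is added by argparse."""
    parser = argparse.ArgumentParser(
        description="Paraphrase toxic text (illustrative CLI, hypothetical options)."
    )
    # Hypothetical option: folder containing the unzipped model.
    parser.add_argument("--model-dir", default="models", help="folder with the model files")
    # Hypothetical option: upper bound on generated tokens.
    parser.add_argument("--max-new-tokens", type=int, default=64, help="generation length limit")
    # Zero or more sentences passed directly on the command line.
    parser.add_argument("text", nargs="*", help="sentences to paraphrase")
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration.
args = build_parser().parse_args(["--max-new-tokens", "32", "hello"])
```

Each `help=` string appears in the text that `-h` prints, so the help output stays in sync with the options the script accepts.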