This project is designed to train and use an Optical Character Recognition (OCR) model for recognizing characters in CAPTCHA images.
- `mb_capcha_ocr/`: Contains the core OCR model and prediction logic.
- `train_model/`: Contains the training script for the OCR model.
- Clone the repository:

      git clone https://github.com/thedtvn/mbbank-capcha-ocr
      cd mbbank-capcha-ocr
- Create and activate a virtual environment:

      python -m venv .venv
      source .venv/bin/activate  # On Windows use `.venv\Scripts\activate`
- Install the required dependencies:

      pip install -r requirements.txt
- Place your training and testing images in the `dataset/` directory. Name each image `{label}.(png|jpg|jpeg)`, where `{label}` is the text shown in the CAPTCHA.
- Run the training script:

      python train_model/train.py
- The trained model will be saved as `model.pt` in the root directory.
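The naming rule above means each file name carries its own label. As a rough sketch of how such a directory could be read (the helper name `list_labeled_images` is hypothetical, and the actual logic in `train_model/train.py` may differ), the following collects `(path, label)` pairs using only the standard library. Accepting upper-case extensions such as `.PNG` is an assumption, not something the project states.

```python
from pathlib import Path

# Extensions taken from the naming rule {label}.(png|jpg|jpeg).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_labeled_images(dataset_dir):
    """Return (path, label) pairs for the image files directly inside dataset_dir.

    Hypothetical helper: the label is the file name without its extension.
    Files with other extensions are skipped.
    """
    pairs = []
    for path in sorted(Path(dataset_dir).iterdir()):
        # Assumption: extension matching is case-insensitive.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            pairs.append((path, path.stem))
    return pairs
```

Calling `list_labeled_images("dataset")` before training is a quick way to check that every image was picked up and that the labels look right.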
- Import the `predict` function from the `mb_capcha_ocr` module:

      from mb_capcha_ocr.predict import predict
- Use the `predict` function to get the predicted text from an image:

      from PIL import Image

      img = Image.open("path_to_image.png")
      predicted_text = predict(img)
      print(predicted_text)
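Because labeled images are named after their text, the same files can be used for a rough accuracy check of a trained model. The sketch below is not part of the repository: `exact_match_accuracy` is a hypothetical helper that takes the prediction function and the image loader as parameters, and it assumes the prediction function returns a plain string.

```python
from pathlib import Path


def exact_match_accuracy(image_paths, predict_fn, load_image):
    """Return the fraction of images whose prediction equals the file-name label.

    Hypothetical helper. predict_fn maps a loaded image to a string;
    load_image maps a path to whatever predict_fn expects.
    """
    paths = [Path(p) for p in image_paths]
    if not paths:
        return 0.0
    correct = 0
    for path in paths:
        predicted = predict_fn(load_image(path))
        # The label is the file name without its extension.
        if predicted == path.stem:
            correct += 1
    return correct / len(paths)
```

With this project it might be called as `exact_match_accuracy(paths, predict, Image.open)`, where `paths` is a list of files from `dataset/`.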
- `train_model/train.py`: Script to train the OCR model.
- `mb_capcha_ocr/predict.py`: Script to predict text from an image using the trained OCR model.
- `requirements.txt`: List of dependencies required for the project.
- Python 3.x
- torch
- torchvision
- matplotlib
- Pillow
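To confirm that the packages listed above are visible to the active virtual environment, a small standard-library check can help. This is an illustrative sketch with a hypothetical helper name, `missing_modules`; note that Pillow is imported under the module name `PIL`.

```python
from importlib.util import find_spec

# Import names for the packages listed above; Pillow is imported as "PIL".
REQUIRED_MODULES = ["torch", "torchvision", "matplotlib", "PIL"]


def missing_modules(module_names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if find_spec(name) is None]
```

An empty list from `missing_modules()` suggests the installation step completed; any names returned can be installed again with `pip install -r requirements.txt`.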
This project is licensed under the MIT License. See the `LICENSE` file for more details.