Training data generator for Text Detection and Text Recognition

xReniar/OCRD-generator


OCR dataset generator

Training data generator for Text Detection and Text Recognition. The training data is generated in the format required by each supported OCR system. The supported OCR systems are:

At the moment, the datasets that can be used to generate the training data are IAM, SROIE, and FUNSD.

How to use

main.py needs a JSON file that specifies the full configuration for the training data:

{
    "name": "output_folder_name",
    "task": "training_data_task",
    "datasets": [
        "dataset_1",
        "..."
    ]
}
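
The config file can also be produced from a script, which is handy when generating several variants. The following is a minimal sketch, assuming only the three keys shown above; the helper name `write_config` is hypothetical and not part of this repository, and the values accepted for "task" and the exact spelling expected for dataset names are not documented here, so treat any values you pass as placeholders.

```python
import json
from pathlib import Path


def write_config(path, name, task, datasets):
    """Write a JSON config with the three keys shown above (hypothetical helper)."""
    config = {
        "name": name,                # output folder name
        "task": task,                # training data task (accepted values not documented here)
        "datasets": list(datasets),  # dataset names (exact spelling is an assumption)
    }
    path = Path(path)
    # Create the parent folder (e.g. config/) if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config
```

For example, `write_config("config/config.json", "output_folder_name", "training_data_task", ["dataset_1"])` writes a file with the same shape as the template above.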

To start the generation process, run:

python3 main.py --config config/config.json
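
Before running the command, it can help to check that the config file parses and contains the expected keys. The sketch below is not the repository's main.py; it only illustrates how a script might read a `--config` argument and validate the file, assuming the three keys from the template are all required. The names `parse_args`, `load_config`, and `REQUIRED_KEYS` are hypothetical.

```python
import argparse
import json

# Assumption: these are the keys shown in the config template.
REQUIRED_KEYS = ("name", "task", "datasets")


def parse_args(argv=None):
    """Parse a --config argument pointing at the JSON config file."""
    parser = argparse.ArgumentParser(description="Check a generator config file")
    parser.add_argument("--config", required=True, help="path to the JSON config file")
    return parser.parse_args(argv)


def load_config(path):
    """Load the JSON config and fail early if a key is missing or malformed."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError("missing config keys: " + ", ".join(missing))
    if not isinstance(config["datasets"], list):
        raise ValueError("'datasets' must be a list of dataset names")
    return config


if __name__ == "__main__":
    args = parse_args()
    print(load_config(args.config))
```

Saved under a name of your choice, this can be run with the same `--config config/config.json` argument to catch a malformed file before starting a generation run.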

Adding new datasets
