Training German Print

Training of kraken model for German print

german_print is a model that was trained from 2023-05-15 to 2023-05-16 to recognize printed German and Latin texts. The goal was a generic model that can be used for prints of all periods, from 15th-century incunabula to 20th-century prints.

The latest model file german_print.mlmodel for Kraken OCR is available from Zenodo:
Weil, S., Kamlah, J., & Schmidt, T. (2023). OCR model for German prints trained from several datasets. Zenodo. https://doi.org/10.5281/zenodo.10519596

Data from all training processes including intermediate results can be found at https://ub-backup.bib.uni-mannheim.de/~stweil/tesstrain/kraken/german_print/.

Ground Truth

The training used four ground truth data sets with a total of 676 PAGE XML files and 187482 text lines. These data sets include texts from the 15th to the 20th century, with a strong focus on 19th-century newspapers.

Preparing data for training

Most data sets include only the ground truth texts and rules for obtaining the images, so the missing images must be downloaded before the training can start.
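
Before compiling, it can help to list the PAGE XML files whose image is still missing. The following is a minimal sketch and not part of the original workflow. It assumes that each image is stored next to its XML file and is named by the `imageFilename` attribute of the PAGE `Page` element; the function name `missing_images` is made up for illustration.

```shell
#!/bin/sh
# Print every PAGE XML file from a list whose referenced image is missing.
# Assumption: the image lives in the same directory as the XML file and its
# name is given by the imageFilename attribute.
missing_images() {
    while IFS= read -r xml; do
        # Extract the first imageFilename="..." value from the XML file.
        img=$(sed -n 's/.*imageFilename="\([^"]*\)".*/\1/p' "$xml" | head -n 1)
        # Skip files without an imageFilename attribute.
        [ -n "$img" ] || continue
        # Report the XML file if the image does not exist.
        [ -f "$(dirname "$xml")/$img" ] || printf '%s\n' "$xml"
    done < "$1"
}
```

Called as `missing_images liste` on the file list built in the next section, this would print the XML files that are likely to trigger "Could not open file" warnings during compilation.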

Training with large dataset

Prepare a shuffled list of all PAGE XML files:

find ~/src/github/UB-Mannheim/AustrianNewspapers/data/{TrainingSet_ONB_Newseye_GT_M1+,ValidationSet_ONB_Newseye_GT_M1+}/GT-PAGE -name "*.xml" > liste
find ~/src/github/UB-Mannheim/digi-gt/PPN* -name "*.xml" >> liste
find ~/src/github/UB-Mannheim/digitue-gt/{Theo,Tue,VD18} -name "*.xml" >> liste
find ~/src/github/UB-Mannheim/reichsanzeiger-gt/data/reichsanzeiger-1820-1939/GT-PAGE -name "*.xml" >> liste

shuf liste  > liste.shuf
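
`shuf` produces a different order on every call, so a rerun of the training does not see the same list. If a repeatable order is wanted, one option is a seeded shuffle. The sketch below does not use `shuf`; it swaps in a decorate-sort-undecorate approach with awk's `rand()`. The function name `seeded_shuffle` and the seed value are illustrative assumptions, and the resulting order depends on the awk implementation in use.

```shell
#!/bin/sh
# Deterministic shuffle: prefix each line with a pseudo-random key derived
# from a fixed seed, sort by that key, then strip the key again.
# Assumption: the file names contain no tab characters.
seeded_shuffle() {
    LC_ALL=C awk -v seed="$2" \
        'BEGIN { srand(seed) } { printf "%.15f\t%s\n", rand(), $0 }' "$1" |
        LC_ALL=C sort -k1,1 |
        cut -f2-
}
```

A call such as `seeded_shuffle liste 42 > liste.shuf` would then give the same list on every run with the same awk.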

Compile the training data:

time ketos compile --format-type xml --files liste.shuf --workers 36 -o /data/stweil/german_print2.arrow

[05/15/23 08:25:11] WARNING  Could not open file /home/stweil/src/github/UB-Mannheim/digitue-gt/Tue/JfI155/2_8783e_default.jpg in /home/stweil/src/github/UB-Mannheim/digitue-gt/Tue/JfI155/2_8783e_default.xml  arrow_dataset.py:172
[05/15/23 08:25:12] WARNING  Could not open file /home/stweil/src/github/UB-Mannheim/digi-gt/PPN477380670/477380670_0114.jpg in /home/stweil/src/github/UB-Mannheim/digi-gt/PPN477380670/477380670_0114.xml  arrow_dataset.py:172
[05/15/23 08:25:16] WARNING  Could not open file /home/stweil/src/github/UB-Mannheim/digitue-gt/Tue/JfI155/5_ca222_default.jpg in /home/stweil/src/github/UB-Mannheim/digitue-gt/Tue/JfI155/5_ca222_default.xml  arrow_dataset.py:172
[05/15/23 08:25:17] WARNING  Could not open file /home/stweil/src/github/UB-Mannheim/digi-gt/PPN477380670/477380670_0115.jpg in /home/stweil/src/github/UB-Mannheim/digi-gt/PPN477380670/477380670_0115.xml  arrow_dataset.py:172
[05/15/23 08:25:20] WARNING  Could not open file /home/stweil/src/github/UB-Mannheim/digitue-gt/Tue/JfI155/4_cd6fb_default.jpg in /home/stweil/src/github/UB-Mannheim/digitue-gt/Tue/JfI155/4_cd6fb_default.xml  arrow_dataset.py:172
[05/15/23 08:26:33] WARNING  Invalid line 30 in /data/stweil/src/github/UB-Mannheim/digitue-gt/Theo/Die_paepstliche_Unfehlbarkeit/2_7cd63_default.jpg: Line polygon outside of image bounds  arrow_dataset.py:57
[05/15/23 08:30:48] WARNING  Invalid line 41 in /data/stweil/src/github/UB-Mannheim/digitue-gt/Theo/Die_paepstliche_Unfehlbarkeit/4_b74f9_default.jpg: Line polygon outside of image bounds  arrow_dataset.py:57
Extracting lines ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 192229/192274 -:--:-- -:--:--
Output file written to /data/stweil/german_print2.arrow

real    14m27,011s
user    377m0,912s
sys     83m58,846s
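
The `time` figures show how well the 36 workers were used: CPU time (user plus sys) divided by wall-clock time (real) gives the average number of busy cores. The sketch below does this arithmetic for the values above. The helper name `to_seconds` is an assumption for illustration, and the comma is handled as the decimal separator of the locale that produced the log.

```shell
#!/bin/sh
# Convert a `time` value such as "14m27,011s" to seconds.
# The comma is the decimal separator of the locale used in the log.
to_seconds() {
    printf '%s\n' "$1" | tr ',' '.' |
        LC_ALL=C awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

real=$(to_seconds "14m27,011s")
user=$(to_seconds "377m0,912s")
sys=$(to_seconds "83m58,846s")

# Average number of busy CPU cores = CPU time / wall-clock time.
LC_ALL=C awk -v r="$real" -v u="$user" -v s="$sys" \
    'BEGIN { printf "%.1f\n", (u + s) / r }'    # prints 31.9
```

On average about 32 cores were busy for the 36 workers requested.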

Start the training with a learning rate of 0.003. This run does not converge; the validation accuracy stays at 0.0:

ketos train -d cuda:0 -f binary -o german_print -r 0.003 --precision 16 --batch-size 9 /data/stweil/german_print2.arrow -s '[1,120,0,1 Cr3,13,32 Do0.1,2 Mp2,2 Cr3,13,32 Do0.1,2 Mp2,2 Cr3,9,64 Do0.1,2 Mp2,2 Cr3,9,64 Do0.1,2 S1(1x0)1,3 Lbx200 Do0.1,2 Lbx200 Do.1,2 Lbx200 Do]'

scikit-learn version 1.2.2 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
You are using a CUDA device ('NVIDIA RTX A5000') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Name      ┃ Type                     ┃ Params ┃                 In sizes ┃                Out sizes ┃
┡━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 0  │ val_cer   │ CharErrorRate            │      0 │                        ? │                        ? │
│ 1  │ val_wer   │ WordErrorRate            │      0 │                        ? │                        ? │
│ 2  │ net       │ MultiParamSequential     │  4.1 M │  [[1, 1, 120, 400], '?'] │   [[1, 284, 1, 50], '?'] │
│ 3  │ net.C_0   │ ActConv2D                │  1.3 K │  [[1, 1, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 4  │ net.Do_1  │ Dropout                  │      0 │ [[1, 32, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 5  │ net.Mp_2  │ MaxPool                  │      0 │ [[1, 32, 120, 400], '?'] │  [[1, 32, 60, 200], '?'] │
│ 6  │ net.C_3   │ ActConv2D                │ 40.0 K │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 7  │ net.Do_4  │ Dropout                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 8  │ net.Mp_5  │ MaxPool                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 30, 100], '?'] │
│ 9  │ net.C_6   │ ActConv2D                │ 55.4 K │  [[1, 32, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 10 │ net.Do_7  │ Dropout                  │      0 │  [[1, 64, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 11 │ net.Mp_8  │ MaxPool                  │      0 │  [[1, 64, 30, 100], '?'] │   [[1, 64, 15, 50], '?'] │
│ 12 │ net.C_9   │ ActConv2D                │  110 K │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 13 │ net.Do_10 │ Dropout                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 14 │ net.S_11  │ Reshape                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 960, 1, 50], '?'] │
│ 15 │ net.L_12  │ TransposedSummarizingRNN │  1.9 M │   [[1, 960, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 16 │ net.Do_13 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 17 │ net.L_14  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 18 │ net.Do_15 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 19 │ net.L_16  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 20 │ net.Do_17 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 21 │ net.O_18  │ LinSoftmax               │  113 K │   [[1, 400, 1, 50], '?'] │   [[1, 284, 1, 50], '?'] │
└────┴───────────┴──────────────────────────┴────────┴──────────────────────────┴──────────────────────────┘
Trainable params: 4.1 M                                                                                                                                                                                               
Non-trainable params: 0                                                                                                                                                                                               
Total params: 4.1 M                                                                                                                                                                                                   
Total estimated model params size (MB): 16                                                                                                                                                                            
stage 0/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19223/19223 0:31:36 • 0:00:00 9.95it/s val_accuracy: 0.0 val_word_accuracy: 0.0  early_stopping: 0/10 0.00000
stage 1/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19223/19223 0:31:30 • 0:00:00 9.79it/s val_accuracy: 0.0 val_word_accuracy: 0.0  early_stopping: 1/10 0.00000
stage 2/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19223/19223 0:31:49 • 0:00:00 9.67it/s  val_accuracy: 0.0 val_word_accuracy: 0.0  early_stopping: 1/10 0.00000
Validation ━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━ 1458/2136   0:00:52 • 0:00:25 27.65it/s                                           early_stopping: 1/10 0.00000
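
Some sizes in the model summary can be checked by hand against the network specification passed with `-s`. The sketch below is a plausibility check, not kraken code. It recomputes the feature size that enters the first LSTM layer and the parameter count of the output layer, using the usual count for a fully connected layer (weights plus biases). The function name `linear_params` is made up for illustration.

```shell
#!/bin/sh
# Feature size entering the first LSTM layer: the input height of 120 pixels
# is halved by each of the three Mp2,2 layers, and the last convolution has
# 64 output channels.
height=120
for i in 1 2 3; do
    height=$((height / 2))
done
features=$((height * 64))
echo "$features"      # 960, the Reshape output size in the summary

# Parameters of the output layer: one weight per (input, class) pair plus one
# bias per class. The 400 inputs come from the bidirectional LSTM (2 x 200).
linear_params() {
    echo $((400 * $1 + $1))
}
linear_params 284     # 113884, listed as "113 K" for 284 output classes
linear_params 188     # 75388, listed as "75.4 K" for 188 output classes
```

The 188-class case corresponds to the runs on the AustrianNewspapers data further down, whose output layer is smaller because that data set contains fewer distinct characters.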

Compile each data set separately:

(venv3.9) stweil@ocr-02:~/src/github/mittagessen/kraken/Druckschriften$ time ketos compile --format-type xml --files austriannewspapers.shuf --workers 36 -o /data/stweil/austriannewspapers.arrow
Extracting lines ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 59851/59851 -:--:-- -:--:--
Output file written to /data/stweil/austriannewspapers.arrow

real    2m29,184s
user    63m19,140s
sys     14m7,286s
(venv3.9) stweil@ocr-02:~/src/github/mittagessen/kraken/Druckschriften$ time ketos compile --format-type xml --files reichsanzeiger-gt.shuf --workers 36 -o /data/stweil/reichsanzeiger.arrow
Extracting lines ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 119429/119429 -:--:-- -:--:--
Output file written to /data/stweil/reichsanzeiger.arrow

real    11m31,541s
user    253m44,006s
sys     103m54,935s
(venv3.9) stweil@ocr-02:~/src/github/mittagessen/kraken/Druckschriften$ time ketos compile --format-type xml --files digi-gt.shuf --workers 36 -o /data/stweil/digi-gt.arrow
Extracting lines ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  99% 3198/3240 -:--:-- -:--:--
Output file written to /data/stweil/digi-gt.arrow

real    0m31,222s
user    11m33,226s
sys     0m47,420s
(venv3.9) stweil@ocr-02:~/src/github/mittagessen/kraken/Druckschriften$ time ketos compile --format-type xml --files digitue-gt.shuf --workers 36 -o /data/stweil/digitue-gt.arrow
[05/15/23 15:43:07] WARNING  Invalid line 30 in /data/stweil/src/github/UB-Mannheim/digitue-gt/Theo/Die_paepstliche_Unfehlbarkeit/2_7cd63_default.jpg: Line polygon outside of image bounds  arrow_dataset.py:57
[05/15/23 15:43:15] WARNING  Invalid line 41 in /data/stweil/src/github/UB-Mannheim/digitue-gt/Theo/Die_paepstliche_Unfehlbarkeit/4_b74f9_default.jpg: Line polygon outside of image bounds  arrow_dataset.py:57
Extracting lines ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 9876/9883 -:--:-- -:--:--
Output file written to /data/stweil/digitue-gt.arrow

real    0m38,540s
user    19m13,052s
sys     0m36,279s

Train on the AustrianNewspapers data alone, again with a learning rate of 0.003:

(venv3.9) stweil@ocr-02:~/src/github/mittagessen/kraken/Druckschriften$ ketos train -d cuda:0 -f binary -o models/austriannewspapers -r 0.003 --precision 16 --batch-size 9 /data/stweil/austriannewspapers.arrow -s '[1,120,0,1 Cr3,13,32 Do0.1,2 Mp2,2 Cr3,13,32 Do0.1,2 Mp2,2 Cr3,9,64 Do0.1,2 Mp2,2 Cr3,9,64 Do0.1,2 S1(1x0)1,3 Lbx200 Do0.1,2 Lbx200 Do.1,2 Lbx200 Do]'
scikit-learn version 1.2.2 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
You are using a CUDA device ('NVIDIA RTX A5000') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Name      ┃ Type                     ┃ Params ┃                 In sizes ┃                Out sizes ┃
┡━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 0  │ val_cer   │ CharErrorRate            │      0 │                        ? │                        ? │
│ 1  │ val_wer   │ WordErrorRate            │      0 │                        ? │                        ? │
│ 2  │ net       │ MultiParamSequential     │  4.1 M │  [[1, 1, 120, 400], '?'] │   [[1, 188, 1, 50], '?'] │
│ 3  │ net.C_0   │ ActConv2D                │  1.3 K │  [[1, 1, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 4  │ net.Do_1  │ Dropout                  │      0 │ [[1, 32, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 5  │ net.Mp_2  │ MaxPool                  │      0 │ [[1, 32, 120, 400], '?'] │  [[1, 32, 60, 200], '?'] │
│ 6  │ net.C_3   │ ActConv2D                │ 40.0 K │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 7  │ net.Do_4  │ Dropout                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 8  │ net.Mp_5  │ MaxPool                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 30, 100], '?'] │
│ 9  │ net.C_6   │ ActConv2D                │ 55.4 K │  [[1, 32, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 10 │ net.Do_7  │ Dropout                  │      0 │  [[1, 64, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 11 │ net.Mp_8  │ MaxPool                  │      0 │  [[1, 64, 30, 100], '?'] │   [[1, 64, 15, 50], '?'] │
│ 12 │ net.C_9   │ ActConv2D                │  110 K │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 13 │ net.Do_10 │ Dropout                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 14 │ net.S_11  │ Reshape                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 960, 1, 50], '?'] │
│ 15 │ net.L_12  │ TransposedSummarizingRNN │  1.9 M │   [[1, 960, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 16 │ net.Do_13 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 17 │ net.L_14  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 18 │ net.Do_15 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 19 │ net.L_16  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 20 │ net.Do_17 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 21 │ net.O_18  │ LinSoftmax               │ 75.4 K │   [[1, 400, 1, 50], '?'] │   [[1, 188, 1, 50], '?'] │
└────┴───────────┴──────────────────────────┴────────┴──────────────────────────┴──────────────────────────┘
Trainable params: 4.1 M                                                                                                                                                                                               
Non-trainable params: 0                                                                                                                                                                                               
Total params: 4.1 M                                                                                                                                                                                                   
Total estimated model params size (MB): 16                                                                                                                                                                            
stage 0/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5985/5985 0:09:45 • 0:00:00 10.47it/s val_accuracy: 0.017 val_word_accuracy: 0.0  early_stopping: 0/10 0.01676
stage 1/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5985/5985 0:09:44 • 0:00:00 10.21it/s val_accuracy: 0.001 val_word_accuracy: 0.0  early_stopping: 1/10 0.01676
stage 2/∞ ━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 380/5985 0:00:39 • 0:09:38 9.71it/s val_accuracy: 0.001 val_word_accuracy: 0.0  early_stopping: 1/10 0.01676

The accuracy stays close to zero with this learning rate, so repeat the training with a learning rate of 0.0003:

(venv3.9) stweil@ocr-02:~/src/github/mittagessen/kraken/Druckschriften$ ketos train -d cuda:0 -f binary -o models/austriannewspapers -r 0.0003 --precision 16 --batch-size 9 /data/stweil/austriannewspapers.arrow -s '[1,120,0,1 Cr3,13,32 Do0.1,2 Mp2,2 Cr3,13,32 Do0.1,2 Mp2,2 Cr3,9,64 Do0.1,2 Mp2,2 Cr3,9,64 Do0.1,2 S1(1x0)1,3 Lbx200 Do0.1,2 Lbx200 Do.1,2 Lbx200 Do]'
scikit-learn version 1.2.2 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
You are using a CUDA device ('NVIDIA RTX A5000') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Name      ┃ Type                     ┃ Params ┃                 In sizes ┃                Out sizes ┃
┡━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 0  │ val_cer   │ CharErrorRate            │      0 │                        ? │                        ? │
│ 1  │ val_wer   │ WordErrorRate            │      0 │                        ? │                        ? │
│ 2  │ net       │ MultiParamSequential     │  4.1 M │  [[1, 1, 120, 400], '?'] │   [[1, 188, 1, 50], '?'] │
│ 3  │ net.C_0   │ ActConv2D                │  1.3 K │  [[1, 1, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 4  │ net.Do_1  │ Dropout                  │      0 │ [[1, 32, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 5  │ net.Mp_2  │ MaxPool                  │      0 │ [[1, 32, 120, 400], '?'] │  [[1, 32, 60, 200], '?'] │
│ 6  │ net.C_3   │ ActConv2D                │ 40.0 K │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 7  │ net.Do_4  │ Dropout                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 8  │ net.Mp_5  │ MaxPool                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 30, 100], '?'] │
│ 9  │ net.C_6   │ ActConv2D                │ 55.4 K │  [[1, 32, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 10 │ net.Do_7  │ Dropout                  │      0 │  [[1, 64, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 11 │ net.Mp_8  │ MaxPool                  │      0 │  [[1, 64, 30, 100], '?'] │   [[1, 64, 15, 50], '?'] │
│ 12 │ net.C_9   │ ActConv2D                │  110 K │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 13 │ net.Do_10 │ Dropout                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 14 │ net.S_11  │ Reshape                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 960, 1, 50], '?'] │
│ 15 │ net.L_12  │ TransposedSummarizingRNN │  1.9 M │   [[1, 960, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 16 │ net.Do_13 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 17 │ net.L_14  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 18 │ net.Do_15 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 19 │ net.L_16  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 20 │ net.Do_17 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 21 │ net.O_18  │ LinSoftmax               │ 75.4 K │   [[1, 400, 1, 50], '?'] │   [[1, 188, 1, 50], '?'] │
└────┴───────────┴──────────────────────────┴────────┴──────────────────────────┴──────────────────────────┘
Trainable params: 4.1 M                                                                                                                                                                                               
Non-trainable params: 0                                                                                                                                                                                               
Total params: 4.1 M                                                                                                                                                                                                   
Total estimated model params size (MB): 16                                                                                                                                                                            
stage 0/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5985/5985 0:09:55 • 0:00:00 9.81it/s val_accuracy: 0.97 val_word_accuracy: 0.868  early_stopping: 0/10 0.96973
stage 1/∞ ━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━ 1947/5985 0:03:17 • 0:07:01 9.60it/s val_accuracy: 0.97 val_word_accuracy: 0.868  early_stopping: 0/10 0.96973


With the lower learning rate the validation accuracy reaches 0.97 after the first epoch. Train on the complete data set with the same learning rate:

(venv3.9) stweil@ocr-02:~/src/github/mittagessen/kraken/Druckschriften$ ketos train -d cuda:0 -f binary -o models/german_print -r 0.0003 --precision 16 --batch-size 9 /data/stweil/german_print2.arrow -s '[1,120,0,1 Cr3,13,32 Do0.1,2 Mp2,2 Cr3,13,32 Do0.1,2 Mp2,2 Cr3,9,64 Do0.1,2 Mp2,2 Cr3,9,64 Do0.1,2 S1(1x0)1,3 Lbx200 Do0.1,2 Lbx200 Do.1,2 Lbx200 Do]'
scikit-learn version 1.2.2 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
You are using a CUDA device ('NVIDIA RTX A5000') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Name      ┃ Type                     ┃ Params ┃                 In sizes ┃                Out sizes ┃
┡━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 0  │ val_cer   │ CharErrorRate            │      0 │                        ? │                        ? │
│ 1  │ val_wer   │ WordErrorRate            │      0 │                        ? │                        ? │
│ 2  │ net       │ MultiParamSequential     │  4.1 M │  [[1, 1, 120, 400], '?'] │   [[1, 284, 1, 50], '?'] │
│ 3  │ net.C_0   │ ActConv2D                │  1.3 K │  [[1, 1, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 4  │ net.Do_1  │ Dropout                  │      0 │ [[1, 32, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 5  │ net.Mp_2  │ MaxPool                  │      0 │ [[1, 32, 120, 400], '?'] │  [[1, 32, 60, 200], '?'] │
│ 6  │ net.C_3   │ ActConv2D                │ 40.0 K │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 7  │ net.Do_4  │ Dropout                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 8  │ net.Mp_5  │ MaxPool                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 30, 100], '?'] │
│ 9  │ net.C_6   │ ActConv2D                │ 55.4 K │  [[1, 32, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 10 │ net.Do_7  │ Dropout                  │      0 │  [[1, 64, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 11 │ net.Mp_8  │ MaxPool                  │      0 │  [[1, 64, 30, 100], '?'] │   [[1, 64, 15, 50], '?'] │
│ 12 │ net.C_9   │ ActConv2D                │  110 K │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 13 │ net.Do_10 │ Dropout                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 14 │ net.S_11  │ Reshape                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 960, 1, 50], '?'] │
│ 15 │ net.L_12  │ TransposedSummarizingRNN │  1.9 M │   [[1, 960, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 16 │ net.Do_13 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 17 │ net.L_14  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 18 │ net.Do_15 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 19 │ net.L_16  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 20 │ net.Do_17 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 21 │ net.O_18  │ LinSoftmax               │  113 K │   [[1, 400, 1, 50], '?'] │   [[1, 284, 1, 50], '?'] │
└────┴───────────┴──────────────────────────┴────────┴──────────────────────────┴──────────────────────────┘
Trainable params: 4.1 M                                                                                                                                                                                               
Non-trainable params: 0                                                                                                                                                                                               
Total params: 4.1 M                                                                                                                                                                                                   
Total estimated model params size (MB): 16                                                                                                                                                                            
stage 0/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19223/19223 0:31:35 • 0:00:00 9.71it/s val_accuracy: 0.976 val_word_accuracy: 0.908  early_stopping: 0/10 0.97623
stage 1/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19223/19223 0:32:35 • 0:00:00 9.80it/s val_accuracy: 0.982 val_word_accuracy: 0.935  early_stopping: 0/10 0.98221
stage 2/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19223/19223 0:32:18 • 0:00:00 9.83it/s val_accuracy: 0.984 val_word_accuracy: 0.944  early_stopping: 0/10 0.98397
stage 3/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19223/19223 0:32:26 • 0:00:00 9.80it/s val_accuracy: 0.985 val_word_accuracy: 0.949  early_stopping: 0/10 0.98510
stage 4/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19223/19223 0:32:21 • 0:00:00 9.66it/s val_accuracy: 0.986 val_word_accuracy: 0.952  early_stopping: 0/10 0.98584
stage 5/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19223/19223 0:32:14 • 0:00:00 10.21it/s val_accuracy: 0.985 val_word_accuracy: 0.951  early_stopping: 1/10 0.98584
stage 6/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19223/19223 0:32:15 • 0:00:00 9.64it/s val_accuracy: 0.986 val_word_accuracy: 0.955  early_stopping: 0/10 0.98622
stage 7/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19223/19223 0:32:00 • 0:00:00 9.91it/s val_accuracy: 0.986 val_word_accuracy: 0.955  early_stopping: 0/10 0.98626
stage 8/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19223/19223 0:32:31 • 0:00:00 9.85it/s val_accuracy: 0.986 val_word_accuracy: 0.955  early_stopping: 1/10 0.98626
stage 9/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19223/19223 0:31:18 • 0:00:00 10.04it/s val_accuracy: 0.986 val_word_accuracy: 0.955  early_stopping: 2/10 0.98626
stage 10/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19223/19223 0:32:21 • 0:00:00 9.53it/s  val_accuracy: 0.986 val_word_accuracy: 0.955  early_stopping: 2/10 0.98626
Validation ━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 238/2136    0:00:09 • 0:01:16 25.23it/s                                               early_stopping: 2/10 0.98626
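
To pick the best epoch from a saved log, the stage lines can be parsed. The sketch below is not part of kraken and rests on two assumptions drawn from the output above: stage lines start with `stage N/`, and the last number on each line is the best validation accuracy seen so far (it stays at 0.98584 in stage 5 although `val_accuracy` drops to 0.985). The function name `best_stage` and the log file name are hypothetical.

```shell
#!/bin/sh
# Print the first stage at which the best-so-far validation accuracy reaches
# its maximum, followed by that accuracy.
# Assumption: the last field of each stage line is the best accuracy so far.
best_stage() {
    LC_ALL=C awk '
        /^stage [0-9]+\// && $NF + 0 > best {
            best = $NF + 0
            split($2, s, "/")      # "7/..." -> stage number 7
            stage = s[1]
        }
        END { if (stage != "") print stage, best }
    ' "$1"
}
```

For the final run above, stored for example in `train.log`, `best_stage train.log` should report stage 7 with 0.98626, provided the assumption about the last field holds.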
