Training German Handwriting

Stefan Weil edited this page Jan 21, 2024 · 14 revisions

Training of kraken model for German handwriting

german_handwriting is a model which was trained for the recognition of handwritten (German) texts.

The first training used four ground truth data sets. The second training added another data set, and the third training added one more. All three trainings were based on the model digitue_best, a kraken model for printed text.

The latest model file german_handwriting.mlmodel for Kraken OCR is available from Zenodo:
Weil, S. (2023). HTR model for German manuscripts trained from several datasets. Zenodo. https://doi.org/10.5281/zenodo.7933463

Data from all training processes including intermediate results can be found at https://ub-backup.bib.uni-mannheim.de/~stweil/tesstrain/kraken/german_handwriting/.

Ground Truth

1st Training

Konsilien 1659-1665. Tübingen. http://doi.org/10.20345/digitue.23865

Tobias Grüning, Gundram Leifert, Johannes Michael, Tobias Strauß, Max Weidemann, Roger Labahn. (2016). read_dataset_german_konzilsprotokolle [Data set]. Zenodo. http://doi.org/10.5281/zenodo.215383

Sánchez, Joan Andreu, Romero, Verónica, Toselli, Alejandro H., & Vidal, Enrique. (2016). READ dataset Bozen [Data set]. Zenodo. https://doi.org/10.5281/zenodo.218236

Hodel, Tobias, Schoch, David, & Dängeli, Peter. (2021). Handwritten Text Recognition Ground Truth Set: StABS Ratsbücher O10, Urfehdenbuch X (1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.5153263

2nd Training

Same as the 1st training, plus:

https://github.com/ubtue/Ground-Truth/

3rd Training

Same as the 2nd training, plus data exported from Transkribus:

Kurrentschrift from the validation set of Staatsarchiv Zürich Regierungsratsbeschlüsse (HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX)

Preparing data for training

wget -m https://zenodo.org/record/215383/files/german_konzilsprotokolle.tar.gz
tar xzf zenodo.org/record/215383/files/german_konzilsprotokolle.tar.gz
(
cd gt/215383/...
for dir in Copy_of_*; do (cd "$dir" && ln -sv page/*.xml .); done
)

wget -m https://zenodo.org/record/218236/files/PublicData.tgz
tar xzf zenodo.org/record/218236/files/PublicData.tgz

wget -m https://zenodo.org/record/5153263/files/StABS_Ratsbuch_O_10.zip
unzip zenodo.org/record/5153263/files/StABS_Ratsbuch_O_10.zip 
(
cd gt/5153263/StABS_Ratsbuch_O_10/page
ln -sv ../*.jpg ../*.png .
for img in *.jpg; do out=$(echo "$img" | sed 's/Rats.*_0*//'); mv -v "$img" "$out"; done
)
ls gt/215383/german_konzilsprotokolle/data/Greifswald_Alvermann/Copy_of_*/0*.xml >>list.train
ls gt/218236/PublicData/*/page/*xml >>list.train 
ls gt/5153263/StABS_Ratsbuch_O_10/page/*.xml >>list.train 
ls digitue/*/*xml >> list.train

shuf list.train > list1.train
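
Because list.train is built by appending (`>>`), running the commands above twice would list every file twice, and a mistyped glob would silently leave files out. The following is a minimal sketch of a sanity check, not part of the original workflow; the function name check_list is made up for illustration.

```shell
#!/bin/sh
# check_list FILE: report duplicate and missing entries in a list of paths.
# Prints one "duplicate: PATH" or "missing: PATH" line per problem and
# returns non-zero if any problem was found.
check_list() {
    list=$1
    status=0
    # Paths that occur more than once (uniq -d prints each such path once).
    dups=$(sort "$list" | uniq -d)
    if [ -n "$dups" ]; then
        printf '%s\n' "$dups" | sed 's/^/duplicate: /'
        status=1
    fi
    # Paths that do not exist as regular files (symlinks are followed).
    while IFS= read -r path; do
        [ -n "$path" ] || continue
        if [ ! -f "$path" ]; then
            printf 'missing: %s\n' "$path"
            status=1
        fi
    done < "$list"
    return "$status"
}

# Usage (assumes list.train exists in the current directory):
# check_list list.train && shuf list.train > list1.train
```

Chaining the check with `&&` means the shuffled list is only written when the input list is clean.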

Training with small dataset (Konzilsprotokolle)

ketos train -d cuda:0 --workers 4 -f xml Handschriften/gt/215383/german_konzilsprotokolle/data/Greifswald_Alvermann/Copy_of_*/0*.xml

This took about 24 minutes per epoch.

Pretraining with large dataset

(venv3.9) stweil@ocr-02:~/src/github/mittagessen/kraken/Handschriften$ ketos pretrain -d cuda:0 -f page -t list.shuf.train -o pretrain/german_handwriting
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃    ┃ Name                   ┃ Type                     ┃ Params ┃
┡━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ 0  │ net                    │ MultiParamSequential     │  4.0 M │
│ 1  │ net.C_0                │ ActConv2D                │  1.3 K │
│ 2  │ net.Do_1               │ Dropout                  │      0 │
│ 3  │ net.Mp_2               │ MaxPool                  │      0 │
│ 4  │ net.C_3                │ ActConv2D                │ 40.0 K │
│ 5  │ net.Do_4               │ Dropout                  │      0 │
│ 6  │ net.Mp_5               │ MaxPool                  │      0 │
│ 7  │ net.C_6                │ ActConv2D                │ 55.4 K │
│ 8  │ net.Do_7               │ Dropout                  │      0 │
│ 9  │ net.Mp_8               │ MaxPool                  │      0 │
│ 10 │ net.C_9                │ ActConv2D                │  110 K │
│ 11 │ net.Do_10              │ Dropout                  │      0 │
│ 12 │ net.S_11               │ Reshape                  │      0 │
│ 13 │ net.L_12               │ TransposedSummarizingRNN │  1.9 M │
│ 14 │ net.Do_13              │ Dropout                  │      0 │
│ 15 │ net.L_14               │ TransposedSummarizingRNN │  963 K │
│ 16 │ net.Do_15              │ Dropout                  │      0 │
│ 17 │ net.L_16               │ TransposedSummarizingRNN │  963 K │
│ 18 │ net.Do_17              │ Dropout                  │      0 │
│ 19 │ features               │ MultiParamSequential     │  207 K │
│ 20 │ wav2vec2mask           │ Wav2Vec2Mask             │  388 K │
│ 21 │ wav2vec2mask.mask_emb  │ Embedding                │  3.8 K │
│ 22 │ wav2vec2mask.project_q │ Linear                   │  384 K │
│ 23 │ encoder                │ MultiParamSequential     │  3.8 M │
└────┴────────────────────────┴──────────────────────────┴────────┘
Trainable params: 4.4 M                                                                                                                                                                                               
Non-trainable params: 0                                                                                                                                                                                               
Total params: 4.4 M                                                                                                                                                                                                   
Total estimated model params size (MB): 17                                                                                                                                                                            
Validation Sanity Check ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/66 -:--:-- 0:00:00   

Training with large dataset

ketos train -d cuda:0 -f xml -i /home/stweil/.config/kraken/digitue_best.mlmodel -t list.shuf.train -o 202211261525/german_handwriting --resize add -r 0.0001

                    WARNING  Text line "" is empty after transformations                                                                                                                                  train.py:361
[...]
                    WARNING  Text line "" is empty after transformations                                                                                                                                  train.py:361
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Name      ┃ Type                     ┃ Params ┃                 In sizes ┃                Out sizes ┃
┡━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 0  │ net       │ MultiParamSequential     │  4.1 M │  [[1, 1, 120, 400], '?'] │   [[1, 279, 1, 50], '?'] │
│ 1  │ net.C_0   │ ActConv2D                │  1.3 K │  [[1, 1, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 2  │ net.Do_1  │ Dropout                  │      0 │ [[1, 32, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 3  │ net.Mp_2  │ MaxPool                  │      0 │ [[1, 32, 120, 400], '?'] │  [[1, 32, 60, 200], '?'] │
│ 4  │ net.C_3   │ ActConv2D                │ 40.0 K │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 5  │ net.Do_4  │ Dropout                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 6  │ net.Mp_5  │ MaxPool                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 30, 100], '?'] │
│ 7  │ net.C_6   │ ActConv2D                │ 55.4 K │  [[1, 32, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 8  │ net.Do_7  │ Dropout                  │      0 │  [[1, 64, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 9  │ net.Mp_8  │ MaxPool                  │      0 │  [[1, 64, 30, 100], '?'] │   [[1, 64, 15, 50], '?'] │
│ 10 │ net.C_9   │ ActConv2D                │  110 K │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 11 │ net.Do_10 │ Dropout                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 12 │ net.S_11  │ Reshape                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 960, 1, 50], '?'] │
│ 13 │ net.L_12  │ TransposedSummarizingRNN │  1.9 M │   [[1, 960, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 14 │ net.Do_13 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 15 │ net.L_14  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 16 │ net.Do_15 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 17 │ net.L_16  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 18 │ net.Do_17 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 19 │ net.O_18  │ LinSoftmax               │  111 K │   [[1, 400, 1, 50], '?'] │   [[1, 279, 1, 50], '?'] │
└────┴───────────┴──────────────────────────┴────────┴──────────────────────────┴──────────────────────────┘
Trainable params: 4.1 M                                                                                                                                                                                               
Non-trainable params: 0                                                                                                                                                                                               
Total params: 4.1 M                                                                                                                                                                                                   
Total estimated model params size (MB): 16                                                                                                                                                                            
stage 0/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:30:54 val_accuracy: 0.67250  early_stopping: 0/5 0.67250
stage 1/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:30:29 val_accuracy: 0.77690  early_stopping: 0/5 0.77690
stage 2/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:30:37 val_accuracy: 0.82208  early_stopping: 0/5 0.82208
stage 3/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:31:23 val_accuracy: 0.84496  early_stopping: 0/5 0.84496
stage 4/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:32:29 val_accuracy: 0.86469  early_stopping: 0/5 0.86469
stage 5/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:31:10 val_accuracy: 0.87568  early_stopping: 0/5 0.87568
stage 6/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:31:30 val_accuracy: 0.88534  early_stopping: 0/5 0.88534
stage 7/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:31:01 val_accuracy: 0.89373  early_stopping: 0/5 0.89373
stage 8/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:30:58 val_accuracy: 0.89676  early_stopping: 0/5 0.89676
stage 9/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:27:53 val_accuracy: 0.90261  early_stopping: 0/5 0.90261
stage 10/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:14 val_accuracy: 0.90775  early_stopping: 0/5 0.90775
stage 11/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:10 val_accuracy: 0.90983  early_stopping: 0/5 0.90983
stage 12/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:22 val_accuracy: 0.91347  early_stopping: 0/5 0.91347
stage 13/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:57 val_accuracy: 0.91408  early_stopping: 0/5 0.91408
stage 14/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:11 val_accuracy: 0.91815  early_stopping: 0/5 0.91815
stage 15/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:22:18 val_accuracy: 0.91959  early_stopping: 0/5 0.91959
stage 16/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:47 val_accuracy: 0.92043  early_stopping: 0/5 0.92043
stage 17/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:28 val_accuracy: 0.92369  early_stopping: 0/5 0.92369
stage 18/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:58 val_accuracy: 0.92322  early_stopping: 1/5 0.92369
stage 19/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:21:10 val_accuracy: 0.92586  early_stopping: 0/5 0.92586
stage 20/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:12 val_accuracy: 0.76328  early_stopping: 1/5 0.92586
stage 21/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:32 val_accuracy: 0.92818  early_stopping: 0/5 0.92818
stage 22/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:05 val_accuracy: 0.92855  early_stopping: 0/5 0.92855
stage 23/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:57 val_accuracy: 0.92989  early_stopping: 0/5 0.92989
stage 24/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:08 val_accuracy: 0.93016  early_stopping: 0/5 0.93016
stage 25/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:04 val_accuracy: 0.93197  early_stopping: 0/5 0.93197
stage 26/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:22:27 val_accuracy: 0.93215  early_stopping: 0/5 0.93215
stage 27/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:59 val_accuracy: 0.93361  early_stopping: 0/5 0.93361
stage 28/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:20:37 val_accuracy: 0.93408  early_stopping: 0/5 0.93408
stage 29/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:35 val_accuracy: 0.93470  early_stopping: 0/5 0.93470
stage 30/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:24 val_accuracy: 0.93382  early_stopping: 1/5 0.93470
stage 31/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:22:34 val_accuracy: 0.93630  early_stopping: 0/5 0.93630
stage 32/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:10 val_accuracy: 0.92988  early_stopping: 1/5 0.93630
stage 33/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:50 val_accuracy: 0.93543  early_stopping: 2/5 0.93630
stage 34/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:06 val_accuracy: 0.81099  early_stopping: 3/5 0.93630
stage 35/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:22:24 val_accuracy: 0.93757  early_stopping: 0/5 0.93757
stage 36/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:32:17 val_accuracy: 0.93726  early_stopping: 1/5 0.93757
stage 37/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:43 val_accuracy: 0.93821  early_stopping: 0/5 0.93821
stage 38/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:05 val_accuracy: 0.93877  early_stopping: 0/5 0.93877
stage 39/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:22:21 val_accuracy: 0.93811  early_stopping: 1/5 0.93877
stage 40/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:15 val_accuracy: 0.93799  early_stopping: 2/5 0.93877
stage 41/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 5:54:09 val_accuracy: 0.93705  early_stopping: 3/5 0.93877
stage 42/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:33 val_accuracy: 0.93911  early_stopping: 0/5 0.93911
stage 43/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 5:54:42 val_accuracy: 0.93913  early_stopping: 0/5 0.93913
stage 44/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 4:15:59 val_accuracy: 0.93914  early_stopping: 0/5 0.93914
stage 45/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 7:06:10 val_accuracy: 0.94090  early_stopping: 0/5 0.94090
stage 46/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 7:35:41 val_accuracy: 0.93956  early_stopping: 1/5 0.94090
stage 47/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 4:49:43 val_accuracy: 0.94088  early_stopping: 2/5 0.94090
stage 48/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:21:49 val_accuracy: 0.94045  early_stopping: 3/5 0.94090
stage 49/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 4:40:36 val_accuracy: 0.94002  early_stopping: 4/5 0.94090
stage 50/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 7:16:09 val_accuracy: 0.93965  early_stopping: 5/5 0.94090
Moving best model 202211261525/german_handwriting_45.mlmodel (0.940900444984436) to 202211261525/german_handwriting_best.mlmodel

real    11685m35,360s
user    20753m55,042s
sys     27866m56,116s
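
The `time` output above is given in minutes and uses a decimal comma, which makes the total hard to judge at a glance. As a rough aid, the sketch below converts such a value to seconds and derives the total duration and the average time per epoch. The function name to_seconds is hypothetical, and the epoch count of 51 is taken from stages 0 to 50 in the log.

```shell
#!/bin/sh
# to_seconds VALUE: convert a `time` value such as "11685m35,360s"
# to whole seconds (the fractional part is dropped).
to_seconds() {
    minutes=${1%%m*}          # text before the "m"
    rest=${1#*m}              # text after the "m", e.g. "35,360s"
    seconds=${rest%%[,.]*}    # integer seconds before the decimal comma or point
    echo $(( minutes * 60 + seconds ))
}

real=$(to_seconds "11685m35,360s")
epochs=51                     # stages 0..50 in the log above
echo "total:     $(( real / 3600 )) h (about $(( real / 86400 )) days)"
echo "per epoch: $(( real / epochs / 60 )) min"
# prints:
# total:     194 h (about 8 days)
# per epoch: 229 min
```

The average of about 229 minutes per epoch is higher than the roughly 3 h 24 min shown for most stages because a few late stages took considerably longer.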

Training 2023-05-11

The training data was extended with all ground truth (GT) from https://github.com/ubtue/Ground-Truth/.

The first attempt, which used the PAGE XML files directly for training, required more than 6 hours per epoch:

Trainable params: 4.1 M                                                         
Non-trainable params: 0                                                         
Total params: 4.1 M                                                             
Total estimated model params size (MB): 16                                      
stage 0/∞ ━━━━━━━━━━━ 48609/48609 6:26:43 • 0:00:00 2.11it/s val_accuracy: 0.664 val_word_accuracy: 0.262  early_stopping: 0/10 0.66353

Therefore a second attempt used a precompiled binary dataset and different arguments for ketos, among them `--precision 16` and `--batch-size 4`. This reduced the time per epoch to about 13 minutes:

(venv3.9) stweil@ocr-02:~/src/github/mittagessen/kraken/Handschriften$ time ketos compile --format-type xml --files list1.train --workers 8 -o list.arrow
Extracting lines ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  87% 54011/62100 -:--:-- -:--:--
Output file written to list.arrow

real    78m59,455s
user    975m39,725s
sys     1435m0,902s
(venv3.9) stweil@ocr-02:~/src/github/mittagessen/kraken/Handschriften$ OMP_NUM_THREADS=1 ketos train -d cuda:0 -f binary -i /home/stweil/.config/kraken/digitue_best.mlmodel -o german_handwriting --resize add -r 0.002 --precision 16 --batch-size 4 --warmup 1 --freeze-backbone 1 list.arrow
scikit-learn version 1.2.2 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
You are using a CUDA device ('NVIDIA RTX A5000') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
[05/11/23 08:25:20] WARNING  Neural network has been trained on mode 1 images, training set contains mode L data. Consider setting `force_binarization`                                         train.py:588
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Name      ┃ Type                     ┃ Params ┃                 In sizes ┃                Out sizes ┃
┡━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 0  │ val_cer   │ CharErrorRate            │      0 │                        ? │                        ? │
│ 1  │ val_wer   │ WordErrorRate            │      0 │                        ? │                        ? │
│ 2  │ net       │ MultiParamSequential     │  4.1 M │  [[1, 1, 120, 400], '?'] │   [[1, 292, 1, 50], '?'] │
│ 3  │ net.C_0   │ ActConv2D                │  1.3 K │  [[1, 1, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 4  │ net.Do_1  │ Dropout                  │      0 │ [[1, 32, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 5  │ net.Mp_2  │ MaxPool                  │      0 │ [[1, 32, 120, 400], '?'] │  [[1, 32, 60, 200], '?'] │
│ 6  │ net.C_3   │ ActConv2D                │ 40.0 K │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 7  │ net.Do_4  │ Dropout                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 8  │ net.Mp_5  │ MaxPool                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 30, 100], '?'] │
│ 9  │ net.C_6   │ ActConv2D                │ 55.4 K │  [[1, 32, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 10 │ net.Do_7  │ Dropout                  │      0 │  [[1, 64, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 11 │ net.Mp_8  │ MaxPool                  │      0 │  [[1, 64, 30, 100], '?'] │   [[1, 64, 15, 50], '?'] │
│ 12 │ net.C_9   │ ActConv2D                │  110 K │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 13 │ net.Do_10 │ Dropout                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 14 │ net.S_11  │ Reshape                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 960, 1, 50], '?'] │
│ 15 │ net.L_12  │ TransposedSummarizingRNN │  1.9 M │   [[1, 960, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 16 │ net.Do_13 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 17 │ net.L_14  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 18 │ net.Do_15 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 19 │ net.L_16  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 20 │ net.Do_17 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 21 │ net.O_18  │ LinSoftmax               │  117 K │   [[1, 400, 1, 50], '?'] │   [[1, 292, 1, 50], '?'] │
└────┴───────────┴──────────────────────────┴────────┴──────────────────────────┴──────────────────────────┘
Trainable params: 4.1 M                                                                                                                                                                                     
Non-trainable params: 0                                                                                                                                                                                     
Total params: 4.1 M                                                                                                                                                                                         
Total estimated model params size (MB): 16
stage 0/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:42 • 0:00:00 16.39it/s val_accuracy: 0.664 val_word_accuracy: 0.252  early_stopping: 0/10 0.66354
stage 1/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:35 • 0:00:00 15.02it/s val_accuracy: 0.723 val_word_accuracy: 0.371  early_stopping: 0/10 0.72260
stage 2/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:42 • 0:00:00 16.03it/s val_accuracy: 0.754 val_word_accuracy: 0.429  early_stopping: 0/10 0.75389
stage 3/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:38 • 0:00:00 15.40it/s val_accuracy: 0.768 val_word_accuracy: 0.47  early_stopping: 0/10 0.76781
stage 4/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:40 • 0:00:00 16.11it/s val_accuracy: 0.776 val_word_accuracy: 0.495  early_stopping: 0/10 0.77639
stage 5/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:47 • 0:00:00 15.50it/s val_accuracy: 0.784 val_word_accuracy: 0.516  early_stopping: 0/10 0.78355
stage 6/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:38 • 0:00:00 16.00it/s val_accuracy: 0.793 val_word_accuracy: 0.537  early_stopping: 0/10 0.79340
stage 7/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:41 • 0:00:00 16.51it/s val_accuracy: 0.789 val_word_accuracy: 0.531  early_stopping: 1/10 0.79340
stage 8/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:49 • 0:00:00 15.34it/s val_accuracy: 0.812 val_word_accuracy: 0.565  early_stopping: 0/10 0.81226
stage 9/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:41 • 0:00:00 15.52it/s val_accuracy: 0.824 val_word_accuracy: 0.554  early_stopping: 0/10 0.82411
stage 10/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:52 • 0:00:00 16.54it/s val_accuracy: 0.811 val_word_accuracy: 0.581  early_stopping: 1/10 0.82411
stage 11/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:44 • 0:00:00 16.58it/s val_accuracy: 0.819 val_word_accuracy: 0.574  early_stopping: 2/10 0.82411
stage 12/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:53 • 0:00:00 16.47it/s val_accuracy: 0.813 val_word_accuracy: 0.591  early_stopping: 3/10 0.82411
stage 13/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:48 • 0:00:00 16.03it/s val_accuracy: 0.828 val_word_accuracy: 0.596  early_stopping: 0/10 0.82815
stage 14/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:47 • 0:00:00 16.28it/s val_accuracy: 0.824 val_word_accuracy: 0.603  early_stopping: 1/10 0.82815
stage 15/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:47 • 0:00:00 15.72it/s val_accuracy: 0.848 val_word_accuracy: 0.607  early_stopping: 0/10 0.84771
stage 16/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:49 • 0:00:00 15.50it/s val_accuracy: 0.834 val_word_accuracy: 0.612  early_stopping: 1/10 0.84771
stage 17/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:43 • 0:00:00 17.27it/s val_accuracy: 0.842 val_word_accuracy: 0.618  early_stopping: 2/10 0.84771
stage 18/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:44 • 0:00:00 15.52it/s val_accuracy: 0.842 val_word_accuracy: 0.624  early_stopping: 3/10 0.84771
stage 19/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:46 • 0:00:00 16.22it/s val_accuracy: 0.845 val_word_accuracy: 0.625  early_stopping: 4/10 0.84771
stage 20/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:47 • 0:00:00 15.82it/s val_accuracy: 0.826 val_word_accuracy: 0.633  early_stopping: 5/10 0.84771
stage 21/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:52 • 0:00:00 15.96it/s val_accuracy: 0.852 val_word_accuracy: 0.571  early_stopping: 0/10 0.85250
stage 22/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:48 • 0:00:00 16.32it/s val_accuracy: 0.847 val_word_accuracy: 0.638  early_stopping: 1/10 0.85250
stage 23/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:36 • 0:00:00 15.20it/s val_accuracy: 0.845 val_word_accuracy: 0.641  early_stopping: 2/10 0.85250
stage 24/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:30 • 0:00:00 15.79it/s val_accuracy: 0.85 val_word_accuracy: 0.648  early_stopping: 3/10 0.85250
stage 25/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:36 • 0:00:00 16.58it/s val_accuracy: 0.861 val_word_accuracy: 0.636  early_stopping: 0/10 0.86129
stage 26/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:29 • 0:00:00 15.81it/s val_accuracy: 0.849 val_word_accuracy: 0.651  early_stopping: 1/10 0.86129
stage 27/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:36 • 0:00:00 16.41it/s val_accuracy: 0.855 val_word_accuracy: 0.653  early_stopping: 2/10 0.86129
stage 28/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:42 • 0:00:00 16.58it/s val_accuracy: 0.854 val_word_accuracy: 0.652  early_stopping: 3/10 0.86129
stage 29/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:53 • 0:00:00 15.97it/s val_accuracy: 0.853 val_word_accuracy: 0.654  early_stopping: 4/10 0.86129
stage 30/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:52 • 0:00:00 16.24it/s val_accuracy: 0.858 val_word_accuracy: 0.65  early_stopping: 5/10 0.86129
stage 31/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:49 • 0:00:00 15.59it/s val_accuracy: 0.854 val_word_accuracy: 0.661  early_stopping: 6/10 0.86129
stage 32/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:55 • 0:00:00 15.91it/s val_accuracy: 0.863 val_word_accuracy: 0.657  early_stopping: 0/10 0.86270
stage 33/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:42 • 0:00:00 15.82it/s val_accuracy: 0.86 val_word_accuracy: 0.657  early_stopping: 1/10 0.86270
stage 34/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:42 • 0:00:00 16.12it/s val_accuracy: 0.859 val_word_accuracy: 0.653  early_stopping: 2/10 0.86270
stage 35/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:12 • 0:00:00 16.35it/s val_accuracy: 0.854 val_word_accuracy: 0.667  early_stopping: 3/10 0.86270
stage 36/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:05 • 0:00:00 16.84it/s val_accuracy: 0.853 val_word_accuracy: 0.662  early_stopping: 4/10 0.86270
stage 37/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:04 • 0:00:00 16.74it/s val_accuracy: 0.85 val_word_accuracy: 0.665  early_stopping: 5/10 0.86270
stage 38/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:05 • 0:00:00 16.57it/s val_accuracy: 0.849 val_word_accuracy: 0.669  early_stopping: 6/10 0.86270
stage 39/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:09 • 0:00:00 16.65it/s val_accuracy: 0.868 val_word_accuracy: 0.606  early_stopping: 0/10 0.86793
stage 40/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:05 • 0:00:00 16.34it/s val_accuracy: 0.87 val_word_accuracy: 0.611  early_stopping: 0/10 0.86977
stage 41/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:15 • 0:00:00 16.60it/s val_accuracy: 0.854 val_word_accuracy: 0.67  early_stopping: 1/10 0.86977
stage 42/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:07 • 0:00:00 16.27it/s val_accuracy: 0.869 val_word_accuracy: 0.635  early_stopping: 2/10 0.86977
stage 43/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:06 • 0:00:00 17.22it/s val_accuracy: 0.859 val_word_accuracy: 0.671  early_stopping: 3/10 0.86977
stage 44/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:10 • 0:00:00 16.87it/s val_accuracy: 0.854 val_word_accuracy: 0.671  early_stopping: 4/10 0.86977
stage 45/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:04 • 0:00:00 16.78it/s val_accuracy: 0.866 val_word_accuracy: 0.628  early_stopping: 5/10 0.86977
stage 46/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:06 • 0:00:00 17.11it/s val_accuracy: 0.867 val_word_accuracy: 0.666  early_stopping: 6/10 0.86977
stage 47/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:25 • 0:00:00 15.82it/s val_accuracy: 0.87 val_word_accuracy: 0.603  early_stopping: 7/10 0.86977
stage 48/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:51 • 0:00:00 15.67it/s val_accuracy: 0.871 val_word_accuracy: 0.618  early_stopping: 0/10 0.87095
stage 49/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:40 • 0:00:00 16.49it/s val_accuracy: 0.871 val_word_accuracy: 0.61  early_stopping: 1/10 0.87095
stage 50/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:39 • 0:00:00 15.91it/s val_accuracy: 0.872 val_word_accuracy: 0.611  early_stopping: 0/10 0.87216
stage 51/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:34 • 0:00:00 15.60it/s val_accuracy: 0.872 val_word_accuracy: 0.614  early_stopping: 0/10 0.87234
stage 52/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:43 • 0:00:00 15.51it/s val_accuracy: 0.872 val_word_accuracy: 0.61  early_stopping: 1/10 0.87234
stage 53/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:45 • 0:00:00 16.39it/s val_accuracy: 0.8 val_word_accuracy: 0.544  early_stopping: 2/10 0.87234
stage 54/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:51 • 0:00:00 14.91it/s val_accuracy: 0.868 val_word_accuracy: 0.678  early_stopping: 3/10 0.87234
stage 55/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:33 • 0:00:00 16.22it/s val_accuracy: 0.867 val_word_accuracy: 0.64  early_stopping: 4/10 0.87234
stage 56/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:42 • 0:00:00 15.85it/s val_accuracy: 0.853 val_word_accuracy: 0.678  early_stopping: 5/10 0.87234
stage 57/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:12:28 • 0:00:00 15.93it/s val_accuracy: 0.868 val_word_accuracy: 0.679  early_stopping: 6/10 0.87234
stage 58/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:14:02 • 0:00:00 13.34it/s val_accuracy: 0.866 val_word_accuracy: 0.664  early_stopping: 7/10 0.87234
stage 59/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:15:49 • 0:00:00 12.17it/s val_accuracy: 0.869 val_word_accuracy: 0.675  early_stopping: 8/10 0.87234
stage 60/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:15:45 • 0:00:00 13.02it/s val_accuracy: 0.871 val_word_accuracy: 0.682  early_stopping: 9/10 0.87234
stage 61/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12153/12153 0:15:41 • 0:00:00 13.10it/s val_accuracy: 0.871 val_word_accuracy: 0.668  early_stopping: 10/10 0.87234
Moving best model german_handwriting_51.mlmodel (0.8723417520523071) to german_handwriting_best.mlmodel
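
The log above stops once the early-stopping counter reaches 10/10, and the model from stage 51 is kept as the best one. A minimal sketch for pulling the per-stage metrics out of a saved copy of such a log follows. It assumes the progress lines have exactly the format shown above. The names `STAGE_RE`, `parse_stages` and `best_stage` are illustrative and not part of kraken. It also assumes that the counter is reset to 0 whenever a new best `val_accuracy` is reached, which matches the log but is not confirmed against kraken's source here.

```python
import re

# Matches progress lines such as:
# stage 51/∞ ━━━ 12153/12153 0:12:34 • 0:00:00 15.60it/s val_accuracy: 0.872 val_word_accuracy: 0.614  early_stopping: 0/10 0.87234
STAGE_RE = re.compile(
    r"stage (\d+)/\S+ .*?val_accuracy: ([\d.]+) val_word_accuracy: ([\d.]+)"
    r"\s+early_stopping: (\d+)/(\d+) ([\d.]+)"
)


def parse_stages(log_text):
    """Return one dict per 'stage N/∞' progress line found in log_text."""
    rows = []
    for line in log_text.splitlines():
        m = STAGE_RE.search(line)
        if m:
            rows.append({
                "stage": int(m.group(1)),
                "val_accuracy": float(m.group(2)),
                "val_word_accuracy": float(m.group(3)),
                "patience": int(m.group(4)),      # N in "early_stopping: N/10"
                "best_so_far": float(m.group(6)),  # value printed after the counter
            })
    return rows


def best_stage(rows):
    """Return the last stage whose early-stopping counter was reset to 0.

    Assumption: a reset to 0 marks a new best validation accuracy.
    """
    resets = [r for r in rows if r["patience"] == 0]
    return resets[-1] if resets else None
```

Applied to the log above, the last reset is at stage 51, which agrees with the file name `german_handwriting_51.mlmodel` in the final log line.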

Training 2023-05-12

This training added ground truth (GT) for Kurrentschrift, taken from the validation set of the Staatsarchiv Zürich Regierungsratsbeschlüsse (HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX).
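
While compiling this data set, ketos reports a number of lines as "Line polygon outside of image bounds" (see the log below). Such lines can be listed beforehand with a short script. The following is only a sketch under stated assumptions: the ground truth is PAGE XML in which the `Page` element carries `imageWidth` and `imageHeight` and every `TextLine` has a `Coords` child with a `points="x,y x,y ..."` attribute holding integer coordinates. The exact bounds rule used by kraken is not shown here, so the check (any point with a negative coordinate or beyond the image width or height) is an assumption, and the helper name `out_of_bounds_lines` is hypothetical. The reported index is the position of the `TextLine` within the page and need not match the line number in kraken's warning.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace so the check works across PAGE schema versions."""
    return tag.rsplit("}", 1)[-1]


def out_of_bounds_lines(page_xml):
    """Return (index, id) for each TextLine with a polygon point outside the image.

    page_xml is the content of one PAGE XML file (bytes, or str without an
    encoding declaration).
    """
    root = ET.fromstring(page_xml)
    bad = []
    for page in (e for e in root.iter() if local(e.tag) == "Page"):
        width = int(page.get("imageWidth"))
        height = int(page.get("imageHeight"))
        lines = [e for e in page.iter() if local(e.tag) == "TextLine"]
        for index, line in enumerate(lines):
            # Only the TextLine's own Coords child, not those of words or glyphs.
            coords = next((c for c in line if local(c.tag) == "Coords"), None)
            if coords is None:
                continue
            for pair in coords.get("points", "").split():
                x, y = (int(v) for v in pair.split(","))
                # Assumed rule: a point is invalid if it lies outside the image.
                if x < 0 or y < 0 or x > width or y > height:
                    bad.append((index, line.get("id")))
                    break
    return bad
```

Lines found this way can be corrected or removed in the ground truth before running `ketos compile`; as the log shows, ketos otherwise only warns about them.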

(venv3.9) stweil@ocr-02:~/src/github/mittagessen/kraken/Handschriften$ time ketos compile --format-type xml --files list-20230511.shuf --workers 8 -o /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/list-20230511.arrow
[05/11/23 17:21:17] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000390_1.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:21:18] WARNING  Invalid line 5 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000390_1.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
                    WARNING  Invalid line 6 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000390_1.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:21:20] WARNING  Invalid line 10 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000390_1.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 17:21:23] WARNING  Invalid line 18 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000390_1.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 17:21:45] WARNING  Invalid line 8 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000199_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:23:29] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000354_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
                    WARNING  Invalid line 1 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000354_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:30:54] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000147_1.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:34:16] WARNING  Invalid line 9 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000226_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
                    WARNING  Invalid line 10 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000226_2.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 17:35:06] WARNING  Invalid line 8 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000149_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:35:49] WARNING  Invalid line 40 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000164_2.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 17:36:09] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000145_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:37:09] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000184_1.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:37:56] WARNING  Invalid line 55 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000257_1.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
                    WARNING  Invalid line 56 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000257_1.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 17:38:02] WARNING  Invalid line 9 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000055_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:39:27] WARNING  Invalid line 2 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000230_1.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:39:30] WARNING  Invalid line 9 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000230_1.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:43:17] WARNING  Invalid line 7 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000220_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:43:18] WARNING  Invalid line 9 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000220_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:43:34] WARNING  Invalid line 23 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000220_2.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 17:44:19] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000053_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:44:54] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000139_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:48:16] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000505_1.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:49:20] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000023_1.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:50:43] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000373_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
                    WARNING  Invalid line 1 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000373_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
                    WARNING  Invalid line 2 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000373_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
                    WARNING  Invalid line 3 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000373_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
                    WARNING  Invalid line 4 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000373_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:50:44] WARNING  Invalid line 13 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000373_2.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
                    WARNING  Invalid line 14 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000373_2.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 17:55:07] WARNING  Invalid line 20 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000307_2.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 17:55:22] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000277_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:57:01] WARNING  Invalid line 2 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000239_1.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:57:50] WARNING  Invalid line 1 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000185_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:57:54] WARNING  Invalid line 13 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000380_2.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 17:58:34] WARNING  Invalid line 5 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000448_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:59:14] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000319_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
                    WARNING  Invalid line 1 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000319_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:59:15] WARNING  Invalid line 6 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000319_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
                    WARNING  Invalid line 7 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000319_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 17:59:18] WARNING  Invalid line 21 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000319_2.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 18:06:20] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000330_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
                    WARNING  Invalid line 2 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000330_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
                    WARNING  Invalid line 3 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000330_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
                    WARNING  Invalid line 5 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000330_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 18:07:21] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000072_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 18:09:59] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000111_1.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 18:15:15] WARNING  Invalid line 49 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000197_1.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
                    WARNING  Invalid line 50 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000197_1.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 18:20:21] WARNING  Invalid line 1 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000454_1.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 18:26:28] WARNING  Invalid line 1 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000304_1.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 18:26:46] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000175_1.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 18:28:55] WARNING  Invalid line 25 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000443_1.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 18:30:09] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000505_2.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 18:31:07] WARNING  Invalid line 54 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000367_1.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
                    WARNING  Invalid line 55 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000367_1.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
                    WARNING  Invalid line 56 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000367_1.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 18:31:46] WARNING  Invalid line 47 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000162_1.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 18:32:37] WARNING  Invalid line 15 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/Dose_167-569.jpg: Line polygon outside of image bounds                           arrow_dataset.py:57
[05/11/23 18:33:55] WARNING  Invalid line 16 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000030_2.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 18:34:38] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000400_1.jpg: Line polygon outside of image bounds                                arrow_dataset.py:57
[05/11/23 18:35:37] WARNING  Invalid line 10 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000316_2.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 18:35:44] WARNING  Invalid line 15 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000316_2.jpg: Line polygon outside of image bounds                               arrow_dataset.py:57
[05/11/23 18:39:47] WARNING  Invalid line 5 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000068_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 18:42:45] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/2_000244_2.jpg: Line polygon outside of image bounds                                                                  arrow_dataset.py:57
[05/11/23 18:43:42] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000043_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 18:46:37] WARNING  Invalid line 4 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000258_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
                    WARNING  Invalid line 6 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000258_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 18:47:35] WARNING  Invalid line 11 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000046_1.jpg: Line polygon outside of image bounds                                                                   arrow_dataset.py:57
[05/11/23 18:50:00] WARNING  Invalid line 28 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000010_2.jpg: Line polygon outside of image bounds                                                                   arrow_dataset.py:57
[05/11/23 18:50:22] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000259_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 18:52:02] WARNING  Invalid line 4 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/Dose_167-168.jpg: Line polygon outside of image bounds                                                                arrow_dataset.py:57
                    WARNING  Invalid line 5 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/Dose_167-168.jpg: Line polygon outside of image bounds                                                                arrow_dataset.py:57
[05/11/23 18:53:12] WARNING  Invalid line 22 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000274_2.jpg: Line polygon outside of image bounds                                                                   arrow_dataset.py:57
[05/11/23 18:56:58] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000192_1.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:00:45] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000298_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:02:01] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000321_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:02:26] WARNING  Invalid line 47 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000321_2.jpg: Line polygon outside of image bounds                                                                   arrow_dataset.py:57
[05/11/23 19:05:29] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000040_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:09:08] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000262_1.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:11:19] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000282_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:13:12] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000439_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
                    WARNING  Invalid line 1 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000439_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:14:53] WARNING  Invalid line 30 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000146_1.jpg: Line polygon outside of image bounds                                                                   arrow_dataset.py:57
[05/11/23 19:16:40] WARNING  Invalid line 17 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000289_2.jpg: Line polygon outside of image bounds                                                                   arrow_dataset.py:57
[05/11/23 19:16:53] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000430_1.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:17:41] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000202_1.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
                    WARNING  Invalid line 2 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000202_1.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
                    WARNING  Invalid line 3 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000202_1.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:17:44] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000209_1.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:20:25] WARNING  Invalid line 1 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000188_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:27:27] WARNING  Invalid line 16 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000348_2.jpg: Line polygon outside of image bounds                                                                   arrow_dataset.py:57
                    WARNING  Invalid line 17 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000348_2.jpg: Line polygon outside of image bounds                                                                   arrow_dataset.py:57
[05/11/23 19:27:30] WARNING  Invalid line 34 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/2_000036_2.jpg: Line polygon outside of image bounds                                                                 arrow_dataset.py:57
[05/11/23 19:29:40] WARNING  Invalid line 43 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000274_1.jpg: Line polygon outside of image bounds                                                                   arrow_dataset.py:57
[05/11/23 19:31:27] WARNING  Invalid line 2 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000165_1.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:37:08] WARNING  Invalid line 3 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000062_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:37:59] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000421_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:38:16] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000405_1.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:38:29] WARNING  Invalid line 8 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/1_000353_1.jpg: Line polygon outside of image bounds                                                                  arrow_dataset.py:57
[05/11/23 19:39:35] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000414_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:49:12] WARNING  Invalid line 34 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000035_1.jpg: Line polygon outside of image bounds                                                                   arrow_dataset.py:57
[05/11/23 19:50:20] WARNING  Invalid line 5 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000013_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:52:41] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000052_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
                    WARNING  Invalid line 1 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000052_2.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
[05/11/23 19:53:16] WARNING  Invalid line 20 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000190_2.jpg: Line polygon outside of image bounds                                                                   arrow_dataset.py:57
[05/11/23 19:56:40] WARNING  Invalid line 0 in /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/page/000330_1.jpg: Line polygon outside of image bounds                                                                    arrow_dataset.py:57
Extracting lines ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  83% 99248/120244 -:--:-- -:--:--
Output file written to /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/list-20230511.arrow

real    164m10,150s
user    1292m16,227s
sys     11m45,785s
(venv3.9) stweil@ocr-02:~/src/github/mittagessen/kraken/Handschriften$ export OMP_NUM_THREADS=1
(venv3.9) stweil@ocr-02:~/src/github/mittagessen/kraken/Handschriften$ mkdir 20230512
(venv3.9) stweil@ocr-02:~/src/github/mittagessen/kraken/Handschriften$ time ketos train -d cuda:0 -f binary -i /home/stweil/.config/kraken/digitue_best.mlmodel -o 20230512/german_handwriting --resize add -r 0.0002 --precision 16 --batch-size 4 /data/stweil/HTR_Validation_Set_StAZH_RRB_German_Kurrent_XIX/list-20230511.arrow
scikit-learn version 1.2.2 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
You are using a CUDA device ('NVIDIA RTX A5000') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
[05/12/23 07:00:11] WARNING  Neural network has been trained on mode 1 images, training set contains mode L data. Consider setting `force_binarization`                                                   train.py:588
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Name      ┃ Type                     ┃ Params ┃                 In sizes ┃                Out sizes ┃
┡━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 0  │ val_cer   │ CharErrorRate            │      0 │                        ? │                        ? │
│ 1  │ val_wer   │ WordErrorRate            │      0 │                        ? │                        ? │
│ 2  │ net       │ MultiParamSequential     │  4.1 M │  [[1, 1, 120, 400], '?'] │   [[1, 295, 1, 50], '?'] │
│ 3  │ net.C_0   │ ActConv2D                │  1.3 K │  [[1, 1, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 4  │ net.Do_1  │ Dropout                  │      0 │ [[1, 32, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 5  │ net.Mp_2  │ MaxPool                  │      0 │ [[1, 32, 120, 400], '?'] │  [[1, 32, 60, 200], '?'] │
│ 6  │ net.C_3   │ ActConv2D                │ 40.0 K │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 7  │ net.Do_4  │ Dropout                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 8  │ net.Mp_5  │ MaxPool                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 30, 100], '?'] │
│ 9  │ net.C_6   │ ActConv2D                │ 55.4 K │  [[1, 32, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 10 │ net.Do_7  │ Dropout                  │      0 │  [[1, 64, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 11 │ net.Mp_8  │ MaxPool                  │      0 │  [[1, 64, 30, 100], '?'] │   [[1, 64, 15, 50], '?'] │
│ 12 │ net.C_9   │ ActConv2D                │  110 K │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 13 │ net.Do_10 │ Dropout                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 14 │ net.S_11  │ Reshape                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 960, 1, 50], '?'] │
│ 15 │ net.L_12  │ TransposedSummarizingRNN │  1.9 M │   [[1, 960, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 16 │ net.Do_13 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 17 │ net.L_14  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 18 │ net.Do_15 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 19 │ net.L_16  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 20 │ net.Do_17 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 21 │ net.O_18  │ LinSoftmax               │  118 K │   [[1, 400, 1, 50], '?'] │   [[1, 295, 1, 50], '?'] │
└────┴───────────┴──────────────────────────┴────────┴──────────────────────────┴──────────────────────────┘
Trainable params: 4.1 M
Non-trainable params: 0
Total params: 4.1 M
Total estimated model params size (MB): 16
stage 0/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:14 • 0:00:00 13.24it/s val_accuracy: 0.759 val_word_accuracy: 0.375  early_stopping: 0/10 0.75863
stage 1/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:13 • 0:00:00 13.83it/s val_accuracy: 0.81 val_word_accuracy: 0.491  early_stopping: 0/10 0.81021
stage 2/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:07 • 0:00:00 13.69it/s val_accuracy: 0.84 val_word_accuracy: 0.555  early_stopping: 0/10 0.84000
stage 3/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:26:51 • 0:00:00 13.65it/s val_accuracy: 0.848 val_word_accuracy: 0.595  early_stopping: 0/10 0.84794
stage 4/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:13 • 0:00:00 13.58it/s val_accuracy: 0.868 val_word_accuracy: 0.604  early_stopping: 0/10 0.86768
stage 5/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:26:21 • 0:00:00 13.40it/s val_accuracy: 0.875 val_word_accuracy: 0.638  early_stopping: 0/10 0.87478
stage 6/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:13 • 0:00:00 13.93it/s val_accuracy: 0.864 val_word_accuracy: 0.642  early_stopping: 1/10 0.87478
stage 7/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:02 • 0:00:00 13.55it/s val_accuracy: 0.873 val_word_accuracy: 0.658  early_stopping: 2/10 0.87478
stage 8/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:24:51 • 0:00:00 13.47it/s val_accuracy: 0.875 val_word_accuracy: 0.673  early_stopping: 0/10 0.87515
stage 9/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:07 • 0:00:00 13.39it/s val_accuracy: 0.872 val_word_accuracy: 0.666  early_stopping: 1/10 0.87515
stage 10/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:12 • 0:00:00 13.45it/s val_accuracy: 0.876 val_word_accuracy: 0.677  early_stopping: 0/10 0.87589
stage 11/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:09 • 0:00:00 13.54it/s val_accuracy: 0.893 val_word_accuracy: 0.696  early_stopping: 0/10 0.89255
stage 12/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:10 • 0:00:00 13.40it/s val_accuracy: 0.885 val_word_accuracy: 0.703  early_stopping: 1/10 0.89255
stage 13/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:15 • 0:00:00 13.73it/s val_accuracy: 0.896 val_word_accuracy: 0.704  early_stopping: 0/10 0.89574
stage 14/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:01 • 0:00:00 13.63it/s val_accuracy: 0.898 val_word_accuracy: 0.714  early_stopping: 0/10 0.89777
stage 15/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:11 • 0:00:00 13.57it/s val_accuracy: 0.891 val_word_accuracy: 0.713  early_stopping: 1/10 0.89777
stage 16/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:07 • 0:00:00 12.98it/s val_accuracy: 0.895 val_word_accuracy: 0.723  early_stopping: 2/10 0.89777
stage 17/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:26:55 • 0:00:00 14.07it/s val_accuracy: 0.894 val_word_accuracy: 0.723  early_stopping: 3/10 0.89777
stage 18/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:26:46 • 0:00:00 13.64it/s val_accuracy: 0.898 val_word_accuracy: 0.729  early_stopping: 0/10 0.89831
stage 19/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:07 • 0:00:00 13.76it/s val_accuracy: 0.894 val_word_accuracy: 0.732  early_stopping: 1/10 0.89831
stage 20/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:26:48 • 0:00:00 13.65it/s val_accuracy: 0.902 val_word_accuracy: 0.736  early_stopping: 0/10 0.90213
stage 21/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:27:03 • 0:00:00 13.39it/s val_accuracy: 0.898 val_word_accuracy: 0.733  early_stopping: 1/10 0.90213
stage 22/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:26:59 • 0:00:00 13.65it/s val_accuracy: 0.901 val_word_accuracy: 0.738  early_stopping: 2/10 0.90213
stage 23/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:25:56 • 0:00:00 13.40it/s val_accuracy: 0.904 val_word_accuracy: 0.737  early_stopping: 0/10 0.90373
stage 24/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:25:18 • 0:00:00 15.17it/s val_accuracy: 0.903 val_word_accuracy: 0.731  early_stopping: 1/10 0.90373
stage 25/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:24:53 • 0:00:00 15.22it/s val_accuracy: 0.902 val_word_accuracy: 0.734  early_stopping: 2/10 0.90373
stage 26/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:25:13 • 0:00:00 13.65it/s val_accuracy: 0.897 val_word_accuracy: 0.739  early_stopping: 3/10 0.90373
stage 27/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:25:48 • 0:00:00 15.96it/s val_accuracy: 0.898 val_word_accuracy: 0.742  early_stopping: 4/10 0.90373
stage 28/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:25:13 • 0:00:00 14.02it/s val_accuracy: 0.895 val_word_accuracy: 0.746  early_stopping: 5/10 0.90373
stage 29/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:25:15 • 0:00:00 14.13it/s val_accuracy: 0.9 val_word_accuracy: 0.748  early_stopping: 6/10 0.90373
stage 30/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:25:23 • 0:00:00 13.67it/s val_accuracy: 0.889 val_word_accuracy: 0.725  early_stopping: 7/10 0.90373
stage 31/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:24:45 • 0:00:00 13.76it/s val_accuracy: 0.896 val_word_accuracy: 0.749  early_stopping: 8/10 0.90373
stage 32/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:24:46 • 0:00:00 14.34it/s val_accuracy: 0.903 val_word_accuracy: 0.751  early_stopping: 9/10 0.90373
stage 33/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22331/22331 0:25:09 • 0:00:00 15.40it/s val_accuracy: 0.902 val_word_accuracy: 0.755  early_stopping: 10/10 0.90373
Moving best model 20230512/german_handwriting_23.mlmodel (0.9037277102470398) to 20230512/german_handwriting_best.mlmodel

real    939m21,370s
user    1393m24,389s
sys     45m19,526s
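
The stage lines above show how the early stopping counter behaves: it resets to 0 whenever `val_accuracy` exceeds the best value so far (the last number on each line), and training ends when it reaches 10/10. The following is a minimal sketch for reading that information back out of such a log. The regular expression, the function names `parse_stages` and `summarize`, and the dictionary keys are illustrative assumptions, not part of kraken; the sample lines are taken from the log above with progress bars and timings trimmed.

```python
import re

# Excerpt of the stage lines above; progress bars and timings are trimmed.
LOG = """\
stage 22/∞ val_accuracy: 0.901 val_word_accuracy: 0.738  early_stopping: 2/10 0.90213
stage 23/∞ val_accuracy: 0.904 val_word_accuracy: 0.737  early_stopping: 0/10 0.90373
stage 24/∞ val_accuracy: 0.903 val_word_accuracy: 0.731  early_stopping: 1/10 0.90373
stage 33/∞ val_accuracy: 0.902 val_word_accuracy: 0.755  early_stopping: 10/10 0.90373
"""

# Hypothetical pattern for one stage line; it also matches untrimmed lines,
# because ".*?" skips the progress bar and timing columns.
STAGE_RE = re.compile(
    r"stage (\d+)/∞.*?val_accuracy: ([\d.]+) val_word_accuracy: ([\d.]+)"
    r"\s+early_stopping: (\d+)/(\d+) ([\d.]+)"
)


def parse_stages(text):
    """Return one dict per stage line found in the log text."""
    stages = []
    for m in STAGE_RE.finditer(text):
        stages.append({
            "stage": int(m.group(1)),
            "val_accuracy": float(m.group(2)),
            "val_word_accuracy": float(m.group(3)),
            "stale": int(m.group(4)),      # stages without improvement
            "patience": int(m.group(5)),   # limit of the counter
            "best": float(m.group(6)),     # best val_accuracy so far
        })
    return stages


def summarize(stages):
    """Find the last stage that set a new best value and check for early stop."""
    improved = [s for s in stages if s["stale"] == 0]
    if not improved:
        raise ValueError("no stage with a reset early stopping counter found")
    last = stages[-1]
    return {
        "best_stage": improved[-1]["stage"],
        "best_val_accuracy": improved[-1]["best"],
        "stopped_early": last["stale"] >= last["patience"],
    }


summary = summarize(parse_stages(LOG))
print(summary)
```

For this log the sketch reports stage 23 as the best stage, which matches the file name `german_handwriting_23.mlmodel` in the "Moving best model" line.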

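The parameter counts in the model summary above can be cross-checked with a little arithmetic. The sketch below is based on assumptions: the kernel sizes (3×13 and 3×9) and the 200 hidden units per LSTM direction are not printed in the log. They are taken from what is believed to be kraken's default recognition architecture and are used here only because they reproduce the numbers in the table. The channel counts, the feature sizes 960 and 400, and the output size 295 come from the table itself. The helper function names are illustrative.

```python
def conv2d_params(in_ch, out_ch, kh, kw):
    """Weights plus one bias per output channel of a 2D convolution."""
    return in_ch * out_ch * kh * kw + out_ch


def bilstm_params(in_size, hidden):
    """Bidirectional LSTM with four gates and two bias vectors per gate
    (the layout used by PyTorch's nn.LSTM)."""
    per_direction = 4 * (in_size * hidden + hidden * hidden + 2 * hidden)
    return 2 * per_direction


def linear_params(in_size, out_size):
    """Fully connected layer with bias."""
    return in_size * out_size + out_size


# Kernel sizes and hidden size are assumptions (see the text above).
layers = {
    "net.C_0": conv2d_params(1, 32, 3, 13),    # table: 1.3 K
    "net.C_3": conv2d_params(32, 32, 3, 13),   # table: 40.0 K
    "net.C_6": conv2d_params(32, 64, 3, 9),    # table: 55.4 K
    "net.C_9": conv2d_params(64, 64, 3, 9),    # table: 110 K
    "net.L_12": bilstm_params(960, 200),       # table: 1.9 M
    "net.L_14": bilstm_params(400, 200),       # table: 963 K
    "net.L_16": bilstm_params(400, 200),       # table: 963 K
    "net.O_18": linear_params(400, 295),       # table: 118 K
}

total = sum(layers.values())
for name, count in layers.items():
    print(f"{name:9s} {count:>9,d}")
print(f"total     {total:>9,d}")                      # table: 4.1 M
print(f"size (MB) {total * 4 / 1e6:.1f}")             # 32-bit floats; table: 16
```

Under these assumptions the total is 4,111,159 parameters, or about 16.4 MB at 4 bytes per parameter, consistent with the 4.1 M and 16 MB reported above.
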
Training – plans for the future

The trainings above were based on the kraken model digitue_best. Future trainings could use the newer model german_print. In addition, more or improved ground truth data sets should be used for the training. Synthetic data, which is free of transcription errors, might also help.
