
Training German Handwriting

Stefan Weil edited this page Dec 5, 2022 · 14 revisions

Training a kraken model for German handwriting

Ground Truth

Tobias Grüning, Gundram Leifert, Johannes Michael, Tobias Strauß, Max Weidemann, Roger Labahn. (2016). read_dataset_german_konzilsprotokolle [Data set]. Zenodo. http://doi.org/10.5281/zenodo.215383

Sánchez, Joan Andreu, Romero, Verónica, Toselli, Alejandro H., & Vidal, Enrique. (2016). READ dataset Bozen [Data set]. Zenodo. https://doi.org/10.5281/zenodo.218236

Hodel, Tobias, Schoch, David, & Dängeli, Peter. (2021). Handwritten Text Recognition Ground Truth Set: StABS Ratsbücher O10, Urfehdenbuch X (1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.5153263

Preparing data for training

wget -m https://zenodo.org/record/215383/files/german_konzilsprotokolle.tar.gz
tar xzf zenodo.org/record/215383/files/german_konzilsprotokolle.tar.gz
(
cd gt/215383/...
for dir in Copy_of_*; do (cd "$dir" && ln -sv page/*.xml .); done
)

wget -m https://zenodo.org/record/218236/files/PublicData.tgz
tar xzf zenodo.org/record/218236/files/PublicData.tgz

wget -m https://zenodo.org/record/5153263/files/StABS_Ratsbuch_O_10.zip
unzip zenodo.org/record/5153263/files/StABS_Ratsbuch_O_10.zip 
(
cd gt/5153263/StABS_Ratsbuch_O_10/page
ln -sv ../*.jpg ../*.png .
for img in *.jpg; do out=$(echo "$img" | sed 's/Rats.*_0*//'); mv -v "$img" "$out"; done
)

ls gt/215383/german_konzilsprotokolle/data/Greifswald_Alvermann/Copy_of_*/0*.xml >>list.train
ls gt/218236/PublicData/*/page/*xml >>list.train 
ls gt/5153263/StABS_Ratsbuch_O_10/page/*.xml >>list.train 
ls digitue/*/*xml >> list.train

shuf list.train >list1.train
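
The `ls ... >>list.train` commands above append to the list, so running them again adds every path a second time. A minimal sketch of a sanity check for the list, assuming a POSIX shell; `check_list` is a hypothetical helper name and not part of kraken:

```shell
# check_list LISTFILE
# Prints every listed path that is not a regular file, then a summary line.
# Returns non-zero if any path is missing or listed more than once.
check_list() {
    list=$1
    missing=0
    # The "|| [ -n ... ]" part also handles a last line without a trailing newline.
    while IFS= read -r path || [ -n "$path" ]; do
        [ -z "$path" ] && continue                  # ignore blank lines
        if [ ! -f "$path" ]; then
            printf 'missing: %s\n' "$path"
            missing=$((missing + 1))
        fi
    done < "$list"
    total=$(grep -c . "$list" || true)              # number of non-empty lines
    dups=$(sort "$list" | uniq -d | grep -c . || true)  # distinct paths listed more than once
    printf '%s entries, %s missing, %s duplicated\n' "$total" "$missing" "$dups"
    [ "$missing" -eq 0 ] && [ "$dups" -eq 0 ]
}
```

Run `check_list list.train` before shuffling. If it reports duplicates, `sort -u list.train` produces a list without them.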

Training with small dataset (Konzilsprotokolle)

ketos train -d cuda:0 --workers 4 -f xml Handschriften/gt/215383/german_konzilsprotokolle/data/Greifswald_Alvermann/Copy_of_*/0*.xml

About 24 minutes per epoch.

Pretraining with large dataset

(venv3.9) stweil@ocr-02:~/src/github/mittagessen/kraken/Handschriften$ ketos pretrain -d cuda:0 -f page -t list.shuf.train -o pretrain/german_handwriting
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃    ┃ Name                   ┃ Type                     ┃ Params ┃
┡━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ 0  │ net                    │ MultiParamSequential     │  4.0 M │
│ 1  │ net.C_0                │ ActConv2D                │  1.3 K │
│ 2  │ net.Do_1               │ Dropout                  │      0 │
│ 3  │ net.Mp_2               │ MaxPool                  │      0 │
│ 4  │ net.C_3                │ ActConv2D                │ 40.0 K │
│ 5  │ net.Do_4               │ Dropout                  │      0 │
│ 6  │ net.Mp_5               │ MaxPool                  │      0 │
│ 7  │ net.C_6                │ ActConv2D                │ 55.4 K │
│ 8  │ net.Do_7               │ Dropout                  │      0 │
│ 9  │ net.Mp_8               │ MaxPool                  │      0 │
│ 10 │ net.C_9                │ ActConv2D                │  110 K │
│ 11 │ net.Do_10              │ Dropout                  │      0 │
│ 12 │ net.S_11               │ Reshape                  │      0 │
│ 13 │ net.L_12               │ TransposedSummarizingRNN │  1.9 M │
│ 14 │ net.Do_13              │ Dropout                  │      0 │
│ 15 │ net.L_14               │ TransposedSummarizingRNN │  963 K │
│ 16 │ net.Do_15              │ Dropout                  │      0 │
│ 17 │ net.L_16               │ TransposedSummarizingRNN │  963 K │
│ 18 │ net.Do_17              │ Dropout                  │      0 │
│ 19 │ features               │ MultiParamSequential     │  207 K │
│ 20 │ wav2vec2mask           │ Wav2Vec2Mask             │  388 K │
│ 21 │ wav2vec2mask.mask_emb  │ Embedding                │  3.8 K │
│ 22 │ wav2vec2mask.project_q │ Linear                   │  384 K │
│ 23 │ encoder                │ MultiParamSequential     │  3.8 M │
└────┴────────────────────────┴──────────────────────────┴────────┘
Trainable params: 4.4 M
Non-trainable params: 0
Total params: 4.4 M
Total estimated model params size (MB): 17
Validation Sanity Check ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/66 -:--:-- 0:00:00   

Training with large dataset

ketos train -d cuda:0 -f xml -i /home/stweil/.config/kraken/digitue_best.mlmodel -t list.shuf.train -o 202211261525/german_handwriting --resize add -r 0.0001

WARNING  Text line "" is empty after transformations  train.py:361
[...]
WARNING  Text line "" is empty after transformations  train.py:361
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Name      ┃ Type                     ┃ Params ┃                 In sizes ┃                Out sizes ┃
┡━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 0  │ net       │ MultiParamSequential     │  4.1 M │  [[1, 1, 120, 400], '?'] │   [[1, 279, 1, 50], '?'] │
│ 1  │ net.C_0   │ ActConv2D                │  1.3 K │  [[1, 1, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 2  │ net.Do_1  │ Dropout                  │      0 │ [[1, 32, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 3  │ net.Mp_2  │ MaxPool                  │      0 │ [[1, 32, 120, 400], '?'] │  [[1, 32, 60, 200], '?'] │
│ 4  │ net.C_3   │ ActConv2D                │ 40.0 K │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 5  │ net.Do_4  │ Dropout                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 60, 200], '?'] │
│ 6  │ net.Mp_5  │ MaxPool                  │      0 │  [[1, 32, 60, 200], '?'] │  [[1, 32, 30, 100], '?'] │
│ 7  │ net.C_6   │ ActConv2D                │ 55.4 K │  [[1, 32, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 8  │ net.Do_7  │ Dropout                  │      0 │  [[1, 64, 30, 100], '?'] │  [[1, 64, 30, 100], '?'] │
│ 9  │ net.Mp_8  │ MaxPool                  │      0 │  [[1, 64, 30, 100], '?'] │   [[1, 64, 15, 50], '?'] │
│ 10 │ net.C_9   │ ActConv2D                │  110 K │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 11 │ net.Do_10 │ Dropout                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 64, 15, 50], '?'] │
│ 12 │ net.S_11  │ Reshape                  │      0 │   [[1, 64, 15, 50], '?'] │   [[1, 960, 1, 50], '?'] │
│ 13 │ net.L_12  │ TransposedSummarizingRNN │  1.9 M │   [[1, 960, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 14 │ net.Do_13 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 15 │ net.L_14  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 16 │ net.Do_15 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 17 │ net.L_16  │ TransposedSummarizingRNN │  963 K │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 18 │ net.Do_17 │ Dropout                  │      0 │   [[1, 400, 1, 50], '?'] │   [[1, 400, 1, 50], '?'] │
│ 19 │ net.O_18  │ LinSoftmax               │  111 K │   [[1, 400, 1, 50], '?'] │   [[1, 279, 1, 50], '?'] │
└────┴───────────┴──────────────────────────┴────────┴──────────────────────────┴──────────────────────────┘
Trainable params: 4.1 M
Non-trainable params: 0
Total params: 4.1 M
Total estimated model params size (MB): 16
stage 0/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:30:54 val_accuracy: 0.67250  early_stopping: 0/5 0.67250
stage 1/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:30:29 val_accuracy: 0.77690  early_stopping: 0/5 0.77690
stage 2/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:30:37 val_accuracy: 0.82208  early_stopping: 0/5 0.82208
stage 3/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:31:23 val_accuracy: 0.84496  early_stopping: 0/5 0.84496
stage 4/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:32:29 val_accuracy: 0.86469  early_stopping: 0/5 0.86469
stage 5/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:31:10 val_accuracy: 0.87568  early_stopping: 0/5 0.87568
stage 6/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:31:30 val_accuracy: 0.88534  early_stopping: 0/5 0.88534
stage 7/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:31:01 val_accuracy: 0.89373  early_stopping: 0/5 0.89373
stage 8/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:30:58 val_accuracy: 0.89676  early_stopping: 0/5 0.89676
stage 9/∞  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:27:53 val_accuracy: 0.90261  early_stopping: 0/5 0.90261
stage 10/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:14 val_accuracy: 0.90775  early_stopping: 0/5 0.90775
stage 11/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:10 val_accuracy: 0.90983  early_stopping: 0/5 0.90983
stage 12/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:22 val_accuracy: 0.91347  early_stopping: 0/5 0.91347
stage 13/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:57 val_accuracy: 0.91408  early_stopping: 0/5 0.91408
stage 14/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:11 val_accuracy: 0.91815  early_stopping: 0/5 0.91815
stage 15/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:22:18 val_accuracy: 0.91959  early_stopping: 0/5 0.91959
stage 16/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:47 val_accuracy: 0.92043  early_stopping: 0/5 0.92043
stage 17/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:28 val_accuracy: 0.92369  early_stopping: 0/5 0.92369
stage 18/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:58 val_accuracy: 0.92322  early_stopping: 1/5 0.92369
stage 19/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:21:10 val_accuracy: 0.92586  early_stopping: 0/5 0.92586
stage 20/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:12 val_accuracy: 0.76328  early_stopping: 1/5 0.92586
stage 21/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:32 val_accuracy: 0.92818  early_stopping: 0/5 0.92818
stage 22/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:05 val_accuracy: 0.92855  early_stopping: 0/5 0.92855
stage 23/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:57 val_accuracy: 0.92989  early_stopping: 0/5 0.92989
stage 24/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:08 val_accuracy: 0.93016  early_stopping: 0/5 0.93016
stage 25/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:04 val_accuracy: 0.93197  early_stopping: 0/5 0.93197
stage 26/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:22:27 val_accuracy: 0.93215  early_stopping: 0/5 0.93215
stage 27/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:59 val_accuracy: 0.93361  early_stopping: 0/5 0.93361
stage 28/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:20:37 val_accuracy: 0.93408  early_stopping: 0/5 0.93408
stage 29/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:35 val_accuracy: 0.93470  early_stopping: 0/5 0.93470
stage 30/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:24 val_accuracy: 0.93382  early_stopping: 1/5 0.93470
stage 31/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:22:34 val_accuracy: 0.93630  early_stopping: 0/5 0.93630
stage 32/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:10 val_accuracy: 0.92988  early_stopping: 1/5 0.93630
stage 33/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:50 val_accuracy: 0.93543  early_stopping: 2/5 0.93630
stage 34/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:06 val_accuracy: 0.81099  early_stopping: 3/5 0.93630
stage 35/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:22:24 val_accuracy: 0.93757  early_stopping: 0/5 0.93757
stage 36/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:32:17 val_accuracy: 0.93726  early_stopping: 1/5 0.93757
stage 37/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:43 val_accuracy: 0.93821  early_stopping: 0/5 0.93821
stage 38/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:24:05 val_accuracy: 0.93877  early_stopping: 0/5 0.93877
stage 39/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:22:21 val_accuracy: 0.93811  early_stopping: 1/5 0.93877
stage 40/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:15 val_accuracy: 0.93799  early_stopping: 2/5 0.93877
stage 41/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 5:54:09 val_accuracy: 0.93705  early_stopping: 3/5 0.93877
stage 42/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:23:33 val_accuracy: 0.93911  early_stopping: 0/5 0.93911
stage 43/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 5:54:42 val_accuracy: 0.93913  early_stopping: 0/5 0.93913
stage 44/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 4:15:59 val_accuracy: 0.93914  early_stopping: 0/5 0.93914
stage 45/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 7:06:10 val_accuracy: 0.94090  early_stopping: 0/5 0.94090
stage 46/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 7:35:41 val_accuracy: 0.93956  early_stopping: 1/5 0.94090
stage 47/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 4:49:43 val_accuracy: 0.94088  early_stopping: 2/5 0.94090
stage 48/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 3:21:49 val_accuracy: 0.94045  early_stopping: 3/5 0.94090
stage 49/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 4:40:36 val_accuracy: 0.94002  early_stopping: 4/5 0.94090
stage 50/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33797/33797 0:00:00 7:16:09 val_accuracy: 0.93965  early_stopping: 5/5 0.94090
Moving best model 202211261525/german_handwriting_45.mlmodel (0.940900444984436) to 202211261525/german_handwriting_best.mlmodel

real    11685m35,360s
user    20753m55,042s
sys     27866m56,116s
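
The run above can be summarised from its saved output. A minimal sketch, assuming the console output was captured to a plain-text file with one `stage` line per epoch in the format shown above; `best_stage` and `fmt_minutes` are illustrative helper names, not kraken commands:

```shell
# best_stage LOGFILE
# Prints the stage with the highest val_accuracy (the first one wins on ties).
# LC_ALL=C keeps the decimal point as "." regardless of the system locale.
best_stage() {
    LC_ALL=C awk '
        BEGIN { best = -1 }
        $1 == "stage" {
            acc = -1
            for (i = 3; i < NF; i++)
                if ($i == "val_accuracy:") acc = $(i + 1) + 0
            if (acc > best) { best = acc; split($2, s, "/"); stage = s[1] }
        }
        END { if (best >= 0) printf "stage %s val_accuracy %.5f\n", stage, best }
    ' "$1"
}

# fmt_minutes MINUTES
# Converts whole minutes (such as the "real" value printed by time) to days, hours and minutes.
fmt_minutes() {
    printf '%dd %dh %dm\n' $(($1 / 1440)) $(($1 % 1440 / 60)) $(($1 % 60))
}

fmt_minutes 11685   # real 11685m35,360s from the run above; prints 8d 2h 45m
```

For example, `best_stage train.log` (with `train.log` as a hypothetical file name for the captured output) would report stage 45, which matches the model that ketos moved to `german_handwriting_best.mlmodel`.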