A minimal implementation (model.py is 331 lines) of Bidirectional Encoder Representations from Transformers (BERT), forked from Andrej Karpathy's minGPT.
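BERT differs from GPT mainly in its training objective: instead of next-token prediction, it corrupts the input and learns to recover it (masked language modeling). As a rough sketch of that corruption step (not code from this repo; `MASK_ID` and `VOCAB_SIZE` are placeholder values), each position is selected with ~15% probability, and a selected token is replaced by `[MASK]` 80% of the time, by a random token 10% of the time, and left unchanged 10% of the time:

```python
import random

MASK_ID = 103       # placeholder [MASK] token id
VOCAB_SIZE = 30522  # placeholder vocabulary size

def mask_tokens(token_ids, mask_prob=0.15, rng=random):
    """BERT-style masked-LM corruption.

    Returns (corrupted_ids, labels): labels holds the original token id
    at corrupted positions and None elsewhere (positions the loss ignores).
    """
    corrupted, labels = [], []
    for tok in token_ids:
        if rng.random() < mask_prob:
            labels.append(tok)           # model must predict this token
            r = rng.random()
            if r < 0.8:
                corrupted.append(MASK_ID)            # 80%: [MASK]
            elif r < 0.9:
                corrupted.append(rng.randrange(VOCAB_SIZE))  # 10%: random
            else:
                corrupted.append(tok)                # 10%: unchanged
        else:
            labels.append(None)          # not scored by the loss
            corrupted.append(tok)
    return corrupted, labels
```

Because attention in BERT is bidirectional (no causal mask), sampling text from it is less natural than from GPT; one common trick is to iteratively fill in masked positions, which is roughly how the sample below was produced.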
Sample output after training on the Shakespeare corpus on an M2 MacBook for a couple of hours:
> O God, O God! O God! God! and you do you! boy! hast's servant thee! hoo! and you sink, and see 'twixt out father sin! God, sir. But your sheps; but you did bot see it and your holy poison can you show the town, which? your minds is itself and be out one this: my lord, but this am that I am no brother, such an outs on me! I do allow him. But trouble me. And makEs the tyranny: more are thou the stoody! When is a world, bold you, give it in thyself, but not he's time. Witch, thou tell'st'st a sleepes on tediabl