This is an MXNet implementation of the Lookahead optimizer.
Paper: https://arxiv.org/abs/1907.08610
Import optimizer.py, then add the prefix Lookahead to the name of any optimizer.
import optimizer
optimizer.LookaheadSGD(k=5, alpha=0.5, learning_rate=1e-3)
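For context on the k and alpha arguments above: in the paper, the inner optimizer updates a set of fast weights for k steps, after which the slow weights move a fraction alpha of the way toward the fast weights and the fast weights are reset to them. The following is a minimal sketch of that rule in plain Python for a single scalar parameter. It is not the code in optimizer.py; the function name lookahead_sgd and the grad_fn, w0, and steps arguments are hypothetical and exist only for illustration.

```python
def lookahead_sgd(grad_fn, w0, k=5, alpha=0.5, learning_rate=1e-3, steps=100):
    """Illustrative Lookahead wrapped around plain SGD for one scalar weight.

    grad_fn, w0 and steps are hypothetical names for this sketch; they are
    not part of the repository's optimizer.py.
    """
    slow = float(w0)  # slow weights, updated once every k steps
    fast = slow       # fast weights, updated by the inner optimizer
    for step in range(1, steps + 1):
        # Inner optimizer step (plain SGD) on the fast weights.
        fast -= learning_rate * grad_fn(fast)
        if step % k == 0:
            # Move the slow weights toward the fast weights, then resync.
            slow += alpha * (fast - slow)
            fast = slow
    return slow


# Toy objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = lookahead_sgd(lambda w: 2.0 * (w - 3.0), w0=0.0, learning_rate=0.1, steps=200)
```

With k=1 and alpha=1.0 the slow weights copy the fast weights after every step, so the sketch reduces to ordinary SGD.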
python mnist.py --optimizer sgd --seed 42
python mnist.py --optimizer lookaheadsgd --seed 42