This project aims to provide a documented and extensible implementation of the A2C and ACKTR algorithms by OpenAI.
Based on the paper by Wu, Mansimov, Liao, Grosse, and Ba (2017): https://arxiv.org/pdf/1708.05144.pdf
Original implementation: https://github.com/openai/baselines
The API documentation and a quickstart guide can be found on Read the Docs.
In addition to TensorFlow and NumPy, the following dependencies must be installed:
- OpenAI gym. Install with:
$ pip install gym
- KFAC for TensorFlow. You need the latest version (0.1.1), which is currently not hosted on PyPI. Install with:
$ pip install git+https://github.com/tensorflow/kfac
To use the Atari environments, you also need:
- OpenAI atari-py. Install with:
$ pip install atari-py
- OpenCV for Python. Install with:
$ pip install opencv-python
This project has only been tested on Linux with Python 3.6.5.
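Before starting a training run, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of this project: the helper names (missing_modules, report) are hypothetical, and the import names are assumed from the usual packaging of each dependency (opencv-python is imported as cv2, atari-py as atari_py).

```python
import importlib.util

# Import names assumed for the packages listed above; pip names and import
# names can differ (opencv-python is imported as cv2, atari-py as atari_py).
REQUIRED_MODULES = ["tensorflow", "numpy", "gym", "kfac"]
ATARI_MODULES = ["atari_py", "cv2"]


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


def report(module_names):
    """Print one line per module and return True if all were found."""
    missing = missing_modules(module_names)
    for name in module_names:
        status = "MISSING" if name in missing else "ok"
        print("{:<12} {}".format(name, status))
    return not missing


if __name__ == "__main__":
    report(REQUIRED_MODULES + ATARI_MODULES)
```

This only checks that each module can be located. It does not verify versions, so the KFAC version requirement still has to be checked separately.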
Run the following to train an Atari model (see a2c_acktr.py for further details):
$ python -m actorcritic.examples.atari.a2c_acktr
If you encounter an InvalidArgumentError with the message 'Received a label value of x which is outside the valid range of [0, x)', restart the program until it runs without the error. This behavior is not intended and will hopefully be fixed in the future.
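Restarting by hand can be automated with a small wrapper. The following is a sketch under assumptions, not part of this project: run_until_success is a hypothetical helper, and it assumes that the error terminates the program with a non-zero exit status. It does not inspect the error message, so any other failure is retried as well, which is why the number of attempts is capped.

```python
import subprocess
import sys

# The training command from this README, run with the current interpreter.
TRAIN_COMMAND = [sys.executable, "-m", "actorcritic.examples.atari.a2c_acktr"]


def run_until_success(command, max_attempts=5):
    """Run command repeatedly until it exits with status 0.

    Returns the number of the attempt that succeeded and raises RuntimeError
    if every attempt fails. Assumption: the error terminates the program with
    a non-zero exit status. Any other failure is retried as well.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(command)
        if result.returncode == 0:
            return attempt
        print("Attempt {} exited with status {}".format(
            attempt, result.returncode))
    raise RuntimeError(
        "command failed {} times in a row".format(max_attempts))


# Example (starts a full training run, so it is left commented out):
# run_until_success(TRAIN_COMMAND)
```

If a run keeps failing after several attempts, the cause is probably something other than the error described above, and the output of the last attempt is the place to look.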
You can visualize the learning progress by launching TensorBoard:
$ tensorboard --logdir ./results/summaries