This repository is an implementation of the WorldModelsExperiments combined with forks of gym-duckietown and baselines.
There are three gym environments provided.
- DreamDuck-v0: default environment (dreamduck/envs/env.py)
- DreamDuck-v1: world model representation (dreamduck/envs/realenv.py)
- DreamDuck-v2: dream environment (dreamduck/envs/rnnenv.py)
- Create a virtual environment with
python3 -m venv venv
and activate it with
source venv/bin/activate
- If the venv module is not present, run
sudo apt-get install python3-venv
(Debian/Ubuntu); on other systems, follow the instructions for your OS
- Install dependencies
pip install -r ./dreamduck/envs/requirements.txt
- Install the baselines fork
pip install git+https://github.com/Bassstring/baselines
- Install this module
pip install -e .
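Before running the pip commands above, it can help to confirm that the virtual environment is actually active, so the packages do not end up in the system Python. The following is a minimal sketch using only the standard library; the helper name `in_virtualenv` is made up for this example and is not part of this repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv.

    Inside an environment created with `python3 -m venv`, sys.prefix points
    at the environment while sys.base_prefix still points at the base
    installation, so the two differ.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print("virtual environment active:", sys.prefix)
    else:
        print("no virtual environment active; run: source venv/bin/activate")
```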
All three environments can be controlled manually:
python dreamduck/envs/env.py
- Use the flag
-h
for all options
python dreamduck/envs/realenv.py
- Show real observation next to world model interpretation with
--debug
python dreamduck/envs/rnnenv.py
- Use the flag
--temp
to control uncertainty
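The effect of `--temp` is not described further here. In the original World Models experiments, a temperature parameter rescales the MDN-RNN output before the next latent vector is sampled: the mixture logits are divided by the temperature and the Gaussian noise is scaled by its square root, so larger values give a more random dream. The sketch below illustrates that idea for a single latent dimension in plain Python. It assumes this repository follows the same scheme; the function name and parameters are illustrative only and are not taken from dreamduck/envs/rnnenv.py.

```python
import math
import random


def sample_latent(logits, means, log_stds, temperature, rng=random):
    """Sample one value from a 1-D Gaussian mixture at a given temperature.

    Hypothetical helper: illustrates temperature scaling, not the repo's code.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Divide the mixture logits by the temperature, then apply a
    # numerically stable softmax to get component probabilities.
    scaled = [logit / temperature for logit in logits]
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Pick a mixture component according to the adjusted probabilities.
    k = rng.choices(range(len(probs)), weights=probs)[0]
    # Scale the standard normal noise by sqrt(temperature).
    noise = rng.gauss(0.0, 1.0) * math.sqrt(temperature)
    return means[k] + math.exp(log_stds[k]) * noise
```

With a temperature close to zero, the sample collapses onto the mean of the most likely component; larger values both flatten the mixture weights and widen the noise.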
The baselines module provides a straightforward way of training an agent with different algorithms and settings out of the box.
With the following commands, an agent is trained in the dream and then evaluated in the real environment as interpreted by our world model.
- python -m baselines.run --alg=ppo2 --env=DreamDuck-v2 --num_timesteps=2e7 --network=mlp --num_env=2 --save_path=./models/dreamduck_rnnenv_ppo2 --log_path=train_rnnenv_logs
- python -m baselines.run --alg=ppo2 --env=DreamDuck-v1 --network=mlp --num_timesteps=0 --load_path=./models/dreamduck_rnnenv_ppo2 --play
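The two commands above have to agree on the model path: the file written via `--save_path` during training is the one passed to `--load_path` for evaluation. The sketch below builds both argument lists from a single path, using only the flags shown above. The helper names `train_command` and `play_command` are hypothetical and are not part of baselines or this repository.

```python
import sys

# Model path shared by training (--save_path) and evaluation (--load_path).
MODEL_PATH = "./models/dreamduck_rnnenv_ppo2"


def train_command(model_path=MODEL_PATH, num_timesteps="2e7", num_env=2):
    """Argument list for training PPO2 in the dream environment (DreamDuck-v2)."""
    return [
        sys.executable, "-m", "baselines.run",
        "--alg=ppo2",
        "--env=DreamDuck-v2",
        f"--num_timesteps={num_timesteps}",
        "--network=mlp",
        f"--num_env={num_env}",
        f"--save_path={model_path}",
        "--log_path=train_rnnenv_logs",
    ]


def play_command(model_path=MODEL_PATH):
    """Argument list for replaying the trained agent in DreamDuck-v1."""
    return [
        sys.executable, "-m", "baselines.run",
        "--alg=ppo2",
        "--env=DreamDuck-v1",
        "--network=mlp",
        "--num_timesteps=0",
        f"--load_path={model_path}",
        "--play",
    ]


if __name__ == "__main__":
    # Only print the commands here; to execute one, pass the list to
    # subprocess.run(cmd, check=True).
    print(" ".join(train_command()))
    print(" ".join(play_command()))
```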
For debugging purposes, invoke baselines with the following environment variable:
DEBUG_BASELINES=1 python -m baselines.run --alg=ppo2 ...
- Install xvfb
sudo apt install xvfb -y
- Run
xvfb-run -a -s "-screen 0 1400x900x24 +extension RANDR" -- python -m baselines.run --alg=ppo2 --env=DreamDuck-v0 --num_timesteps=2e7 --network=cnn_lstm --num_env=8 --save_path=./models/dreamduck_cnn_lstm_ppo2 --log_path=train_logs
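When training is launched from a script, the xvfb-run prefix can be added only when no display is available. The sketch below reuses the exact xvfb-run arguments from the command above and treats an unset DISPLAY variable as "no X server", which is a heuristic rather than a guarantee. The function name `with_virtual_display` is an assumption made for illustration.

```python
import os

# Same xvfb-run arguments as in the command above.
XVFB_PREFIX = [
    "xvfb-run", "-a", "-s", "-screen 0 1400x900x24 +extension RANDR", "--",
]


def with_virtual_display(command, environ=None):
    """Prefix `command` with xvfb-run when DISPLAY is unset or empty.

    Hypothetical helper; `environ` defaults to the current process
    environment and can be replaced with a plain dict for testing.
    """
    environ = os.environ if environ is None else environ
    if environ.get("DISPLAY"):
        # A display seems to be available, so run the command unchanged.
        return list(command)
    return XVFB_PREFIX + list(command)
```

The returned list can be passed to `subprocess.run` in place of the plain training command.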
If there are issues follow this instruction.
https://docs.google.com/presentation/d/1wxcVQcTnhOC700dCKF-H1ftDOI84jh5EPQZgoMzgHvc/edit?usp=sharing
Paper explaining this work: http://tiny.cc/xq8ucz
P.S. The paper explains the work in detail, while the presentation focuses on showing results.