A deep reinforcement learning agent that learns to control a simple two-joint robotic arm.
This project was completed as part of the Deep Reinforcement Learning Nanodegree.
In this task a double-jointed arm must move its hand to target locations. The reward function, observation space, and action space are described below.
- A reward of +0.1 is provided for each step that the agent's hand is in the goal location.
- The observation space consists of 33 variables corresponding to the position, rotation, velocity, and angular velocities of the arm.
- The action space is a vector of four numbers, corresponding to the torque applied to the two 2-degree-of-freedom joints. Every entry in the action vector must be a number between -1 and 1.
The task is episodic. To solve the environment, the agent must achieve an average score of +30 over 100 consecutive episodes.
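For illustration only, the sketch below samples random four-dimensional actions clipped to [-1, 1] and tracks the 100-episode moving average used by the solve criterion. The `env_reset` and `env_step` callables are hypothetical stand-ins for the real environment interface, not part of this project's code.

```python
from collections import deque
import numpy as np

def run_random_episodes(env_reset, env_step, n_episodes=100):
    """Hypothetical loop: env_reset() / env_step(action) stand in for the real environment API."""
    recent_scores = deque(maxlen=100)  # window used by the solve criterion
    for _ in range(n_episodes):
        env_reset()
        score, done = 0.0, False
        while not done:
            # four torques, each clipped to the valid range [-1, 1]
            action = np.clip(np.random.randn(4), -1.0, 1.0)
            _, reward, done = env_step(action)
            score += reward
        recent_scores.append(score)
    # the environment counts as solved at an average of +30 over 100 episodes
    return np.mean(recent_scores) >= 30.0
```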
The code is organized in the following files:
- model.py: implements the actor and critic neural networks used by the DDPG agent (a rough sketch follows this list)
- ddpg_agent.py: implements the DDPG algorithm
- reacher.py: utility functions used to orchestrate the agent and the environment, keeping the report notebook uncluttered
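The actual architectures are defined in model.py; purely as a sketch, assuming a standard DDPG actor, a network for this environment might map the 33-dimensional observation to four torques through a tanh output that bounds each entry to [-1, 1]. The layer sizes below are assumptions, not the ones used in model.py.

```python
import torch.nn as nn

class Actor(nn.Module):
    """Illustrative DDPG actor: 33-dim state in, 4 bounded torques out.
    Hidden layer sizes are assumptions for illustration only."""

    def __init__(self, state_size=33, action_size=4, hidden1=256, hidden2=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_size, hidden1),
            nn.ReLU(),
            nn.Linear(hidden1, hidden2),
            nn.ReLU(),
            nn.Linear(hidden2, action_size),
            nn.Tanh(),  # squashes each torque into [-1, 1]
        )

    def forward(self, state):
        return self.net(state)
```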
A working Python 3 environment is required. The easiest way to set one up is to install [Anaconda](https://www.anaconda.com/download/).
It is suggested to create a new environment:
`conda create --name reacher python=3.6`

Activate the environment:
`source activate reacher`

Start the Jupyter server:
`jupyter notebook --no-browser --ip 127.0.0.1 --port 8888 --port-retries=0`
- Download the pre-compiled Unity environment:
  - Linux: click here
  - Mac OSX: click here
  - Windows (32-bit): click here
  - Windows (64-bit): click here
- Decompress the archive at your preferred location (e.g. inside this repository's working copy)
- Open the Report.ipynb notebook
- Set your path to the pre-compiled Unity environment so the notebook can launch it (see the sketch after this list)
- Run Report.ipynb to install all remaining dependencies and explore my project work
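For reference, launching the environment from the notebook typically looks like the sketch below, using the unityagents package that ships with the nanodegree. The `file_name` path is a placeholder for wherever you decompressed the archive.

```python
from unityagents import UnityEnvironment

# Point file_name at your local copy of the pre-compiled environment
# (the path below is a placeholder; adjust it for your platform).
env = UnityEnvironment(file_name="Reacher_Linux/Reacher.x86_64")

brain_name = env.brain_names[0]                      # the brain controlling the arm
env_info = env.reset(train_mode=True)[brain_name]    # reset in training mode
print("Observation size:", len(env_info.vector_observations[0]))  # expected: 33
```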