Visualizing_RL

Visualizing how different agents perceive their environment in the game Snake: Algorithms in Reinforcement Learning

Authors

Snake

The Snake game environment is a visualization tool for evaluating different RL algorithms on the game Snake. Each algorithm must learn to guide the snake to the food without hitting the wall or eating itself (self-loop). The algorithms do this by using image processing to build an input vector of values that determines the snake's best next step, and each time the snake eats the food the algorithm receives a reward. The input vector has 3 values, one for each possible next step (forward, left, right); how these values are defined differs between algorithms, which use state or action values.
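As a concrete illustration of the relative action encoding, the sketch below maps a one-hot (forward, left, right) vector onto the snake's next heading. The clockwise direction ordering and the function name are assumptions made for this example, not the project's actual code.

```python
import numpy as np

# Illustrative clockwise ordering of headings; not necessarily the project's
# internal representation of directions.
DIRECTIONS = ["RIGHT", "DOWN", "LEFT", "UP"]

def next_direction(current, action):
    """Map a one-hot action vector (forward, left, right) to the next heading."""
    idx = DIRECTIONS.index(current)
    if np.array_equal(action, [1, 0, 0]):   # forward: keep the current heading
        return DIRECTIONS[idx]
    if np.array_equal(action, [0, 1, 0]):   # left: counter-clockwise turn
        return DIRECTIONS[(idx - 1) % 4]
    return DIRECTIONS[(idx + 1) % 4]        # right: clockwise turn

print(next_direction("RIGHT", [0, 0, 1]))  # -> "DOWN"
```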

Project setup

  1. Clone Repository

    git clone https://github.com/YaadR/Visualizing_RL.git
    cd Visualizing_RL
  2. Create Virtual Environment

    python3 -m venv venv
  3. Activate Environment

    # Linux/MacOS
    source venv/bin/activate
    # Windows
    venv\Scripts\activate
  4. Install requirements

    pip install -r requirements.txt
  5. Run Project

    python3 Ver1/main.py
    

Visualization solutions:

  1. Heatmap
  2. Certainty Arrows
  3. Neural network weights visualization (where NN is used)
  4. Certainty Bar (entropy-based; see the sketch after this list)
  5. State Activation Layer
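
For the entropy-based certainty bar (item 4), the sketch below shows one plausible way to collapse the agent's action probabilities into a single certainty score; the function name and the normalization by log 3 are assumptions, not necessarily how the project implements it.

```python
import numpy as np

def action_certainty(probs):
    """Certainty in [0, 1]: one minus the normalized Shannon entropy of the
    action distribution (with 3 actions, the maximum entropy is log 3)."""
    probs = np.clip(probs, 1e-12, 1.0)
    entropy = -np.sum(probs * np.log(probs))
    return 1.0 - entropy / np.log(len(probs))

print(action_certainty(np.array([1/3, 1/3, 1/3])))    # ~0.0: fully uncertain
print(action_certainty(np.array([0.98, 0.01, 0.01]))) # ~0.9: very certain
```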

Reinforcement algorithms & concepts

Agent State Value

  • Value-based: state value
  • Model-based
  • Off-policy
  • Online

RL Algorithm: $$V(s_t)' = V(s_t) + \alpha \left[ R_{t+1} + (1 - s_{t \to \text{terminal}}) \, \gamma \, V(s_{t+1}) - V(s_t) \right]$$
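
A minimal tabular sketch of this update, assuming the state values live in a plain Python dictionary; the project itself may estimate V(s) differently, so this is illustrative only.

```python
def td_state_value_update(V, s, s_next, reward, done, alpha=0.1, gamma=0.9):
    """Tabular TD(0) update corresponding to the state-value formula above.

    V: dict mapping states to value estimates (an illustrative data structure).
    The (1 - done) factor zeroes the bootstrap term on terminal transitions.
    """
    bootstrap = 0.0 if done else gamma * V.get(s_next, 0.0)
    td_error = reward + bootstrap - V.get(s, 0.0)
    V[s] = V.get(s, 0.0) + alpha * td_error
    return td_error
```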

Agent Action Value

  • Value-based: action value
  • Model-free
  • Off-policy
  • Online

RL Algorithm: $$Q(s_t, a_t)' = R_{t+1} + (1 - s_{t \to \text{terminal}}) \, \gamma \max_{a_{t+1}} Q(s_{t+1}, a_{t+1})$$
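
The bootstrapped target in this formula could be computed as in the hedged sketch below, again assuming a dictionary-based tabular Q for illustration.

```python
def q_target(Q, reward, s_next, actions, done, gamma=0.9):
    """Bootstrapped target matching the action-value formula above.

    Q: dict mapping (state, action) pairs to value estimates (illustrative).
    On a terminal transition the target collapses to the immediate reward.
    """
    if done:
        return reward
    return reward + gamma * max(Q.get((s_next, a), 0.0) for a in actions)

# Example with the three relative actions used in the Snake setup above:
# target = q_target(Q, reward=10, s_next=state, actions=("forward", "left", "right"), done=False)
```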

Agent Policy

  • Policy-based
  • Model-free
  • On-policy
  • Online

RL Algorithm: Critic: $$A_{\text{critic}}(s_t) = R_t + (1 - s_{t \to \text{terminal}}) \, \gamma \, V(s_{t+1}) - V(s_t)$$

Actor: $$\nabla_{\theta_{\text{actor}}} J = \nabla_{\theta_{\text{actor}}} \log \pi_{\theta_{\text{actor}}}(a_t \mid s_t) \, A_{\text{critic}}(s_t)$$
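
A hedged PyTorch sketch of how the critic's advantage and the actor's log-probability term can be combined into trainable losses; the tensor shapes, argument names, and the squared-error critic loss are assumptions, not the project's exact interface.

```python
import torch

def actor_critic_losses(policy_logits, value, next_value, action, reward, done, gamma=0.9):
    """One-step advantage actor-critic losses matching the two formulas above.

    policy_logits: actor output for s_t, shape (3,) for (forward, left, right)
    value, next_value: critic estimates V(s_t) and V(s_{t+1}), scalar tensors
    action: index of the action taken; done: 1.0 on terminal transitions
    """
    # Advantage: R_t + (1 - done) * gamma * V(s_{t+1}) - V(s_t)
    advantage = reward + (1.0 - done) * gamma * next_value.detach() - value
    # Actor: maximize log pi(a_t | s_t) * advantage (so minimize its negative)
    log_prob = torch.log_softmax(policy_logits, dim=-1)[action]
    actor_loss = -log_prob * advantage.detach()
    # Critic: move V(s_t) toward the bootstrapped target
    critic_loss = advantage.pow(2)
    return actor_loss, critic_loss
```

In practice the two losses are typically summed (sometimes with an entropy bonus) and back-propagated through the actor and critic networks.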

Training - Stability, Mean & STD - 20 Rounds

Algorithms compete in an Arena

Arena

Additional Notes

Please follow the project steps carefully and ensure that all dependencies are correctly installed before running the solution.

Acknowledgments

This project is inspired by Patrick Loeber and his work in Teach AI To Play Snake - Reinforcement Learning Tutorial With PyTorch And Pygame.

License

This project is licensed under the MIT License. Feel free to use, modify, and distribute it according to the terms of the license.