Building the classic snake game using Python and pygame, and making it intelligent with reinforcement learning
Project mentor - Shubham Lohiya
Details about each stage can be found in the respective folders.
Making a basic snake game, no reinforcement learning involved. The objective is to collect as many apples as possible without making the snake crash into the walls or itself. The game controls are the arrow keys.
I've recreated the snake game by Clear Code which you can find on GitHub, here and on his YouTube channel, here.
Implementing the value iteration algorithm to solve MDPs (Markov Decision Processes) and formulating a maze as an MDP (the current implementation is quite inefficient and memory heavy but is logically correct). This is a solution to this project.
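As a rough illustration of the idea described above, here is a minimal sketch of value iteration on a tiny maze. This is not the code in the project folder: the maze layout, the discount factor, the per-step reward of -1 and the helper names (`step`, `value_iteration`, `greedy_path`) are all assumptions made for this example.

```python
import numpy as np

# Hypothetical 3x4 maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = ["S..#",
        ".#..",
        "...G"]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GAMMA = 0.9          # discount factor (assumed)
STEP_REWARD = -1.0   # cost of every move (assumed)


def step(state, action):
    """Deterministic transition: hitting a wall or the border leaves the agent in place."""
    r, c = state
    nr, nc = r + action[0], c + action[1]
    if 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0]) and MAZE[nr][nc] != "#":
        return (nr, nc)
    return state


def value_iteration(tol=1e-6):
    """Sweep over all states, applying the Bellman optimality backup until values settle."""
    values = np.zeros((len(MAZE), len(MAZE[0])))
    while True:
        delta = 0.0
        for r in range(len(MAZE)):
            for c in range(len(MAZE[0])):
                if MAZE[r][c] in "#G":
                    continue  # walls are not states; the goal is terminal with value 0
                best = max(STEP_REWARD + GAMMA * values[step((r, c), a)] for a in ACTIONS)
                delta = max(delta, abs(best - values[r, c]))
                values[r, c] = best
        if delta < tol:
            return values


def greedy_path(values, start=(0, 0)):
    """Follow the highest-valued reachable neighbour from the start until the goal."""
    path = [start]
    state = start
    while MAZE[state[0]][state[1]] != "G":
        state = max((step(state, a) for a in ACTIONS), key=lambda s: values[s])
        path.append(state)
    return path


values = value_iteration()
path = greedy_path(values)
```

Because every move costs the same, acting greedily on the converged values traces a shortest path through the maze. Sweeping over every cell on each iteration is simple but scales poorly with maze size.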
An RL agent attempts to navigate a windy gridworld and find the shortest path from start to finish. Three algorithms, SARSA, Q-Learning and Expected SARSA, are applied to this problem and their performance is compared graphically. This is a solution to a problem on page 152 of Reinforcement Learning: An Introduction by Sutton and Barto.
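The three algorithms differ only in the target they bootstrap from after each step. The sketch below shows that difference for a tabular Q-function stored as a numpy array of shape (states, actions). The hyperparameter values, the epsilon-greedy behaviour policy and the function names `td_target` and `update` are assumptions for illustration, not taken from the project code.

```python
import numpy as np

ALPHA, GAMMA, EPSILON = 0.5, 1.0, 0.1  # assumed step size, discount and exploration rate


def td_target(q, reward, next_state, next_action, method):
    """Return the bootstrapped target for one transition under the chosen method."""
    q_next = q[next_state]
    if method == "sarsa":
        # On-policy: use the action the agent will actually take next.
        return reward + GAMMA * q_next[next_action]
    if method == "q_learning":
        # Off-policy: use the best next action regardless of what is taken.
        return reward + GAMMA * np.max(q_next)
    if method == "expected_sarsa":
        # Average over next actions under an epsilon-greedy policy (assumed here).
        n = len(q_next)
        probs = np.full(n, EPSILON / n)
        probs[np.argmax(q_next)] += 1.0 - EPSILON
        return reward + GAMMA * np.dot(probs, q_next)
    raise ValueError(f"unknown method: {method}")


def update(q, state, action, reward, next_state, next_action, method):
    """Move Q(state, action) a fraction ALPHA of the way towards the target."""
    target = td_target(q, reward, next_state, next_action, method)
    q[state, action] += ALPHA * (target - q[state, action])
```

The `next_action` argument only matters for SARSA; it is kept in the signature so the three methods can be swapped without changing the training loop.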
This is a solution to the original problem statement of this project. It uses SARSA, Q-Learning or Expected SARSA to train a snake to earn high rewards in the snake game. While this approach performs quite well (with a maximum score of 58 fruits on one trial), there are still some limitations, such as the snake's inability to detect that it is entering a closed loop and trapping itself. The code supports both training and graphical evaluation of the snake's performance.
I am trying to learn about Deep Reinforcement Learning through Stanford's CS231n. I hope to explore Deep RL approaches to this and other games in the future.
This project uses Python 3.8.8 along with numpy, matplotlib and pygame. To install the external libraries using pip, run the following commands:
pip install pygame
pip install numpy
pip install matplotlib
I want to thank my mentor Shubham Lohiya for his constant guidance throughout this project despite his busy schedule. The project resources created by him helped me pace my learning and gave a good mix of theoretical and practical exposure. This is my first time working on a coding project and his support has definitely helped me in overcoming my aversion to and fear of coding. He was always there to answer our doubts and review our code, giving us suggestions on code presentation and style as well as correcting logical flaws. I also want to thank IIT Bombay for giving us this opportunity to explore programming in Summer of Code, with such interesting coding tasks and enthusiastic mentors.
Basic snake game demo
The project timeline and links to learning resources can be found at Project Resources.
Windy gridworld demo
Snake training in progress