Course: Artificial Intelligence, University of Tehran
Project Due Date: June 10, 2024
This repository contains the completed work for Project 3 of the Artificial Intelligence course. The project covers reinforcement learning, recurrent neural networks (RNN, GRU, LSTM), search algorithms, and constraint satisfaction, with practical implementations and detailed explanations.
Policy Iteration:
- Provided a detailed explanation of Policy Iteration with an example, illustrating all the steps from random policy initialization to convergence.
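The loop from an arbitrary initial policy to convergence can be sketched as follows. This is a minimal illustration, not the example worked in the project: the two-state MDP `P`, the discount `GAMMA`, and all function names are assumptions made up for this sketch.

```python
# Hypothetical 2-state, 2-action MDP used only for illustration.
# P[s][a] is a list of (probability, next_state, reward) tuples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9  # assumed discount factor


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def evaluate_policy(P, policy, gamma, tol=1e-10):
    """Iterative policy evaluation: sweep until the values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v_new = q_value(P, V, s, policy[s], gamma)
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            return V


def policy_iteration(P, gamma):
    policy = {s: 0 for s in P}  # arbitrary initial policy
    while True:
        V = evaluate_policy(P, policy, gamma)
        stable = True
        for s in P:
            best = max(P[s], key=lambda a: q_value(P, V, s, a, gamma))
            # Switch only on a strict improvement so ties cannot cause cycling.
            if q_value(P, V, s, best, gamma) > q_value(P, V, s, policy[s], gamma) + 1e-12:
                policy[s] = best
                stable = False
        if stable:
            return policy, V


policy, V = policy_iteration(P, GAMMA)
print(policy, V)
```

In this toy MDP the first improvement step switches both states to action 1, and the second pass finds no further improvement, so the loop stops after two evaluations.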
Deep Q-Learning:
- Explained the given Deep Q-Learning code line by line.
- Performed hyperparameter optimization to improve the model's performance on the CartPole game.
FrozenLake Game with Neural Network-Based Q-Learning:
- Implemented a neural network-based Q-learning system to train an agent to navigate safely from the start (S) to the goal (G) in the FrozenLake game.
- Explained the problem, neural network architecture, and results in detail.
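The idea of approximating Q-values with a network can be sketched with the smallest possible model: a single linear layer, without bias, over a one-hot state vector. With one-hot inputs the gradient step touches a single weight, so this reduces to a table. It is a stand-in under stated assumptions, not the architecture used in the project. The hand-coded map, the deterministic (non-slippery) `step` function, and all hyperparameters are assumptions for illustration.

```python
import random

# Hypothetical stand-in for the 4x4 FrozenLake map, with deterministic moves.
GRID = ["SFFF", "FHFH", "FFFH", "HFFG"]
N = 4
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}  # left, down, right, up


def step(state, action):
    """Return (next_state, reward, done) for a deterministic move."""
    r, c = divmod(state, N)
    dr, dc = MOVES[action]
    r = min(max(r + dr, 0), N - 1)
    c = min(max(c + dc, 0), N - 1)
    tile = GRID[r][c]
    return r * N + c, float(tile == "G"), tile in "HG"


def q_values(W, state):
    # A one-hot input times the weight matrix selects row `state`.
    return W[state]


def train(episodes=2000, gamma=0.95, lr=0.5, seed=0):
    rng = random.Random(seed)
    W = [[0.0] * 4 for _ in range(N * N)]  # weights of the single linear layer
    for ep in range(episodes):
        eps = max(0.1, 1.0 - ep / episodes)  # linearly decaying exploration
        state = 0
        for _ in range(100):
            if rng.random() < eps:
                action = rng.randrange(4)
            else:
                q = q_values(W, state)
                action = q.index(max(q))
            nxt, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q_values(W, nxt))
            # Gradient step on 0.5 * (target - Q(s, a))^2 with respect to W[s][a].
            W[state][action] += lr * (target - W[state][action])
            state = nxt
            if done:
                break
    return W


def greedy_path(W, max_steps=16):
    """Follow the highest-valued action from the start state."""
    state, path = 0, [0]
    for _ in range(max_steps):
        q = q_values(W, state)
        state, _, done = step(state, q.index(max(q)))
        path.append(state)
        if done:
            break
    return path


W = train()
path = greedy_path(W)
print(path)
```

A deeper network would replace `q_values` and the single-weight update with a forward pass and a backpropagation step, while the target computation stays the same.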
Mathematical Intuition Behind LSTM:
- Explained the need for LSTMs as an improvement over traditional RNNs.
- Detailed the forget, input, and output gates of LSTMs using a comprehensive example.
- Walked through the steps of backpropagation in LSTMs for the same example.
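The gate equations can be made concrete with a scalar LSTM cell (input size 1, hidden size 1). This is a minimal sketch: the parameter names (`wf`, `uf`, `bf`, and so on) and the numeric values are assumptions chosen for illustration, not the example from the project.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One forward step of a scalar LSTM cell."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate cell value
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    c = f * c_prev + i * g   # keep part of the old memory, add new content
    h = o * math.tanh(c)     # expose a gated view of the cell state
    return h, c, (f, i, g, o)


# Hypothetical parameters: every weight and bias set to 0.5.
names = ("wf", "uf", "bf", "wi", "ui", "bi", "wg", "ug", "bg", "wo", "uo", "bo")
params = {name: 0.5 for name in names}

h, c, gates = lstm_step(x=1.0, h_prev=0.0, c_prev=0.0, p=params)
print(h, c, gates)
```

Because the gates depend on `x` and `h_prev` but not on `c_prev`, the direct derivative of `c` with respect to `c_prev` is the forget gate `f`. That additive path is what lets gradients survive over many time steps during backpropagation.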
RNN, GRU, and LSTM Models:
- Analyzed and explained the provided RNN-GRU-LSTM notebook line by line.
- Improved the model’s performance:
  - Increased the R² score in Part I.
  - Reduced RMSE for train and test data in Part II.
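For reference, the two metrics named above can be computed as in the sketch below. The sample values are made up for illustration and are not results from the notebook.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: lower is better, same units as the target."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1.0 is a perfect fit."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up targets and predictions, only to exercise the functions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
print(rmse(y_true, y_pred), r2_score(y_true, y_pred))  # 0.5 0.8
```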
Informed Search (A*):
- Provided two manual examples and implemented Python scripts for solving problems using the A* search algorithm.
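A compact version of A* on a weighted graph might look like the sketch below. The graph, the heuristic table, and the function name `a_star` are assumptions for illustration, not the examples solved in the project. The heuristic values were chosen to be admissible for this particular graph.

```python
import heapq


def a_star(graph, heuristic, start, goal):
    """Return (cost, path) of a cheapest path, or (inf, []) if none exists."""
    # Queue entries are (f = g + h, g, node, path so far).
    frontier = [(heuristic[start], 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found later
        for neighbor, cost in graph[node].items():
            new_g = g + cost
            if new_g < best_g.get(neighbor, float("inf")):
                best_g[neighbor] = new_g
                entry = (new_g + heuristic[neighbor], new_g, neighbor, path + [neighbor])
                heapq.heappush(frontier, entry)
    return float("inf"), []


# Hypothetical example graph with edge costs and an admissible heuristic.
graph = {"S": {"A": 1, "B": 4}, "A": {"B": 2, "G": 6}, "B": {"G": 2}, "G": {}}
heuristic = {"S": 4, "A": 3, "B": 2, "G": 0}

cost, path = a_star(graph, heuristic, "S", "G")
print(cost, path)  # 5 ['S', 'A', 'B', 'G']
```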
Min-Max Search:
- Explained the tree traversal sequence for Min-Max search with two manual examples.
- Developed Python scripts to demonstrate the process.
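The traversal order can be seen in a short recursive sketch. The tree below is a made-up two-level example written as nested lists, where numbers are leaf utilities. It is not one of the project's manual examples.

```python
def minimax(node, maximizing, visited):
    """Return the minimax value of `node`, recording leaves in visit order."""
    if isinstance(node, (int, float)):
        visited.append(node)
        return node
    values = [minimax(child, not maximizing, visited) for child in node]
    return max(values) if maximizing else min(values)


# Hypothetical tree: a MAX root over three MIN nodes, each with three leaves.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
value = minimax(tree, True, visited)
print(value)    # 3
print(visited)  # [3, 12, 8, 2, 4, 6, 14, 5, 2]
```

The leaves are visited depth-first, left to right. The MIN nodes evaluate to 3, 2, and 2, so the MAX root picks 3.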
Uninformed Search (DFS, BFS, UCS):
- Created Python scripts for Depth First Search (DFS), Breadth First Search (BFS), and Uniform Cost Search (UCS) using example problems.
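The three strategies differ mainly in how the frontier is ordered: a FIFO queue for BFS, a LIFO stack for DFS, and a priority queue keyed on path cost for UCS. The sketch below shows this on a small made-up graph. The graph and function names are assumptions for illustration, not the project's example problems.

```python
import heapq
from collections import deque


def bfs(graph, start, goal):
    """Fewest-edges path using a FIFO queue."""
    frontier, seen = deque([[start]]), {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None


def dfs(graph, start, goal):
    """Depth-first path using a LIFO stack."""
    stack, seen = [[start]], set()
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        # Push in reverse so the first-listed neighbor is explored first.
        for nxt in reversed(list(graph[node])):
            if nxt not in seen:
                stack.append(path + [nxt])
    return None


def ucs(graph, start, goal):
    """Cheapest path using a priority queue ordered by path cost."""
    frontier, expanded = [(0, [start])], set()
    while frontier:
        cost, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return cost, path
        if node in expanded:
            continue
        expanded.add(node)
        for nxt, weight in graph[node].items():
            heapq.heappush(frontier, (cost + weight, path + [nxt]))
    return None


# Hypothetical weighted graph: the short route by edge count is the expensive one.
graph = {"S": {"A": 1, "B": 4}, "A": {"C": 1}, "B": {"G": 4}, "C": {"G": 1}, "G": {}}

print(bfs(graph, "S", "G"))  # ['S', 'B', 'G']
print(dfs(graph, "S", "G"))  # ['S', 'A', 'C', 'G']
print(ucs(graph, "S", "G"))  # (3, ['S', 'A', 'C', 'G'])
```

On this graph BFS returns the two-edge route with total cost 8, while UCS returns the three-edge route with total cost 3.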
Constraint Satisfaction Problem:
- Solved a map-coloring problem with four colors (cf. the four-color theorem) using Python.
- Implemented and explained the approach using a constraint satisfaction framework.
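A plain backtracking solver for map coloring can be sketched as below. The map is not specified here, so the sketch assumes the Australian states and territories, a common textbook instance. `NEIGHBORS`, `COLORS`, and the function names are illustrative assumptions.

```python
# Assumed example map: adjacency between Australian states and territories.
NEIGHBORS = {
    "WA": ["NT", "SA"],
    "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"],
    "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"],
    "V": ["SA", "NSW"],
    "T": [],
}
COLORS = ["red", "green", "blue", "yellow"]  # four colors are available


def consistent(region, color, assignment):
    """The constraint: no already-colored neighbor may share this color."""
    return all(assignment.get(n) != color for n in NEIGHBORS[region])


def backtrack(assignment):
    """Assign colors one region at a time, undoing choices that dead-end."""
    if len(assignment) == len(NEIGHBORS):
        return assignment
    region = next(r for r in NEIGHBORS if r not in assignment)
    for color in COLORS:
        if consistent(region, color, assignment):
            assignment[region] = color
            result = backtrack(assignment)
            if result is not None:
                return result
            del assignment[region]  # undo and try the next color
    return None


solution = backtrack({})
print(solution)
```

Variable-ordering heuristics such as minimum remaining values, or forward checking, could be added on top of this skeleton. They are omitted to keep the sketch short.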