rl-falling-blocks

A few RL implementations built on my falling blocks game.

Imitation learning

Imitation learning algorithm trained on my own gameplay (used as a benchmark).

Policy gradient

Basic REINFORCE algorithm with an average-reward baseline. The gradient estimate is unbiased but high variance.
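
To make the REINFORCE-with-baseline idea above concrete, here is a minimal sketch in plain Python. It is not the code in this repo: the function names (`discounted_returns`, `softmax`, `reinforce_update`), the state-independent softmax policy, the discount factor, and the reading of "average reward baseline" as the batch mean of returns are all assumptions made for illustration. The real agent would condition on the game state.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for each step of one episode."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(logits, episodes, lr=0.1, gamma=0.99):
    """One gradient-ascent step of REINFORCE with a mean-return baseline.

    Hypothetical setup: the policy is a softmax over `logits` that ignores
    the state. `episodes` is a list of (actions, rewards) pairs, one per
    episode, with actions given as integer indices.
    """
    all_returns = [discounted_returns(rewards, gamma) for _, rewards in episodes]
    flat = [g for ep in all_returns for g in ep]
    # Assumed baseline: the average return over every step in the batch.
    baseline = sum(flat) / len(flat)

    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for (actions, _), returns in zip(episodes, all_returns):
        for a, g in zip(actions, returns):
            advantage = g - baseline
            for k in range(len(logits)):
                # d log pi(a) / d logit_k = 1[k == a] - pi(k)
                indicator = 1.0 if k == a else 0.0
                grad[k] += (indicator - probs[k]) * advantage

    # Average the gradient over steps, then move the logits uphill.
    return [l + lr * g / len(flat) for l, g in zip(logits, grad)]
```

For example, `reinforce_update([0.0, 0.0], [([0], [1.0]), ([1], [0.0])])` raises the logit of action 0, which earned the above-average return, and lowers the logit of action 1. Subtracting a baseline that does not depend on the chosen action leaves the expected gradient unchanged while reducing its variance, which is the motivation for using one at all.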