Multi-Agent Constrained Policy Optimisation (MACPO; MAPPO-L).
Updated Apr 17, 2024 - Python
Policy Optimization with Penalized Point Probability Distance: an Alternative to Proximal Policy Optimization
Implementation of Proximal Policy Optimization (PPO), a deep reinforcement learning algorithm, on a continuous-action-space OpenAI Gym environment (Box2D CarRacing-v0)
Mirror Descent Policy Optimization
Model-based Policy Gradients
Code for the paper "Policy Optimization in RLHF: The Impact of Out-of-preference Data"
Codebase to fully reproduce the results of "No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO" (Moalla et al. 2024). Uses TorchRL and provides extensive tools for studying representation dynamics in policy optimization.
This repository contains the code for the paper "Local policy search with Bayesian optimization".
Code for Policy Optimization as Online Learning with Mediator Feedback
An implementation of reinforcement learning for CartPole-v0 using policy optimization
This repository contains the code for the NeurIPS 2021 submission "Local policy search with Bayesian optimization".
Reinforcement Learning (RL) 🤖! This repository is your hands-on guide to implementing RL algorithms, from Markov Decision Processes (MDPs) to advanced methods like PPO and DDPG. 🚀 Build smart agents, learn the math behind policies, and experiment with real-world applications! 🔥💡
This repo implements the REINFORCE algorithm for solving the CartPole-v1 environment of the Gymnasium library using Python 3.8 and PyTorch 2.0.1.