Commit
0 parents
commit a2a9df7
Showing 3,063 changed files with 629,041 additions and 0 deletions.
The diff you're trying to view is too large. We only load the first 3000 changed files.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
MIT License | ||
|
||
Copyright (c) 2019 Reinforcement Learning Working Group | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all | ||
copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
SOFTWARE. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,95 @@ | ||
# TLDR: Temporal Distance-aware Representation for Unsupervised Goal-Conditioned RL | ||
|
||
This repository contains the official implementation of [TLDR: Unsupervised Goal-Conditioned RL via Temporal Distance-Aware Representations](https://heatz123.github.io/tldr/) by [Junik Bae](https://heatz123.github.io/), [Kwanyoung Park](https://kwanyoungpark.github.io/) and [Youngwoon Lee](https://youngwoon.github.io/). | ||
|
||
## Requirements | ||
- Python 3.8 | ||
|
||
## Installation | ||
|
||
``` | ||
conda create --name tldr python=3.8 | ||
conda activate tldr | ||
conda install pytorch==2.0.1 pytorch-cuda=11.8 patchelf -c pytorch -c nvidia | ||
pip install -r requirements.txt --no-deps | ||
pip install -e . | ||
pip install -e garaged | ||
pip install -e d4rl | ||
``` | ||
|
||
## Commands to run experiments | ||
|
||
### Ant | ||
``` | ||
# TLDR: | ||
python tests/main.py --env ant --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type preset --eval_plot_axis -50 50 -50 50 --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo tldr --discrete 0 --dim_option 4 --goal_reaching 1 --eval_plot_axis -80 80 -80 80 --trans_minibatch_size 1024 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002 | ||
# METRA: | ||
python tests/main.py --env ant --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type preset --eval_plot_axis -50 50 -50 50 --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 0 --trans_minibatch_size 1024 --dim_option 2 --description "metra" | ||
``` | ||
|
||
### HalfCheetah | ||
``` | ||
# TLDR: | ||
python tests/main.py --env half_cheetah --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type preset --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo tldr --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --goal_reaching 1 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002 | ||
# METRA: | ||
python tests/main.py --env half_cheetah --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type preset --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 1 --dim_option 16 --trans_minibatch_size 1024 --description "metra" | ||
``` | ||
|
||
### AntMaze-Large | ||
``` | ||
# TLDR: | ||
python tests/main.py --env antmaze-large-play --max_path_length 300 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 75 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo tldr --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --goal_reaching 1 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002 | ||
# METRA: | ||
python tests/main.py --env antmaze-large-play --max_path_length 300 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 75 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --description "metra" | ||
``` | ||
|
||
### AntMaze-Ultra | ||
``` | ||
# TLDR: | ||
python tests/main.py --env antmaze-ultra-play --max_path_length 600 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 150 --n_epochs_per_log 100 --n_epochs_per_eval 500 --n_epochs_per_save 500 --n_epochs_per_pt_save 500 --sac_max_buffer_size 1000000 --algo tldr --goal_reaching 1 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --goal_reaching 1 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002 | ||
# METRA: | ||
python tests/main.py --env antmaze-ultra-play --max_path_length 600 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 150 --n_epochs_per_log 100 --n_epochs_per_eval 500 --n_epochs_per_save 500 --n_epochs_per_pt_save 500 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --description "metra" | ||
``` | ||
|
||
### Quadruped-Escape | ||
``` | ||
# TLDR: | ||
python tests/main.py --env dmc_quadruped_state_escape --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo tldr --goal_reaching 1 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --eval_plot_axis -15 15 -15 15 --goal_reaching 1 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002 | ||
# METRA: | ||
python tests/main.py --env dmc_quadruped_state_escape --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --eval_plot_axis -15 15 -15 15 --description "metra" | ||
``` | ||
|
||
### Humanoid-Run | ||
``` | ||
# TLDR: | ||
python tests/main.py --env dmc_humanoid_state --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo tldr --goal_reaching 1 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --eval_plot_axis -60 60 -60 60 --goal_reaching 1 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002 | ||
# METRA: | ||
python tests/main.py --env dmc_humanoid_state --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --eval_plot_axis -60 60 -60 60 --description "metra" | ||
``` | ||
|
||
### Quadruped (Pixel) | ||
``` | ||
# TLDR: | ||
python tests/main.py --env dmc_quadruped --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --video_skip_frames 2 --frame_stack 3 --sac_max_buffer_size 300000 --eval_plot_axis -15 15 -15 15 --algo tldr --goal_reaching 1 --discrete 0 --dim_option 4 --trans_optimization_epochs 200 --n_epochs_per_log 25 --n_epochs_per_eval 125 --n_epochs_per_save 125 --n_epochs_per_pt_save 125 --encoder 1 --sample_cpu 0 --goal_reaching 1 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002 | ||
# METRA: | ||
python tests/main.py --env dmc_quadruped --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --video_skip_frames 2 --frame_stack 3 --sac_max_buffer_size 300000 --eval_plot_axis -15 15 -15 15 --algo metra --goal_reaching 0 --discrete 0 --dim_option 4 --trans_optimization_epochs 200 --n_epochs_per_log 25 --n_epochs_per_eval 125 --n_epochs_per_save 125 --n_epochs_per_pt_save 125 --encoder 1 --sample_cpu 0 --description "metra" | ||
``` | ||
|
||
### Kitchen (Pixel) | ||
``` | ||
# TLDR: | ||
python tests/main.py --env kitchen --max_path_length 50 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --num_video_repeats 1 --frame_stack 3 --sac_max_buffer_size 100000 --algo tldr --goal_reaching 1 --discrete 0 --dim_option 4 --sac_lr_a -1 --trans_optimization_epochs 100 --n_epochs_per_log 25 --n_epochs_per_eval 250 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --encoder 1 --sample_cpu 0 --goal_reaching 1 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002 | ||
# METRA: | ||
python tests/main.py --env kitchen --max_path_length 50 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --num_video_repeats 1 --frame_stack 3 --sac_max_buffer_size 100000 --algo metra --goal_reaching 0 --discrete 1 --dim_option 24 --sac_lr_a -1 --trans_optimization_epochs 100 --n_epochs_per_log 25 --n_epochs_per_eval 250 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --encoder 1 --sample_cpu 0 --description "metra" | ||
``` | ||
|
||
|
||
If you use this code for your research, please consider citing our paper: | ||
``` | ||
@article{bae2024tldr, | ||
title={TLDR: Unsupervised Goal-Conditioned RL via Temporal Distance-Aware Representations}, | ||
author={Junik Bae and Kwanyoung Park and Youngwoon Lee}, | ||
journal={arXiv preprint arXiv:2407.08464}, | ||
year={2024} | ||
} | ||
``` |
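Several of the commands above pass the same flag twice (for example `--eval_plot_axis` in the Ant TLDR command and `--goal_reaching 1` in the AntMaze-Ultra one). The sketch below shows how such repeats resolve, assuming `tests/main.py` parses its flags with Python's standard `argparse` and default "store" actions. The parser here is a hypothetical stand-in with only three flags, not the repository's actual parser.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring three flags from the README commands."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--env", type=str)
    parser.add_argument("--goal_reaching", type=int, default=0)
    # Four floats: x_min x_max y_min y_max (assumed meaning of the plot axis).
    parser.add_argument("--eval_plot_axis", type=float, nargs=4, default=None)
    return parser


# Shortened version of the Ant TLDR command, keeping the repeated flags.
argv = (
    "--env ant --eval_plot_axis -50 50 -50 50 --goal_reaching 1 "
    "--eval_plot_axis -80 80 -80 80 --goal_reaching 1"
).split()

args = build_parser().parse_args(argv)

# With the default "store" action, the last occurrence of a flag wins.
print(args.eval_plot_axis)
print(args.goal_reaching)
```

Under that assumption, the earlier `-50 50 -50 50` value is overwritten by the later `-80 80 -80 80`, and the repeated `--goal_reaching 1` has no effect.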
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
github: Farama-Foundation |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
--- | ||
name: Bug Report | ||
about: Submit a bug report | ||
title: "[Bug Report] Bug title" | ||
--- | ||
|
||
If you are submitting a bug report, please fill in the following details and use the tag [bug]. | ||
|
||
**Describe the bug** | ||
A clear and concise description of what the bug is. | ||
|
||
**Code example** | ||
Please try to provide a minimal example to reproduce the bug. Error messages and stack traces are also helpful. | ||
|
||
**System Info** | ||
Describe the characteristics of your environment: | ||
* Describe how `D4RL` was installed (pip, docker, source, ...) | ||
* What OS/version of Linux you're using. Note that while we will accept PRs to improve Windows support, we do not officially support it. | ||
* Python version | ||
|
||
**Additional context** | ||
Add any other context about the problem here. | ||
|
||
### Checklist | ||
|
||
- [ ] I have checked that there is no similar [issue](https://github.com/Farama-Foundation/d4rl/issues) in the repo (**required**) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
--- | ||
name: Proposal | ||
about: Propose changes that are not bug fixes | ||
title: "[Proposal] Proposal title" | ||
--- | ||
|
||
### Proposal | ||
|
||
A clear and concise description of the proposal. | ||
|
||
### Motivation | ||
|
||
Please outline the motivation for the proposal. | ||
Is your feature request related to a problem? e.g., "I'm always frustrated when [...]". | ||
If this is related to another GitHub issue, please link here too. | ||
|
||
### Pitch | ||
|
||
A clear and concise description of what you want to happen. | ||
|
||
### Alternatives | ||
|
||
A clear and concise description of any alternative solutions or features you've considered, if any. | ||
|
||
### Additional context | ||
|
||
Add any other context or screenshots about the feature request here. | ||
|
||
### Checklist | ||
|
||
- [ ] I have checked that there is no similar [issue](https://github.com/Farama-Foundation/d4rl/issues) in the repo (**required**) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
--- | ||
name: Question | ||
about: Ask a question | ||
title: "[Question] Question title" | ||
--- | ||
|
||
### Question | ||
|
||
If you're a beginner and have basic questions, please ask on [r/reinforcementlearning](https://www.reddit.com/r/reinforcementlearning/) | ||
or in the [RL Discord](https://discord.com/invite/xhfNqQv) (if you're new please use the beginners channel). | ||
Basic questions that are not bugs or feature requests will be closed without reply, | ||
because GitHub issues are not an appropriate venue for these. | ||
|
||
Advanced/nontrivial questions, especially in areas where documentation is lacking, are very much welcome. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
# Description | ||
|
||
Please include a summary of the change and which issue is fixed. | ||
Please also include relevant motivation and context. | ||
List any dependencies that are required for this change. | ||
|
||
Fixes # (issue) | ||
|
||
## Type of change | ||
|
||
Please delete options that are not relevant. | ||
|
||
- [ ] Bug fix (non-breaking change which fixes an issue) | ||
- [ ] New feature (non-breaking change which adds functionality) | ||
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected) | ||
- [ ] This change requires a documentation update | ||
|
||
### Screenshots | ||
Please attach before and after screenshots of the change if applicable. | ||
|
||
<!-- | ||
Example: | ||
| Before | After | | ||
| ------ | ----- | | ||
| _gif/png before_ | _gif/png after_ | | ||
To upload images to a PR -- simply drag and drop an image while in edit mode, and it should upload the image directly. | ||
You can then paste that source into the above before/after sections. | ||
--> | ||
|
||
# Checklist: | ||
|
||
- [ ] I have run the [`pre-commit` checks](https://pre-commit.com/) with `pre-commit run --all-files` (see `CONTRIBUTING.md` instructions to set it up) | ||
- [ ] I have commented my code, particularly in hard-to-understand areas | ||
- [ ] I have made corresponding changes to the documentation | ||
- [ ] My changes generate no new warnings | ||
- [ ] I have added tests that prove my fix is effective or that my feature works | ||
- [ ] New and existing unit tests pass locally with my changes | ||
|
||
<!-- | ||
As you go through the checklist above, you can mark something as done by putting an x character in it | ||
For example, | ||
- [x] I have done this task | ||
- [ ] I have not done this task | ||
--> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
# Configuration for probot-stale - https://github.com/probot/stale | ||
|
||
# Number of days of inactivity before an Issue or Pull Request becomes stale | ||
daysUntilStale: 60 | ||
|
||
# Number of days of inactivity before an Issue or Pull Request with the stale label is closed. | ||
# Set to false to disable. If disabled, issues still need to be closed manually, but will remain marked as stale. | ||
daysUntilClose: 14 | ||
|
||
# Only issues or pull requests with all of these labels are checked if stale. Defaults to `[]` (disabled) | ||
onlyLabels: | ||
- more-information-needed | ||
|
||
# Issues or Pull Requests with these labels will never be considered stale. Set to `[]` to disable | ||
exemptLabels: | ||
- pinned | ||
- security | ||
- "[Status] Maybe Later" | ||
|
||
# Set to true to ignore issues in a project (defaults to false) | ||
exemptProjects: true | ||
|
||
# Set to true to ignore issues in a milestone (defaults to false) | ||
exemptMilestones: true | ||
|
||
# Set to true to ignore issues with an assignee (defaults to false) | ||
exemptAssignees: true | ||
|
||
# Label to use when marking as stale | ||
staleLabel: stale | ||
|
||
# Comment to post when marking as stale. Set to `false` to disable | ||
markComment: > | ||
This issue has been automatically marked as stale because it has not had | ||
recent activity. It will be closed if no further activity occurs. Thank you | ||
for your contributions. | ||
# Comment to post when removing the stale label. | ||
# unmarkComment: > | ||
# Your comment here. | ||
|
||
# Comment to post when closing a stale Issue or Pull Request. | ||
# closeComment: > | ||
# Your comment here. | ||
|
||
# Limit the number of actions per hour, from 1-30. Default is 30 | ||
limitPerRun: 30 | ||
|
||
# Limit to only `issues` or `pulls` | ||
only: issues | ||
|
||
# Optionally, specify configuration settings that are specific to just 'issues' or 'pulls': | ||
# pulls: | ||
# daysUntilStale: 30 | ||
# markComment: > | ||
# This pull request has been automatically marked as stale because it has not had | ||
# recent activity. It will be closed if no further activity occurs. Thank you | ||
# for your contributions. | ||
|
||
# issues: | ||
# exemptLabels: | ||
# - confirmed |
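As a rough illustration of how the settings above interact, the following sketch encodes one reading of the configuration comments. It is not probot-stale's implementation; the function `next_action` and its parameters are hypothetical names introduced for this example, and details such as exact boundary handling are assumptions.

```python
from datetime import datetime, timedelta
from typing import Set

# Values copied from the configuration above.
DAYS_UNTIL_STALE = 60
DAYS_UNTIL_CLOSE = 14
ONLY_LABELS = {"more-information-needed"}
EXEMPT_LABELS = {"pinned", "security", "[Status] Maybe Later"}
STALE_LABEL = "stale"


def next_action(
    labels: Set[str],
    last_activity: datetime,
    now: datetime,
    is_pull_request: bool = False,
    in_project: bool = False,
    in_milestone: bool = False,
    has_assignee: bool = False,
) -> str:
    """Return 'skip', 'wait', 'mark_stale', or 'close' (hypothetical helper)."""
    # `only: issues` means pull requests are not processed.
    if is_pull_request:
        return "skip"
    # exemptProjects / exemptMilestones / exemptAssignees are all true.
    if in_project or in_milestone or has_assignee:
        return "skip"
    # Any exempt label protects the issue.
    if labels & EXEMPT_LABELS:
        return "skip"
    # The issue must carry all of the `onlyLabels`.
    if not ONLY_LABELS <= labels:
        return "skip"
    idle = now - last_activity
    if STALE_LABEL in labels:
        return "close" if idle >= timedelta(days=DAYS_UNTIL_CLOSE) else "wait"
    return "mark_stale" if idle >= timedelta(days=DAYS_UNTIL_STALE) else "wait"


now = datetime(2024, 1, 1)
print(next_action({"more-information-needed"}, now - timedelta(days=61), now))
```

In this reading, an issue is only ever touched if it has the `more-information-needed` label, carries no exempt label, and is not in a project or milestone and has no assignee.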