Initial commit

heatz123 · Jul 14, 2024 · commit a2a9df7

Showing 3,063 changed files with 629,041 additions and 0 deletions.
diff --git a/LICENSE_garage b/LICENSE_garage
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2019 Reinforcement Learning Working Group
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/README.md b/README.md
@@ -0,0 +1,95 @@
+# TLDR: Temporal Distance-aware Representation for Unsupervised Goal-Conditioned RL
+
+This repository contains the official implementation of [TLDR: Unsupervised Goal-Conditioned RL via Temporal Distance-Aware Representations](https://heatz123.github.io/tldr/) by [Junik Bae](https://heatz123.github.io/), [Kwanyoung Park](https://kwanyoungpark.github.io/) and [Youngwoon Lee](https://youngwoon.github.io/).
+
+## Requirements
+- Python 3.8
+
+## Installation
+
+```
+conda create --name tldr python=3.8
+conda activate tldr
+conda install pytorch==2.0.1 pytorch-cuda=11.8 patchelf -c pytorch -c nvidia
+pip install -r requirements.txt --no-deps
+pip install -e .
+pip install -e garaged
+pip install -e d4rl
+```
+
+## Commands to run experiments
+
+### Ant
+```
+# TLDR:
+python tests/main.py --env ant --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type preset --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo tldr --discrete 0 --dim_option 4 --goal_reaching 1 --eval_plot_axis -80 80 -80 80 --trans_minibatch_size 1024 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
+# METRA:
+python tests/main.py --env ant --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type preset --eval_plot_axis -50 50 -50 50 --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 0 --trans_minibatch_size 1024 --dim_option 2 --description "metra"
+```
+
+### HalfCheetah
+```
+# TLDR:
+python tests/main.py --env half_cheetah --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type preset --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo tldr --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --goal_reaching 1 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
+# METRA:
+python tests/main.py --env half_cheetah --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type preset --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 1 --dim_option 16 --trans_minibatch_size 1024 --description "metra"
+```
+
+### AntMaze-Large
+```
+# TLDR:
+python tests/main.py --env antmaze-large-play --max_path_length 300 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 75 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo tldr --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --goal_reaching 1 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
+# METRA:
+python tests/main.py --env antmaze-large-play --max_path_length 300 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 75 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --description "metra"
+```
+
+### AntMaze-Ultra
+```
+# TLDR:
+python tests/main.py --env antmaze-ultra-play --max_path_length 600 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 150 --n_epochs_per_log 100 --n_epochs_per_eval 500 --n_epochs_per_save 500 --n_epochs_per_pt_save 500 --sac_max_buffer_size 1000000 --algo tldr --goal_reaching 1 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
+# METRA:
+python tests/main.py --env antmaze-ultra-play --max_path_length 600 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 150 --n_epochs_per_log 100 --n_epochs_per_eval 500 --n_epochs_per_save 500 --n_epochs_per_pt_save 500 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --description "metra"
+```
+
+### Quadruped-Escape
+```
+# TLDR:
+python tests/main.py --env dmc_quadruped_state_escape --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo tldr --goal_reaching 1 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --eval_plot_axis -15 15 -15 15 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
+# METRA:
+python tests/main.py --env dmc_quadruped_state_escape --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --eval_plot_axis -15 15 -15 15 --description "metra"
+```
+
+### Humanoid-Run
+```
+# TLDR:
+python tests/main.py --env dmc_humanoid_state --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo tldr --goal_reaching 1 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --eval_plot_axis -60 60 -60 60 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
+# METRA:
+python tests/main.py --env dmc_humanoid_state --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --eval_plot_axis -60 60 -60 60 --description "metra"
+```
+
+### Quadruped (Pixel)
+```
+# TLDR:
+python tests/main.py --env dmc_quadruped --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --video_skip_frames 2 --frame_stack 3 --sac_max_buffer_size 300000 --eval_plot_axis -15 15 -15 15 --algo tldr --goal_reaching 1 --discrete 0 --dim_option 4 --trans_optimization_epochs 200 --n_epochs_per_log 25 --n_epochs_per_eval 125 --n_epochs_per_save 125 --n_epochs_per_pt_save 125 --encoder 1 --sample_cpu 0 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
+# METRA:
+python tests/main.py --env dmc_quadruped --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --video_skip_frames 2 --frame_stack 3 --sac_max_buffer_size 300000 --eval_plot_axis -15 15 -15 15 --algo metra --goal_reaching 0 --discrete 0 --dim_option 4 --trans_optimization_epochs 200 --n_epochs_per_log 25 --n_epochs_per_eval 125 --n_epochs_per_save 125 --n_epochs_per_pt_save 125 --encoder 1 --sample_cpu 0 --description "metra"
+```
+
+### Kitchen (Pixel)
+```
+# TLDR:
+python tests/main.py --env kitchen --max_path_length 50 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --frame_stack 3 --sac_max_buffer_size 100000 --algo tldr --goal_reaching 1 --discrete 0 --dim_option 4 --sac_lr_a -1 --trans_optimization_epochs 100 --n_epochs_per_log 25 --n_epochs_per_eval 250 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --encoder 1 --sample_cpu 0 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
+# METRA:
+python tests/main.py --env kitchen --max_path_length 50 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --num_video_repeats 1 --frame_stack 3 --sac_max_buffer_size 100000 --algo metra --goal_reaching 0 --discrete 1 --dim_option 24 --sac_lr_a -1 --trans_optimization_epochs 100 --n_epochs_per_log 25 --n_epochs_per_eval 250 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --encoder 1 --sample_cpu 0 --description "metra"
+```
+
+
+If you use this code for your research, please consider citing our paper:
+```
+@article{bae2024tldr,
+  title={TLDR: Unsupervised Goal-Conditioned RL via Temporal Distance-Aware Representations},
+  author={Junik Bae and Kwanyoung Park and Youngwoon Lee},
+  journal={arXiv preprint arXiv:2407.08464},
+  year={2024}
+}
+```
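
Note on the README commands: each experiment is one long flag list, which is easy to mistype when varying a single value such as the seed. The sketch below is one possible way to assemble the TLDR Ant command from a dictionary so that every flag appears exactly once. It is not part of the repository: `build_command`, `seed_sweep`, and the seed values `0, 1, 2` are hypothetical, and only the flag names and values are copied from the README (with `--eval_plot_axis` set to `-80 80 -80 80`).

```python
import shlex


def build_command(flags, script="tests/main.py"):
    """Return an argv list for `script`; each dict key is emitted once."""
    argv = ["python", script]
    for name, value in flags.items():
        argv.append(f"--{name}")
        # Some flags take several values, e.g. --eval_plot_axis -80 80 -80 80.
        values = value if isinstance(value, (list, tuple)) else [value]
        argv.extend(str(v) for v in values)
    return argv


# Flag values copied from the README's TLDR Ant command.
ANT = {
    "env": "ant", "max_path_length": 200, "traj_batch_size": 8,
    "n_parallel": 4, "normalizer_type": "preset",
    "trans_optimization_epochs": 50, "n_epochs_per_log": 100,
    "n_epochs_per_eval": 1000, "n_epochs_per_save": 1000,
    "n_epochs_per_pt_save": 1000, "sac_max_buffer_size": 1000000,
    "algo": "tldr", "discrete": 0, "dim_option": 4, "goal_reaching": 1,
    "eval_plot_axis": [-80, 80, -80, 80], "trans_minibatch_size": 1024,
    "num_video_repeats": 24, "dual_lam": 3000,
    "q_layer_normalization": 1, "exp_q_layer_normalization": 1,
    "lr_te": "5e-4", "sac_discount": 0.97,
    "exploration_sac_discount": 0.99, "description": "tldr",
    "n_epochs": 40002,
}


def seed_sweep(flags, seeds):
    """Build one command per seed (the seed list is the caller's choice)."""
    return [build_command({**flags, "seed": seed}) for seed in seeds]


if __name__ == "__main__":
    # Print shell-ready command lines; run them manually or via a scheduler.
    for argv in seed_sweep(ANT, range(3)):
        print(shlex.join(argv))
```

Printing the commands rather than launching them keeps the sketch free of side effects; the output can be pasted into a shell or a job script.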
diff --git a/d4rl/.github/FUNDING.yml b/d4rl/.github/FUNDING.yml
@@ -0,0 +1 @@
+github: Farama-Foundation
diff --git a/d4rl/.github/ISSUE_TEMPLATE/bug.md b/d4rl/.github/ISSUE_TEMPLATE/bug.md
@@ -0,0 +1,26 @@
+---
+name: Bug Report
+about: Submit a bug report
+title: "[Bug Report] Bug title"
+---
+
+If you are submitting a bug report, please fill in the following details and use the tag [bug].
+
+**Describe the bug**
+A clear and concise description of what the bug is.
+
+**Code example**
+Please try to provide a minimal example to reproduce the bug. Error messages and stack traces are also helpful.
+
+**System Info**
+Describe the characteristics of your environment:
+ * Describe how `D4RL` was installed (pip, docker, source, ...)
+ * Which OS and version you are using. Note that while we accept PRs that improve Windows support, Windows is not officially supported.
+ * Python version
+
+**Additional context**
+Add any other context about the problem here.
+
+### Checklist
+
+- [ ] I have checked that there is no similar [issue](https://github.com/Farama-Foundation/d4rl/issues) in the repo (**required**)
diff --git a/d4rl/.github/ISSUE_TEMPLATE/proposal.md b/d4rl/.github/ISSUE_TEMPLATE/proposal.md
@@ -0,0 +1,31 @@
+---
+name: Proposal
+about: Propose changes that are not bug fixes
+title: "[Proposal] Proposal title"
+---
+
+### Proposal 
+
+A clear and concise description of the proposal.
+
+### Motivation
+
+Please outline the motivation for the proposal.
+Is your feature request related to a problem? E.g., "I'm always frustrated when [...]".
+If this is related to another GitHub issue, please link it here too.
+
+### Pitch
+
+A clear and concise description of what you want to happen.
+
+### Alternatives
+
+A clear and concise description of any alternative solutions or features you've considered, if any.
+
+### Additional context
+
+Add any other context or screenshots about the feature request here.
+
+### Checklist
+
+- [ ] I have checked that there is no similar [issue](https://github.com/Farama-Foundation/d4rl/issues) in the repo (**required**)
diff --git a/d4rl/.github/ISSUE_TEMPLATE/question.md b/d4rl/.github/ISSUE_TEMPLATE/question.md
@@ -0,0 +1,14 @@
+---
+name: Question
+about: Ask a question
+title: "[Question] Question title"
+---
+
+### Question
+
+If you're a beginner and have basic questions, please ask on [r/reinforcementlearning](https://www.reddit.com/r/reinforcementlearning/) 
+or in the [RL Discord](https://discord.com/invite/xhfNqQv) (if you're new please use the beginners channel). 
+Basic questions that are not bugs or feature requests will be closed without reply, 
+because GitHub issues are not an appropriate venue for these.
+
+Advanced/nontrivial questions, especially in areas where documentation is lacking, are very much welcome.
diff --git a/d4rl/.github/PULL_REQUEST_TEMPLATE.md b/d4rl/.github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,48 @@
+# Description
+
+Please include a summary of the change and which issue is fixed. 
+Please also include relevant motivation and context. 
+List any dependencies that are required for this change.
+
+Fixes # (issue)
+
+## Type of change
+
+Please delete options that are not relevant.
+
+- [ ] Bug fix (non-breaking change which fixes an issue)
+- [ ] New feature (non-breaking change which adds functionality)
+- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
+- [ ] This change requires a documentation update
+
+### Screenshots
+Please attach before and after screenshots of the change if applicable.
+
+<!--
+Example:
+
+| Before | After |
+| ------ | ----- |
+| _gif/png before_ | _gif/png after_ |
+
+
+To upload images to a PR -- simply drag and drop an image while in edit mode, and it should upload the image directly. 
+You can then paste that source into the above before/after sections.
+-->
+
+# Checklist:
+
+- [ ] I have run the [`pre-commit` checks](https://pre-commit.com/) with `pre-commit run --all-files` (see `CONTRIBUTING.md` instructions to set it up)
+- [ ] I have commented my code, particularly in hard-to-understand areas
+- [ ] I have made corresponding changes to the documentation
+- [ ] My changes generate no new warnings
+- [ ] I have added tests that prove my fix is effective or that my feature works
+- [ ] New and existing unit tests pass locally with my changes
+
+<!--
+As you go through the checklist above, you can mark something as done by putting an x character in it
+
+For example,
+- [x] I have done this task
+- [ ] I have not done this task
+-->
diff --git a/d4rl/.github/stale.yml b/d4rl/.github/stale.yml
@@ -0,0 +1,62 @@
+# Configuration for probot-stale - https://github.com/probot/stale
+
+# Number of days of inactivity before an Issue or Pull Request becomes stale
+daysUntilStale: 60
+
+# Number of days of inactivity before an Issue or Pull Request with the stale label is closed.
+# Set to false to disable. If disabled, issues still need to be closed manually, but will remain marked as stale.
+daysUntilClose: 14
+
+# Only issues or pull requests with all of these labels are checked for staleness. Defaults to `[]` (disabled)
+onlyLabels:
+  - more-information-needed
+
+# Issues or Pull Requests with these labels will never be considered stale. Set to `[]` to disable
+exemptLabels:
+  - pinned
+  - security
+  - "[Status] Maybe Later"
+
+# Set to true to ignore issues in a project (defaults to false)
+exemptProjects: true
+
+# Set to true to ignore issues in a milestone (defaults to false)
+exemptMilestones: true
+
+# Set to true to ignore issues with an assignee (defaults to false)
+exemptAssignees: true
+
+# Label to use when marking as stale
+staleLabel: stale
+
+# Comment to post when marking as stale. Set to `false` to disable
+markComment: >
+  This issue has been automatically marked as stale because it has not had
+  recent activity. It will be closed if no further activity occurs. Thank you
+  for your contributions.
+
+# Comment to post when removing the stale label.
+# unmarkComment: >
+#   Your comment here.
+
+# Comment to post when closing a stale Issue or Pull Request.
+# closeComment: >
+#   Your comment here.
+
+# Limit the number of actions per hour, from 1-30. Default is 30
+limitPerRun: 30
+
+# Limit to only `issues` or `pulls`
+only: issues
+
+# Optionally, specify configuration settings that are specific to just 'issues' or 'pulls':
+# pulls:
+#   daysUntilStale: 30
+#   markComment: >
+#     This pull request has been automatically marked as stale because it has not had
+#     recent activity. It will be closed if no further activity occurs. Thank you
+#     for your contributions.
+
+# issues:
+#   exemptLabels:
+#     - confirmed
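
Note on `stale.yml`: the settings above interact, so a worked timeline may help. The sketch below is a simplified model of the behavior described in the file's own comments, not the probot-stale implementation. The function names are hypothetical, and it assumes no activity occurs after `last_activity`.

```python
from datetime import date, timedelta

# Values copied from d4rl/.github/stale.yml.
DAYS_UNTIL_STALE = 60
DAYS_UNTIL_CLOSE = 14
ONLY_LABELS = {"more-information-needed"}
EXEMPT_LABELS = {"pinned", "security", "[Status] Maybe Later"}


def is_candidate(labels, in_project=False, in_milestone=False, has_assignee=False):
    """Return True if an issue is eligible to be marked stale under this config."""
    labels = set(labels)
    if not ONLY_LABELS <= labels:      # onlyLabels: all listed labels are required
        return False
    if labels & EXEMPT_LABELS:         # exemptLabels: any one of them exempts the issue
        return False
    # exemptProjects, exemptMilestones, and exemptAssignees are all set to true.
    return not (in_project or in_milestone or has_assignee)


def stale_timeline(last_activity):
    """Earliest dates on which an inactive issue is marked stale, then closed."""
    stale_on = last_activity + timedelta(days=DAYS_UNTIL_STALE)
    close_on = stale_on + timedelta(days=DAYS_UNTIL_CLOSE)
    return stale_on, close_on


if __name__ == "__main__":
    print(is_candidate(["more-information-needed"]))
    print(stale_timeline(date(2024, 1, 1)))
```

Because `only: issues` is set, this timeline applies to issues and not to pull requests.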