# Initial commit (a2a9df7)

heatz123 committed Jul 14, 2024. 3,063 changed files with 629,041 additions and 0 deletions (only the first 3,000 changed files are shown).
## LICENSE_garage
MIT License

Copyright (c) 2019 Reinforcement Learning Working Group

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
## README.md
# TLDR: Temporal Distance-aware Representation for Unsupervised Goal-Conditioned RL

This repository contains the official implementation of [TLDR: Unsupervised Goal-Conditioned RL via Temporal Distance-Aware Representations](https://heatz123.github.io/tldr/) by [Junik Bae](https://heatz123.github.io/), [Kwanyoung Park](https://kwanyoungpark.github.io/), and [Youngwoon Lee](https://youngwoon.github.io/).

## Requirements
- Python 3.8

## Installation

```
conda create --name tldr python=3.8
conda activate tldr
conda install pytorch==2.0.1 pytorch-cuda=11.8 patchelf -c pytorch -c nvidia
pip install -r requirements.txt --no-deps
pip install -e .
pip install -e garaged
pip install -e d4rl
```
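After installation, a quick sanity check can confirm that the core dependencies resolve. The script below is a sketch, not part of the repository; the package names are assumptions based on the install commands above.

```python
# Report which core dependencies are importable, without actually importing
# heavy modules (find_spec only locates the package on the path).
import importlib.util

# Assumed top-level package names from the install steps above.
PACKAGES = ["torch", "gym", "d4rl", "garage"]

def check(packages):
    """Return a {package_name: bool} map of importability."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}

if __name__ == "__main__":
    for name, ok in check(PACKAGES).items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```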

## Commands to run experiments

### Ant
```
# TLDR:
python tests/main.py --env ant --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type preset --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo tldr --discrete 0 --dim_option 4 --goal_reaching 1 --eval_plot_axis -80 80 -80 80 --trans_minibatch_size 1024 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
# METRA:
python tests/main.py --env ant --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type preset --eval_plot_axis -50 50 -50 50 --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 0 --trans_minibatch_size 1024 --dim_option 2 --description "metra"
```

### HalfCheetah
```
# TLDR:
python tests/main.py --env half_cheetah --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type preset --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo tldr --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --goal_reaching 1 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
# METRA:
python tests/main.py --env half_cheetah --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type preset --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 1 --dim_option 16 --trans_minibatch_size 1024 --description "metra"
```

### AntMaze-Large
```
# TLDR:
python tests/main.py --env antmaze-large-play --max_path_length 300 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 75 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo tldr --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --goal_reaching 1 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
# METRA:
python tests/main.py --env antmaze-large-play --max_path_length 300 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 75 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --description "metra"
```

### AntMaze-Ultra
```
# TLDR:
python tests/main.py --env antmaze-ultra-play --max_path_length 600 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 150 --n_epochs_per_log 100 --n_epochs_per_eval 500 --n_epochs_per_save 500 --n_epochs_per_pt_save 500 --sac_max_buffer_size 1000000 --algo tldr --goal_reaching 1 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
# METRA:
python tests/main.py --env antmaze-ultra-play --max_path_length 600 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 150 --n_epochs_per_log 100 --n_epochs_per_eval 500 --n_epochs_per_save 500 --n_epochs_per_pt_save 500 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --description "metra"
```

### Quadruped-Escape
```
# TLDR:
python tests/main.py --env dmc_quadruped_state_escape --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo tldr --goal_reaching 1 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --eval_plot_axis -15 15 -15 15 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
# METRA:
python tests/main.py --env dmc_quadruped_state_escape --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --eval_plot_axis -15 15 -15 15 --description "metra"
```

### Humanoid-Run
```
# TLDR:
python tests/main.py --env dmc_humanoid_state --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo tldr --goal_reaching 1 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --eval_plot_axis -60 60 -60 60 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
# METRA:
python tests/main.py --env dmc_humanoid_state --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --trans_optimization_epochs 50 --n_epochs_per_log 100 --n_epochs_per_eval 1000 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --sac_max_buffer_size 1000000 --algo metra --goal_reaching 0 --discrete 0 --dim_option 4 --trans_minibatch_size 1024 --eval_plot_axis -60 60 -60 60 --description "metra"
```

### Quadruped (Pixel)
```
# TLDR:
python tests/main.py --env dmc_quadruped --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --video_skip_frames 2 --frame_stack 3 --sac_max_buffer_size 300000 --eval_plot_axis -15 15 -15 15 --algo tldr --goal_reaching 1 --discrete 0 --dim_option 4 --trans_optimization_epochs 200 --n_epochs_per_log 25 --n_epochs_per_eval 125 --n_epochs_per_save 125 --n_epochs_per_pt_save 125 --encoder 1 --sample_cpu 0 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
# METRA:
python tests/main.py --env dmc_quadruped --max_path_length 200 --seed 0 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --video_skip_frames 2 --frame_stack 3 --sac_max_buffer_size 300000 --eval_plot_axis -15 15 -15 15 --algo metra --goal_reaching 0 --discrete 0 --dim_option 4 --trans_optimization_epochs 200 --n_epochs_per_log 25 --n_epochs_per_eval 125 --n_epochs_per_save 125 --n_epochs_per_pt_save 125 --encoder 1 --sample_cpu 0 --description "metra"
```

### Kitchen (Pixel)
```
# TLDR:
python tests/main.py --env kitchen --max_path_length 50 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --frame_stack 3 --sac_max_buffer_size 100000 --algo tldr --goal_reaching 1 --discrete 0 --dim_option 4 --sac_lr_a -1 --trans_optimization_epochs 100 --n_epochs_per_log 25 --n_epochs_per_eval 250 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --encoder 1 --sample_cpu 0 --num_video_repeats 24 --dual_lam 3000 --q_layer_normalization 1 --exp_q_layer_normalization 1 --lr_te 5e-4 --sac_discount 0.97 --exploration_sac_discount 0.99 --description "tldr" --n_epochs 40002
# METRA:
python tests/main.py --env kitchen --max_path_length 50 --traj_batch_size 8 --n_parallel 4 --normalizer_type off --num_video_repeats 1 --frame_stack 3 --sac_max_buffer_size 100000 --algo metra --goal_reaching 0 --discrete 1 --dim_option 24 --sac_lr_a -1 --trans_optimization_epochs 100 --n_epochs_per_log 25 --n_epochs_per_eval 250 --n_epochs_per_save 1000 --n_epochs_per_pt_save 1000 --encoder 1 --sample_cpu 0 --description "metra"
```
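The commands above share most of their flags and differ only in a handful of environment-specific overrides. As an illustration, a small helper like the following (hypothetical, not provided by the repository) can assemble an invocation from a dict of overrides, which makes seed sweeps less error-prone:

```python
# Assemble a tests/main.py invocation from shared defaults plus overrides.
# The flag names mirror the commands above; the helper itself is an assumption,
# not part of the TLDR codebase.
import shlex

# Flags common to most of the experiment commands in this README.
COMMON = {
    "max_path_length": 200, "seed": 0, "traj_batch_size": 8, "n_parallel": 4,
    "sac_max_buffer_size": 1000000, "trans_minibatch_size": 1024,
}

def build_command(env, algo, **overrides):
    """Return the invocation as a list of argv tokens; overrides win over COMMON."""
    flags = {"env": env, "algo": algo, **COMMON, **overrides}
    argv = ["python", "tests/main.py"]
    for key, value in flags.items():
        argv.append(f"--{key}")
        # Multi-valued flags (e.g. eval_plot_axis) expand to several tokens.
        values = value if isinstance(value, (list, tuple)) else [value]
        argv.extend(str(v) for v in values)
    return argv

cmd = build_command("ant", "tldr", goal_reaching=1, dim_option=4,
                    eval_plot_axis=[-80, 80, -80, 80])
print(shlex.join(cmd))
```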


If you use this code for your research, please consider citing our paper:
```
@article{bae2024tldr,
  title={TLDR: Unsupervised Goal-Conditioned RL via Temporal Distance-Aware Representations},
  author={Junik Bae and Kwanyoung Park and Youngwoon Lee},
  journal={arXiv preprint arXiv:2407.08464},
  year={2024}
}
```
## d4rl/.github/FUNDING.yml
github: Farama-Foundation
## d4rl/.github/ISSUE_TEMPLATE/bug.md
---
name: Bug Report
about: Submit a bug report
title: "[Bug Report] Bug title"
---

If you are submitting a bug report, please fill in the following details and use the tag [bug].

**Describe the bug**
A clear and concise description of what the bug is.

**Code example**
Please try to provide a minimal example to reproduce the bug. Error messages and stack traces are also helpful.

**System Info**
Describe the characteristic of your environment:
* Describe how `D4RL` was installed (pip, docker, source, ...)
* What OS/version of Linux you're using. Note that while we will accept PRs to improve Windows support, we do not officially support it.
* Python version

**Additional context**
Add any other context about the problem here.

### Checklist

- [ ] I have checked that there is no similar [issue](https://github.com/Farama-Foundation/d4rl/issues) in the repo (**required**)
## d4rl/.github/ISSUE_TEMPLATE/proposal.md
---
name: Proposal
about: Propose changes that are not bug fixes
title: "[Proposal] Proposal title"
---

### Proposal

A clear and concise description of the proposal.

### Motivation

Please outline the motivation for the proposal.
Is your feature request related to a problem? e.g., "I'm always frustrated when [...]".
If this is related to another GitHub issue, please link here too.

### Pitch

A clear and concise description of what you want to happen.

### Alternatives

A clear and concise description of any alternative solutions or features you've considered, if any.

### Additional context

Add any other context or screenshots about the feature request here.

### Checklist

- [ ] I have checked that there is no similar [issue](https://github.com/Farama-Foundation/d4rl/issues) in the repo (**required**)
## d4rl/.github/ISSUE_TEMPLATE/question.md
---
name: Question
about: Ask a question
title: "[Question] Question title"
---

### Question

If you're a beginner and have basic questions, please ask on [r/reinforcementlearning](https://www.reddit.com/r/reinforcementlearning/)
or in the [RL Discord](https://discord.com/invite/xhfNqQv) (if you're new please use the beginners channel).
Basic questions that are not bugs or feature requests will be closed without reply,
because GitHub issues are not an appropriate venue for these.

Advanced/nontrivial questions, especially in areas where documentation is lacking, are very much welcome.
## d4rl/.github/PULL_REQUEST_TEMPLATE.md
# Description

Please include a summary of the change and which issue is fixed.
Please also include relevant motivation and context.
List any dependencies that are required for this change.

Fixes # (issue)

## Type of change

Please delete options that are not relevant.

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] This change requires a documentation update

### Screenshots
Please attach before and after screenshots of the change if applicable.

<!--
Example:
| Before | After |
| ------ | ----- |
| _gif/png before_ | _gif/png after_ |
To upload images to a PR -- simply drag and drop an image while in edit mode, and it should upload the image directly.
You can then paste that source into the above before/after sections.
-->

# Checklist:

- [ ] I have run the [`pre-commit` checks](https://pre-commit.com/) with `pre-commit run --all-files` (see `CONTRIBUTING.md` instructions to set it up)
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes

<!--
As you go through the checklist above, you can mark something as done by putting an x character in it
For example,
- [x] I have done this task
- [ ] I have not done this task
-->
## d4rl/.github/stale.yml
# Configuration for probot-stale - https://github.com/probot/stale

# Number of days of inactivity before an Issue or Pull Request becomes stale
daysUntilStale: 60

# Number of days of inactivity before an Issue or Pull Request with the stale label is closed.
# Set to false to disable. If disabled, issues still need to be closed manually, but will remain marked as stale.
daysUntilClose: 14

# Only issues or pull requests with all of these labels are checked for staleness. Defaults to `[]` (disabled)
onlyLabels:
- more-information-needed

# Issues or Pull Requests with these labels will never be considered stale. Set to `[]` to disable
exemptLabels:
- pinned
- security
- "[Status] Maybe Later"

# Set to true to ignore issues in a project (defaults to false)
exemptProjects: true

# Set to true to ignore issues in a milestone (defaults to false)
exemptMilestones: true

# Set to true to ignore issues with an assignee (defaults to false)
exemptAssignees: true

# Label to use when marking as stale
staleLabel: stale

# Comment to post when marking as stale. Set to `false` to disable
markComment: >
This issue has been automatically marked as stale because it has not had
recent activity. It will be closed if no further activity occurs. Thank you
for your contributions.
# Comment to post when removing the stale label.
# unmarkComment: >
# Your comment here.

# Comment to post when closing a stale Issue or Pull Request.
# closeComment: >
# Your comment here.

# Limit the number of actions per hour, from 1-30. Default is 30
limitPerRun: 30

# Limit to only `issues` or `pulls`
only: issues

# Optionally, specify configuration settings that are specific to just 'issues' or 'pulls':
# pulls:
# daysUntilStale: 30
# markComment: >
# This pull request has been automatically marked as stale because it has not had
# recent activity. It will be closed if no further activity occurs. Thank you
# for your contributions.

# issues:
# exemptLabels:
# - confirmed