``gym-dmc``, OpenAI Gym Plugin for DeepMind Control Suite
=========================================================

Links to other OpenAI Gym plugins:

- ``gym-sawyer``
- ``gym-toy-nav``
Update Log
----------

- **2024-03-25**: ``env.render()`` now returns an ``np.ndarray``.
- **2022-01-13**: Added ``space_dtype`` for overriding the dtype of the
  state and action spaces. It defaults to ``None``; set it to ``float``
  or ``np.float32`` for PyTorch SAC implementations.
- **2022-01-11**: Added an ``env._get_obs()`` method so that the
  observation can be obtained after resetting the environment.
  **Version:** ``v0.2.1``
Installation
------------

The ``dm_control`` dependency requires older versions of ``setuptools``
and ``wheel``. Downgrade them first to avoid the installation error:

.. code-block:: shell

   pip install setuptools==65.5.0
   pip install wheel==0.38.4
   pip install gym-dmc
How To Use
----------

Usage pattern:

.. code-block:: python

   import gym

   env = gym.make("dmc:Pendulum-swingup-v1")

For the full list of environments, you can print:

.. code-block:: python

   from dm_control.suite import ALL_TASKS

   print(*ALL_TASKS, sep="\n")
   # ('acrobot', 'swingup')
   # ('acrobot', 'swingup_sparse')
   # ...
We register all of these environments using the following pattern: the
``acrobot`` task ``swingup_sparse`` becomes
``dmc:Acrobot-swingup_sparse-v1``.
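The mapping from ``dm_control``'s ``(domain, task)`` tuples to gym
environment IDs can be sketched as follows. Note that ``dmc_env_id`` is
a hypothetical helper, shown only to illustrate the naming convention,
not a function exported by this package:

.. code-block:: python

   def dmc_env_id(domain, task, version=1):
       # Illustrative only: capitalize the domain name, keep the task
       # name verbatim, and join the parts with hyphens.
       return f"dmc:{domain.capitalize()}-{task}-v{version}"

   print(dmc_env_id("acrobot", "swingup_sparse"))  # dmc:Acrobot-swingup_sparse-v1
   print(dmc_env_id("walker", "walk"))             # dmc:Walker-walk-v1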
You can see the usage pattern in
`specs/test_gym_dmc.py <https://github.com/geyang/gym_dmc/blob/master/specs/test_gym_dmc.py>`__:
.. code-block:: python

   import gym
   import numpy as np

   env = gym.make('dmc:Walker-walk-v1', frame_skip=4, space_dtype=np.float32)
   assert env.action_space.dtype == np.float32
   assert env.observation_space.dtype == np.float32

   env = gym.make('dmc:Walker-walk-v1', frame_skip=4)
   assert env._max_episode_steps == 250
   assert env.reset().shape == (24,)

   env = gym.make('dmc:Walker-walk-v1', from_pixels=True, frame_skip=4)
   assert env._max_episode_steps == 250

   env = gym.make('dmc:Cartpole-balance-v1', from_pixels=True, frame_skip=8)
   assert env._max_episode_steps == 125
   assert env.reset().shape == (3, 84, 84)

   env = gym.make('dmc:Cartpole-balance-v1', from_pixels=True, frame_skip=8,
                  channels_first=False)
   assert env._max_episode_steps == 125
   assert env.reset().shape == (84, 84, 3)

   env = gym.make('dmc:Cartpole-balance-v1', from_pixels=True, frame_skip=8,
                  channels_first=False, gray_scale=True)
   assert env._max_episode_steps == 125
   assert env.reset().shape == (84, 84, 1)
**Note:** ``max_episode_steps`` is calculated from the ``frame_skip``.
All DeepMind Control domains terminate after 1000 simulation steps, so
for ``frame_skip=4`` the ``max_episode_steps`` is 250.
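The relation above amounts to simple integer division. The snippet
below is an illustrative sketch, not code from the library:

.. code-block:: python

   def max_episode_steps(frame_skip, sim_steps=1000):
       # Every DeepMind Control domain runs for 1000 simulation steps;
       # each environment step consumes `frame_skip` of them.
       return sim_steps // frame_skip

   print(max_episode_steps(4))  # 250
   print(max_episode_steps(8))  # 125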
Built with :heart: by Ge Yang