feature(zjow): add Implicit Q-Learning #821

Merged: 9 commits, Jan 27, 2025
363 changes: 363 additions & 0 deletions ding/model/template/qvac.py

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions ding/policy/command_mode_policy_instance.py
@@ -43,6 +43,7 @@

from .d4pg import D4PGPolicy
from .cql import CQLPolicy, DiscreteCQLPolicy
from .iql import IQLPolicy
from .dt import DTPolicy
from .pdqn import PDQNPolicy
from .madqn import MADQNPolicy
@@ -321,6 +322,11 @@ class CQLCommandModePolicy(CQLPolicy, DummyCommandModePolicy):
    pass


@POLICY_REGISTRY.register('iql_command')
class IQLCommandModePolicy(IQLPolicy, DummyCommandModePolicy):
    pass


@POLICY_REGISTRY.register('discrete_cql_command')
class DiscreteCQLCommandModePolicy(DiscreteCQLPolicy, EpsCommandModePolicy):
    pass
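
For readers unfamiliar with the registration pattern in the hunk above, here is a minimal, self-contained sketch of how a decorator-based registry can map a string key such as `'iql_command'` to a policy class. The `Registry` class and its `get` method are hypothetical stand-ins written for illustration, and `IQLPolicy` and `DummyCommandModePolicy` are empty placeholders here. None of this is DI-engine's actual implementation; only the names that appear in the diff are taken from the PR.

```python
from typing import Callable, Dict, Type


class Registry:
    """Hypothetical name -> class registry (illustration only, not DI-engine's own)."""

    def __init__(self) -> None:
        self._classes: Dict[str, Type] = {}

    def register(self, name: str) -> Callable[[Type], Type]:
        """Return a class decorator that stores the class under `name`."""
        def decorator(cls: Type) -> Type:
            if name in self._classes:
                raise KeyError(f"{name!r} is already registered")
            self._classes[name] = cls
            return cls  # return the class unchanged so it stays usable directly
        return decorator

    def get(self, name: str) -> Type:
        """Look up a previously registered class by its string key."""
        return self._classes[name]


POLICY_REGISTRY = Registry()


# Empty placeholders standing in for the real base classes.
class IQLPolicy:
    pass


class DummyCommandModePolicy:
    pass


@POLICY_REGISTRY.register('iql_command')
class IQLCommandModePolicy(IQLPolicy, DummyCommandModePolicy):
    pass


# A config can now refer to the policy by name instead of importing the class.
policy_cls = POLICY_REGISTRY.get('iql_command')
```

In this sketch, registering under a string key lets the class be selected by name at runtime, and the duplicate check guards against two classes silently claiming the same key.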