1.0.0 - High level API, Improved Interfaces and Typing
Release 1.0.0
This release focuses on updating and improving Tianshou internals (in particular, code quality) while creating relatively few breaking changes (apart from things like the python and dependencies' versions).
We view it as a significant step toward making Tianshou the go-to place for both RL researchers and RL practitioners working on industry projects.
This is the first release after the appliedAI Institute (the TransferLab division) has decided to further develop Tianshou and provide long-term support.
Breaking Changes
- dropped support of python<3.11
- dropped support of gym, from now on only Gymnasium envs are supported
- removed functions like `offpolicy_trainer` in favor of `OffpolicyTrainer(...).run()` (this affects all example scripts)
- several breaking changes related to removing `**kwargs` from signatures, renamings of internal attributes (like `critic1` -> `critic`)
- Outputs of training methods are now dataclasses instead of dicts
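The move from trainer functions to trainer objects with a `run()` method, and from dict results to dataclass results, can be illustrated with a small sketch. The names `ToyTrainer` and `TrainingStats` and their fields are hypothetical stand-ins chosen for illustration. They are not Tianshou classes, and the real trainers take different arguments.

```python
from dataclasses import asdict, dataclass


@dataclass
class TrainingStats:
    """Hypothetical result container; field names are illustrative only."""

    best_reward: float
    train_steps: int


class ToyTrainer:
    """Hypothetical stand-in for a class-based trainer with a run() method."""

    def __init__(self, max_epoch: int, steps_per_epoch: int) -> None:
        self.max_epoch = max_epoch
        self.steps_per_epoch = steps_per_epoch

    def run(self) -> TrainingStats:
        # A real trainer would collect data and update a policy here;
        # this toy version only counts steps and fakes a reward.
        total_steps = self.max_epoch * self.steps_per_epoch
        return TrainingStats(best_reward=float(self.max_epoch), train_steps=total_steps)


# Function style (conceptually): result = some_trainer(...); result["best_reward"]
# Class style: construct the trainer, call run(), read typed attributes.
result = ToyTrainer(max_epoch=3, steps_per_epoch=100).run()
print(result.best_reward, result.train_steps)

# Code that still expects a dict can convert explicitly.
legacy_result = asdict(result)
```

With a dataclass result, a misspelled field name fails at attribute access and can be flagged by a type checker, whereas a misspelled dict key is only discovered at runtime.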
Functionality Extensions
Major
- High level interfaces for experiments, demonstrated by the new example scripts with names ending in `_hl.py`
Minor
- Method on the policy to compute an action directly from an observation; can be used for unrolling
- Support for custom keys in ReplayBuffer
- Support for CalQL as part of CQL
- Support for explicit setting of multiprocessing context for SubprocEnvWorker
- `critic2` no longer has to be explicitly constructed and passed if it is supposed to be the same network as `critic` (formerly `critic1`)
Internal Improvements
Build and Docs
- Completely changed the build pipeline. Tianshou now uses poetry, black, ruff, poethepoet, nbqa and other niceties.
- Notebook tutorials are now part of the repository (previously they were in a drive). They were fixed and are executed during the build as integration tests, in addition to serving as documentation. Parts of the content have been improved.
- Documentation is now built with jupyter book. JavaScript code has been slightly improved, JS dependencies are included as part of the repository.
- Many improvements in docstrings
Typing
- Adding `BatchPrototypes` to cover the fields needed and returned by methods relying on batches in a backwards compatible way
- Removing `**kwargs` from policies' constructors
- Overall, much stricter and more correct typing. Removing `kwargs` and replacing dicts by dataclasses in several places.
- Making use of `Generic` to express different kinds of stats that can be returned by `learn` and `update`
- Improved typing in `tests` and `examples`, close to passing mypy
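The use of `Generic` for stats returned by `learn` and `update` can be sketched as follows. This is a minimal illustration of the typing pattern only: `BaseStats`, `ToyLossStats`, `BasePolicy` and `ToyPolicy` are hypothetical names, and the batch is simplified to a list of floats rather than a Tianshou batch.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar


@dataclass
class BaseStats:
    """Hypothetical base class for statistics returned by a learning step."""

    train_time: float = 0.0


@dataclass
class ToyLossStats(BaseStats):
    """Hypothetical stats type for one particular algorithm."""

    loss: float = 0.0


TStats = TypeVar("TStats", bound=BaseStats)


class BasePolicy(Generic[TStats]):
    """Hypothetical policy base class, parameterized by its stats type."""

    def learn(self, batch: list[float]) -> TStats:
        raise NotImplementedError

    def update(self, batch: list[float]) -> TStats:
        # update() delegates to learn(), so both share the same return type.
        return self.learn(batch)


class ToyPolicy(BasePolicy[ToyLossStats]):
    def learn(self, batch: list[float]) -> ToyLossStats:
        # Mean squared value as a placeholder "loss".
        loss = sum(x * x for x in batch) / len(batch)
        return ToyLossStats(loss=loss)


stats = ToyPolicy().update([1.0, 2.0, 3.0])
print(stats.loss)
```

Because `ToyPolicy` binds the type variable to `ToyLossStats`, a type checker knows that `stats.loss` exists without any cast, even though `update` is defined only on the base class.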
General
- Reduced duplication, improved readability and simplified code in several places
- Use `dist.mode` instead of inferring `loc` or `argmax` from the `dist_fn` input
Contributions
The OG creators
- @Trinkle23897 participated in almost all aspects of the coordination and reviewed most of the merged PRs
- @nuance1979 participated in several discussions
From appliedAI
The team working on this release of Tianshou consisted of @opcode81 @MischaPanch @maxhuettenrauch @carlocagnetta @bordeauxred
External contributions
- @BFAnas participated in several discussions and contributed the CalQL implementation, extending the pre-processing logic.
- @dantp-ai fixed many mypy issues and improved the tests
- @arnaujc91 improved the logic of computing deterministic actions
- Many other contributors, among them many new ones, participated in this release. The Tianshou team is very grateful for your contributions!