Release 1.0.0

This release focuses on updating and improving Tianshou internals (in particular, code quality) while introducing relatively few breaking changes (apart from changes such as the required versions of Python and of dependencies).

We view it as a significant step toward making Tianshou the go-to place both for RL researchers and for RL practitioners working on industry projects.

This is the first release after the appliedAI Institute (the TransferLab division) has decided to further develop Tianshou and provide long-term support.

Breaking Changes

Dropped support for Python < 3.11
Dropped support for gym; from now on, only Gymnasium envs are supported
Removed functions like offpolicy_trainer in favor of OffpolicyTrainer(...).run() (this affects all example scripts)
Several breaking changes related to removing **kwargs from signatures and renaming internal attributes (like critic1 -> critic)
Outputs of training methods are now dataclasses instead of dicts
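
To illustrate the migration pattern described in the list above (a trainer class with a run() method, returning a dataclass instead of a dict), here is a minimal sketch. ToyTrainer and TrainingStats, along with their fields, are hypothetical stand-ins written for this example; they are not Tianshou classes, and the real OffpolicyTrainer takes different arguments.

```python
from dataclasses import dataclass


@dataclass
class TrainingStats:
    """Hypothetical result type; the field names are illustrative only."""

    epochs_run: int
    best_reward: float


class ToyTrainer:
    """Hypothetical stand-in for a class-based trainer with a run() method."""

    def __init__(self, rewards_per_epoch: list[float]) -> None:
        self.rewards_per_epoch = rewards_per_epoch

    def run(self) -> TrainingStats:
        # A real trainer would collect data and update a policy here;
        # this toy version only summarizes precomputed rewards.
        return TrainingStats(
            epochs_run=len(self.rewards_per_epoch),
            best_reward=max(self.rewards_per_epoch),
        )


# Old style (function returning a dict):  result["best_reward"]
# New style (class + run() + dataclass):  result.best_reward
result = ToyTrainer([1.0, 3.5, 2.0]).run()
print(result.best_reward)
```

With a dataclass result, a misspelled field name is caught by a type checker or raises AttributeError, whereas a misspelled dict key only fails at lookup time.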

Functionality Extensions

Major

High level interfaces for experiments, demonstrated by the new example scripts with names ending in _hl.py

Minor

Method to compute an action directly from an observation using a policy, which can be used for unrolling
Support for custom keys in ReplayBuffer
Support for CalQL as part of CQL
Support for explicit setting of multiprocessing context for SubprocEnvWorker
critic2 no longer has to be explicitly constructed and passed if it is supposed to be the same network as critic (formerly critic1)
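
As background for the multiprocessing-context item above, the sketch below shows how an explicit context is obtained with Python's standard library. How the context is then handed to SubprocEnvWorker is not stated in these notes, so that part is deliberately left out; make_context is a hypothetical helper name.

```python
import multiprocessing as mp


def make_context(start_method: str = "spawn"):
    """Return a multiprocessing context bound to the given start method.

    Hypothetical helper: it only wraps the standard-library call and
    does not touch Tianshou.
    """
    return mp.get_context(start_method)


# "spawn" is available on all major platforms; "fork" and "forkserver"
# exist only on some of them.
ctx = make_context("spawn")
print(ctx.get_start_method())
```

Objects such as processes, queues and pipes created through ctx use that start method, independently of the process-wide default.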

Internal Improvements

Build and Docs

Completely changed the build pipeline. Tianshou now uses poetry, black, ruff, poethepoet, nbqa and other niceties.
Notebook tutorials are now part of the repository (previously they were in a drive). They were fixed and are executed during the build as integration tests, in addition to serving as documentation. Parts of the content have been improved.
Documentation is now built with jupyter book. JavaScript code has been slightly improved, and JS dependencies are included as part of the repository.
Many improvements in docstrings

Typing

Adding BatchPrototypes to cover the fields needed and returned by methods relying on batches in a backwards compatible way
Removing **kwargs from policies' constructors
Overall, much stricter and more correct typing. Removing kwargs and replacing dicts with dataclasses in several places.
Making use of Generic to express different kinds of stats that can be returned by learn and update
Improved typing in tests and examples, close to passing mypy
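
The use of Generic mentioned in the list above can be sketched as follows. This is an assumption-laden illustration of the typing pattern only: BaseStats, ActorCriticStats, TStats and ToyAlgorithm are hypothetical names and do not reflect Tianshou's actual class hierarchy.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar


@dataclass
class BaseStats:
    """Hypothetical base class for statistics returned by learn()."""

    loss: float


@dataclass
class ActorCriticStats(BaseStats):
    """Hypothetical subclass adding an algorithm-specific field."""

    critic_loss: float


# The type variable is bounded, so every stats type is at least a BaseStats.
TStats = TypeVar("TStats", bound=BaseStats)


class ToyAlgorithm(Generic[TStats]):
    """Parametrized by the kind of stats that learn() returns."""

    def __init__(self, stats: TStats) -> None:
        self._stats = stats

    def learn(self) -> TStats:
        # A real implementation would run gradient steps here.
        return self._stats


algo: ToyAlgorithm[ActorCriticStats] = ToyAlgorithm(
    ActorCriticStats(loss=0.1, critic_loss=0.2)
)
stats = algo.learn()
# A type checker knows that stats has a critic_loss attribute.
print(stats.critic_loss)
```

The benefit is that callers get the concrete stats type from the return annotation without casts or isinstance checks.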

General

Reduced duplication, improved readability and simplified code in several places
Use dist.mode instead of inferring loc or argmax from the dist_fn input
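
The idea behind the dist.mode change can be sketched with toy distributions. ToyNormal, ToyCategorical and deterministic_action are hypothetical names that only mimic a distribution object exposing a mode property; they are not the classes Tianshou uses.

```python
from dataclasses import dataclass


@dataclass
class ToyNormal:
    """Toy continuous distribution; its mode equals its location."""

    loc: float
    scale: float

    @property
    def mode(self) -> float:
        return self.loc


@dataclass
class ToyCategorical:
    """Toy discrete distribution; its mode is the most probable index."""

    probs: list[float]

    @property
    def mode(self) -> int:
        return max(range(len(self.probs)), key=self.probs.__getitem__)


def deterministic_action(dist):
    # No branching on the distribution type: each distribution knows
    # its own mode, so the caller need not pick between loc and argmax.
    return dist.mode


print(deterministic_action(ToyNormal(loc=0.5, scale=1.0)))
print(deterministic_action(ToyCategorical(probs=[0.1, 0.7, 0.2])))
```

Asking the distribution for its mode keeps the deterministic-action logic in one place and makes it work for any distribution that defines the property.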

Contributions

The OG creators

@Trinkle23897 participated in almost all aspects of the coordination and reviewed most of the merged PRs
@nuance1979 participated in several discussions

From appliedAI

The team working on this release of Tianshou consisted of @opcode81 @MischaPanch @maxhuettenrauch @carlocagnetta @bordeauxred

External contributions

@BFAnas participated in several discussions and contributed the CalQL implementation, extending the pre-processing logic.
@dantp-ai fixed many mypy issues and improved the tests
@arnaujc91 improved the logic of computing deterministic actions
Many other contributors, among them many new ones, participated in this release. The Tianshou team is very grateful for your contributions!

1.0.0 - High level API, Improved Interfaces and Typing