Minor improvements to versioning doc and proto comments. (onnx#3449)
Mostly for improved readability.

Biggest changes:
* In the operator versioning section, replace `op_version` with
  `since_version`. `op_version` is not actually a concept in ONNX.
* In the operator set versioning example, remove references to the
  concept of deprecation. The versioning system does not actually have
  a deprecation mechanism. In the example, the operator A is not present
  in operator set version 4. Models that wish to use it will have to
  specify an opset version < 4.
* Clarifying what should be done to model versions
  when accuracy or performance changes. The previous text did not
  include any clear guidance.

Signed-off-by: Gary Miguel <[email protected]>
garymm authored Apr 30, 2021
1 parent 82ddc56 commit 6e4f72b
Showing 7 changed files with 49 additions and 55 deletions.
2 changes: 1 addition & 1 deletion RELEASE-MANAGEMENT.md
@@ -7,7 +7,7 @@ This describes the process by which versions of ONNX are officially released to
Releases
--------

Releases are versioned according to [docs/Versioning.md](docs/Versioning.md). This describes IR and operator versioning policies, as well as propose how models themselves should be versioned.
Releases are versioned according to [ONNX Versioning](docs/Versioning.md). This describes IR and operator versioning policies, as well as propose how models themselves should be versioned.

On a regular basis, new versions of ONNX are published, representing the aggregate of changes in the IR and operator sets. Such releases use semantic versioning to describe the progression of the standard.

62 changes: 28 additions & 34 deletions docs/Versioning.md
@@ -1,44 +1,41 @@
<!--- SPDX-License-Identifier: Apache-2.0 -->

# ONNX versioning
# ONNX Versioning

This document describes the rules for versioning ONNX. Like the rest of the ONNX
specification, MUST, SHOULD et al are used consistent with [RFC2119](https://tools.ietf.org/html/rfc2119).
This document describes the rules for versioning ONNX. MUST, SHOULD et al are used consistent with [RFC2119](https://tools.ietf.org/html/rfc2119).

## Versioning Principles

ONNX defines the versioning policy and mechanism for three classes of entities:

* The abstract model for graphs and operators and the concrete format that represents them. These are always versioned atomically and are referred to as the *IR version*.
* The [intermediate representation (IR) specification](IR.md), which is the abstract model for graphs and operators and the concrete format that represents them. These are always versioned atomically and are referred to as the *IR version*.
* Operator specifications that may be referenced by a given ONNX graph. We refer to this as the *operator version*.
* A defined/trained model that defines a specific graph in terms of specific operators. We refer to this version as the *model version*.
* A defined/trained model that defines a specific graph in terms of specific operators. We refer to this as the *model version*.

The versioning of all three of these entity types is distinct and largely independent. That is, the ONNX IR format evolves at a different rate than the set operators defined by ONNX – in which the former will version much slower than the latter.
The versioning of all three of these entity types is distinct and largely independent. The IR specification evolves at a different (generally slower) rate than the operator specifications. Model versions are entirely independent of the other two versions.

While the versioning mechanisms are clearly specified in this document, specific policies for version management are mandated only for IR version and operator version. For model versioning, they are merely recommendations. For model version, ONNX users and systems MAY follow whichever local customs make sense; however, to facilitate easily managing shared collections of ONNX models, they SHOULD adhere to the policies described under model versioning.
Specific policies for version management are mandated only for IR version and operator version. For model versioning, they are merely recommendations. For model versioning, ONNX users and systems MAY follow whichever local customs make sense; however, to facilitate easily managing shared collections of ONNX models, they SHOULD adhere to the policies described under model versioning.

In addition to versioning ONNX entities, progressive ONNX _releases_ are assigned increasing version numbers. The release versioning scheme is not described as part of the standard itself. It is discussed in the [ONNX release management document](../RELEASE-MANAGEMENT.md).
New IR and operator versions are released as part of ONNX _releases_, which have their own versioning scheme. The release versioning scheme is not described as part of the standard itself. It is discussed in the [ONNX release management document](../RELEASE-MANAGEMENT.md).

### Semantic Versioning or Simple Numbers?

The ONNX versioning system allows for simple monotonically increasing numbers or semantic versioning. For IR and operator sets, versioning is based on simple numbers. For models, ONNX does not proscribe one or the other methodology, but (as stated earlier) recommends a set of shared conventions.
The ONNX versioning system allows for simple monotonically increasing numbers or [semantic versioning (SemVer)](https://semver.org/). For IR and operator sets, versioning is based on simple numbers. For models, ONNX does not require any scheme, but recommends a set of shared conventions.

Which versioning scheme is in use by a model is made clear by inspecting the most significant four bytes, which MUST be non-zero when using semantic versioning and MUST be zero when using simple numbers. In other words, when using semver, at least one of the MAJOR or MINOR numbers must be non-zero.
Which versioning scheme is in use by a model is made clear by inspecting the most significant four bytes, which MUST be non-zero when using semantic versioning and MUST be zero when using simple numbers. In other words, when using SemVer, at least one of the MAJOR or MINOR numbers must be non-zero.
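
As a rough illustration of the check described above, the following Python sketch treats a version as an unsigned 64-bit value and tests its most significant four bytes. The function name `uses_semver` is hypothetical and is not part of any ONNX API.

```python
def uses_semver(version: int) -> bool:
    """Return True if the most significant four bytes of a 64-bit version are non-zero."""
    if not 0 <= version < (1 << 64):
        raise ValueError("version must fit in an unsigned 64-bit integer")
    # Under SemVer the upper 32 bits carry MAJOR and MINOR; for simple numbers they are zero.
    return (version >> 32) != 0


print(uses_semver(7))        # a simple number
print(uses_semver(1 << 32))  # MINOR = 1, so a SemVer value
```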

### SemVer, Files and Frameworks
### SemVer, Files and Consumers

For model and release versioning, ONNX builds on the principles and syntax defined by [SemVer 2.0.0](http://semver.org/spec/v2.0.0.html). Throughout this document, we use the terms *breaking change*, *non-breaking change*, and *patch* consistent with SemVer 2.0.0.

Because ONNX models are serialized files (not APIs), it's worth making clear the relationship between a serialized model and a piece of software that consumes that model. As a rough approximation, the serialized model plays the role of an API's *callee*, while the consumer of the serialized model plays the role of the API's *caller*.

The ONNX versioning principles are based on [Postel's law](https://en.wikipedia.org/wiki/Robustness_principle): be conservative in what you do, be liberal in what you accept from others.
The ONNX versioning principles are based on the [robustness principle](https://en.wikipedia.org/wiki/Robustness_principle): "be conservative in what you do, be liberal in what you accept from others".

1. A producer of a given ONNX model (and the ONNX specification itself) MUST strictly adhere to the rules for breaking vs. non-breaking changes defined in this specification.
2. A consumer of a given ONNX model SHOULD consume an updated ONNX file, provided there are no breaking changes in the new ONNX file's IR version, referenced operator versions, or model version (e.g., the MAJOR version numbers have not changed between the two ONNX files).
2. A consumer of a given ONNX model SHOULD consume an updated ONNX file, provided there are no breaking changes in the new ONNX file's IR version, referenced operator versions, or model version (meaning the MAJOR version numbers have not changed between the two ONNX files).
3. A consumer of a given ONNX model MAY consume an updated ONNX file, provided there are one or more breaking changes in the new ONNX file's IR version, referenced operator versions, or model version.

The operational rules specifying how the ONNX project is managed are documented [here](../RELEASE-MANAGEMENT.md).

### Serializing SemVer version numbers in protobuf

For efficiency, ONNX serializes the MAJOR, MINOR, and PATCH values as a bit-packed 64-bit integer; the two most significant bytes are the MAJOR component, the next two most significant bytes are the MINOR component, and the least significant four bytes are the PATCH component.
@@ -49,25 +46,25 @@ Pre-release and build metadata are not stored in the model.
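
The bit-packed layout described above (two bytes MAJOR, two bytes MINOR, four bytes PATCH) can be sketched in Python as follows. The names `SemVer`, `pack_semver`, and `unpack_semver` are illustrative assumptions, not ONNX library functions, and the sketch works on the unsigned 64-bit view of the value.

```python
from typing import NamedTuple


class SemVer(NamedTuple):
    major: int
    minor: int
    patch: int


def pack_semver(v: SemVer) -> int:
    """Pack MAJOR (16 bits), MINOR (16 bits), and PATCH (32 bits) into one 64-bit integer."""
    if not (0 <= v.major <= 0xFFFF and 0 <= v.minor <= 0xFFFF and 0 <= v.patch <= 0xFFFFFFFF):
        raise ValueError("component out of range for the 16/16/32-bit layout")
    return (v.major << 48) | (v.minor << 32) | v.patch


def unpack_semver(value: int) -> SemVer:
    """Split a packed 64-bit integer back into its three components."""
    return SemVer((value >> 48) & 0xFFFF, (value >> 32) & 0xFFFF, value & 0xFFFFFFFF)


packed = pack_semver(SemVer(1, 2, 345))
print(hex(packed))
print(unpack_semver(packed))
```

Pre-release and build metadata have no place in this layout, which is consistent with the statement that they are not stored in the model.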

## IR versioning

The IR file format is versioned using simple numbers, which MUST be monotonically increasing. Breaking changes to the format or semantics of the ONNX specification require an increment of the version. Non-breaking changes to the IR format do not require changing the version number.
The IR format is versioned using simple numbers, which MUST be monotonically increasing. Breaking changes to the format or semantics of the ONNX specification require an increment of the version. Non-breaking changes to the IR format do not require changing the version number.

NOTE: breaking changes include those that do not alter the serialized binary format, but still break software using libraries that write or read it. For example, changing the spelling of a message property will cause code accessing the property to break.

The ONNX IR format adheres to the versioning guidelines defined in the [Updating a Message Type](https://developers.google.com/protocol-buffers/docs/proto3#updating) section of the proto3 specification.
The IR format adheres to the versioning guidelines defined in the [Updating a Message Type](https://developers.google.com/protocol-buffers/docs/proto3#updating) section of the proto3 specification.

As a general principle, implementations SHOULD be robust in the face of missing fields. However, to ensure basic interoperation, a subset of message fields will be marked as required for a given IR version and all producers MUST set these fields correctly. Required fields MUST always be marked with the comment:

// This field MUST be present for this version of the IR.

For example, the `ModelProto.ir_version` property MUST be present in every model. The ONNX checker (`onnx/checker.py`) will enforce these rules.

Because onnx.proto is expected to be consumed by multiple independent developers, changes to onnx.proto SHOULD NOT break code that depends on generated language bindings (e.g., changing the type of an existing field).
Because the protocol buffer message definitions (.proto / .proto3 files) are expected to be consumed by multiple independent developers, changes to those definitions SHOULD NOT break code that depends on generated language bindings (e.g., changing the type of an existing field).

## Operator versioning

ONNX is defined such that the IR can evolve independently from the set of operators. In ONNX, operators represent both the signature and semantics of a given operation. Operators are abstract interfaces in that they do not imply a specific implementation; rather, they are simply the contract between a model author and the implementations that model may execute on.
The IR can evolve independently from the set of operators. Operators represent both the signature and semantics of a given operation. Operators are abstract interfaces in that they do not imply a specific implementation; rather, they are simply the contract between a model author and the implementations that model may execute on.

A given operator is identified by a three-tuple: `(domain, op_type, and op_version)`. This is written as `domain.op_type:op_version` in prose (e.g., `com.acme.FastConv:3`). Nodes in graphs always refer to operators by their three-part identifier. Breaking operator changes include:
A given operator is identified by a three-tuple: `(domain, op_type, since_version)`, written as `domain.op_type:since_version` in prose (e.g., `com.acme.FastConv:3`). `since_version` is the version of the operator set that introduced the operator. Breaking operator changes include:

* Adding/removing/renaming an attribute. This even includes the case of adding a new optional attribute, where omitting the attribute would imply a default value yielding semantics identical to the previous operator version.

@@ -82,9 +79,7 @@ The following are not breaking:
* Clarifications of specification ambiguities to match prevailing
implementation practice.

If the semantics of an operator or function are changed, you MUST create a new operator; the `op_version` of the new
operator id MUST be greater than any extant `op_version` for the
`domain`.
Changes to the semantics of an operator or function MUST be introduced in a new operator, which MUST be introduced in a new [operator set](#operator-sets).

> In practice, this means that BC-breaking changes in the ONNX
> repository require contributors to follow these steps:
@@ -96,15 +91,15 @@ operator id MUST be greater than any extant `op_version` for the
> 4. Register the new operator in the corresponding `operator_sets`
> header file.
How nodes bind to operator declarations is strictly defined, and are designed to increase model compatibility across ONNX implementations (appealing to the conservative clause of the robustness principle).
How nodes bind to operator declarations is strictly defined, and are designed to increase model compatibility across ONNX implementations, in the spirit of the conservative clause of the robustness principle.

How ONNX implementations bind an operator declaration to specific implementation is outside the scope of this specification. Implementations of ONNX MAY elect to introduce more sophisticated operator declaration/implementation binding modes to appeal to the liberal clause of the robustness principle.
How ONNX implementations bind an operator declaration to a specific implementation is outside the scope of this specification. Implementations of ONNX MAY elect to introduce more sophisticated operator declaration/implementation binding modes, in the spirit of the liberal clause of the robustness principle.

### Operator sets

ONNX uses operator sets to group together immutable operator specifications. An ONNX operator set specifies both the domain of all operators it includes, as well as a version (referred to as the `opset` version). The opset version is largely independent of the version field of the operators it includes. When the inventory of a given operator set changes either by addition or removal, its opset version MUST increase. Moreover, the opset version MUST be no less than the highest operator version number in the set.
ONNX uses operator sets to group together immutable operator specifications. An operator set represents a specific version of a domain, indicated by a pair (domain, version). This represents the set of all operators belonging to the specified domain with the specified version (referred to as the `opset_version`). When the inventory of a given operator set changes either by addition, removal, or a change in semantics of a contained operator, its version MUST increase.

ONNX models declare which operator sets they require as a list of two part operator ids (domain, opset_version). The empty string ("") domain indicates the operators defined as part of the ONNX specification; other domains correspond to operator sets of other vendors (e.g., they can be used to provide vendor-specific extensions to ONNX). The union of the operator sets specified by a given model MUST have a compatible operator declaration for each node in the model's graph.
Models declare which operator sets they require as a list of `(domain, opset_version)` pairs in `ModelProto.opset_import`. The empty string ("") domain indicates the operators defined as part of the ONNX specification; other domains correspond to operator sets of other vendors (meaning they can be used to provide vendor-specific extensions to ONNX). The union of the operator sets specified by a given model MUST have a compatible operator declaration for each node in the model's graph.
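
The requirement that the imported operator sets cover every node can be sketched as a simple lookup. This is a minimal illustration under stated assumptions: the `OPSETS` catalogue, its contents, and the function `missing_declarations` are hypothetical, and the sketch assumes each domain is imported at most once.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical catalogue: (domain, opset_version) -> operator names available in that set.
OPSETS: Dict[Tuple[str, int], Set[str]] = {
    ("", 3): {"A", "B", "C"},
    ("com.acme", 1): {"FastConv"},
}


def missing_declarations(
    opset_import: Dict[str, int], nodes: List[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Return the (domain, op_type) pairs of nodes that no imported operator set declares."""
    missing = []
    for domain, op_type in nodes:
        version = opset_import.get(domain)
        # A node whose domain is not imported at all has no available declarations.
        available = OPSETS.get((domain, version), set()) if version is not None else set()
        if op_type not in available:
            missing.append((domain, op_type))
    return missing


imports = {"": 3, "com.acme": 1}
graph_nodes = [("", "A"), ("com.acme", "FastConv"), ("", "D")]
print(missing_declarations(imports, graph_nodes))
```

A model for which this list is non-empty would violate the MUST in the paragraph above.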

### Example

@@ -117,23 +112,22 @@ OpSet|Operators|Comments
1|{A} | A introduced
2|{A, B} | B introduced
3|{A', B, C} | A updated (to A'), C introduced
4|{B, C'} | A deprecated and removed, C updated (to C')
4|{B, C'} | A removed, C updated (to C')

The operators for a given operator set will have the following `op_version` values:
The operators for a given operator set will have the following `since_version` values:

Operator|OpSet 1|OpSet 2|OpSet 3|OpSet 4
-|-|-|-|-
A|**1** |1 |**3** |**4\***
A|**1** |1 |**3** |**-**
B|- |**2** |2 |2
C|- |- |**3** |**4**

Notes:
- Values that are new or updated from a previous OpSet version are in **bold**.
- \*: the operator is deprecated.
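
The `since_version` table above can be derived mechanically from the operator set history. The following Python sketch does so for the example's four operator sets; the `HISTORY` structure and the function `since_versions` are assumptions made for illustration, not part of ONNX.

```python
from typing import Dict, Set

# For each opset version: operators introduced or updated, and operators removed.
HISTORY: Dict[int, Dict[str, Set[str]]] = {
    1: {"changed": {"A"}, "removed": set()},
    2: {"changed": {"B"}, "removed": set()},
    3: {"changed": {"A", "C"}, "removed": set()},
    4: {"changed": {"C"}, "removed": {"A"}},
}


def since_versions(history: Dict[int, Dict[str, Set[str]]]) -> Dict[int, Dict[str, int]]:
    """Map each opset version to the since_version of every operator it contains."""
    table: Dict[int, Dict[str, int]] = {}
    current: Dict[str, int] = {}
    for version in sorted(history):
        for op in history[version]["changed"]:
            current[op] = version  # introduced or updated in this opset
        for op in history[version]["removed"]:
            current.pop(op, None)  # no longer present from this opset on
        table[version] = dict(current)
    return table


for version, ops in since_versions(HISTORY).items():
    print(version, sorted(ops.items()))
```

Operators untouched by a given opset keep their earlier `since_version`, which is why B stays at 2 in operator sets 3 and 4.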

## Model versioning

Model versioning is ultimately the domain of a given organization. Therefore, this section of the specification is not normative. It simply outlines a set of recommended practices.
This section of the specification is not normative. It simply outlines a set of recommended practices.

Model authors and applications/systems MAY elect to ignore the model versioning mechanism and policy rules. For models that will be shared across developers, teams, or organizations, model authors and applications/systems SHOULD adhere to the following version policies:

@@ -156,11 +150,11 @@ Model authors and applications/systems MAY elect to ignore the model versioning

### Accuracy or performance changes

Assuming that there are no breaking changes to the signature of the model's graph or any operator dependencies, the shape and contents of the graph can change freely provided there are no semantic changes to the model. However, changes to the shape and contents of the graph can impact model accuracy and/or model performance.
Changes that impact accuracy or performance significantly but do not change the model's inputs or outputs SHOULD increment the PATCH version of `ModelProto.model_version`.

## Released Versions

ONNX version|File format version|Opset version ai.onnx|Opset version ai.onnx.ml|Opset version ai.onnx.training
ONNX version|IR version|Opset version ai.onnx|Opset version ai.onnx.ml|Opset version ai.onnx.training
------------|-------------------|---------------------|------------------------|------------------------------
1.0|3|1|1|-
1.1|3|5|1|-
8 changes: 4 additions & 4 deletions onnx/onnx-operators-ml.proto
@@ -40,7 +40,7 @@ message FunctionProto {
// The first version of a function set which contains this function.
// When there's any breaking change for this function, the function set
// contains the function needs to bump its version, and since_version of
// the updated function will be changed to the updated function set version.
// the updated function will be changed to the updated function set version.
optional int64 since_version = 2;

// This field indicates whether the syntax, semantics, or presence
@@ -163,9 +163,9 @@ message OperatorSetProto {
optional string domain = 4;

// The version of the set of operators. This is a simple int value
// that is monotonically increasing as new versions of operator set
// are published. All operators in this set MUST have version
// numbers no greater than opset_version.
// that is monotonically increasing as new versions of the operator set
// are published. All operators in this set MUST have since_version
// <= opset_version.
optional int64 opset_version = 5;

// A human-readable documentation for this set of operators. Markdown is allowed.
8 changes: 4 additions & 4 deletions onnx/onnx-operators-ml.proto3
@@ -40,7 +40,7 @@ message FunctionProto {
// The first version of a function set which contains this function.
// When there's any breaking change for this function, the function set
// contains the function needs to bump its version, and since_version of
// the updated function will be changed to the updated function set version.
// the updated function will be changed to the updated function set version.
int64 since_version = 2;

// This field indicates whether the syntax, semantics, or presence
@@ -163,9 +163,9 @@ message OperatorSetProto {
string domain = 4;

// The version of the set of operators. This is a simple int value
// that is monotonically increasing as new versions of operator set
// are published. All operators in this set MUST have version
// numbers no greater than opset_version.
// that is monotonically increasing as new versions of the operator set
// are published. All operators in this set MUST have since_version
// <= opset_version.
int64 opset_version = 5;

// A human-readable documentation for this set of operators. Markdown is allowed.