Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
doc: fix some grammar errors and refine some expressions
Showing 1 changed file with 39 additions and 47 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,80 +1,72 @@ | ||
# Model Format Specification | ||
|
||
The specification defines an open standard Artifacial Intelligence model. It is defined through the artifact extension based on [the OCI image specification](https://github.com/opencontainers/image-spec/blob/main/spec.md#image-format-specification), and extends model features through `artifactType` and `annotations`. Model storage and distribution can be optimized based on artifact extension. | ||
The specification defines an open standard for packaging and distributing Artificial Intelligence models as OCI artifacts, adhering to [the OCI image specification](https://github.com/opencontainers/image-spec/blob/main/spec.md#image-format-specification). | ||
|
||
The goal of this specification is to package models in an OCI artifact to take advantage of OCI distribution and ensure efficient model deployment. | ||
The goal of this specification is to outline a blueprint and enable the creation of interoperable solutions for packaging and retrieving AI/ML models by leveraging the existing OCI ecosystem, thereby facilitating efficient model management, deployment and serving in cloud-native environments. | ||
|
||
The model specification needs to consider two factors: | ||
## Use Cases | ||
|
||
1. The model needs to be stored in the OCI registry and display the parameters of the model. So that the model should use | ||
the [artifact extension](https://github.com/opencontainers/image-spec/blob/main/artifacts-guidance.md) to | ||
packaging content other than OCI image specification. | ||
2. The model needs to be mounted by the container runtime as | ||
[read only volumes based on the OCI Artifacts in Kubernetes 1.31+](https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/). | ||
Container runtimes can only pull OCI artifact that follow the OCI image specification. | ||
|
||
Therefore, the model specification must be defined through the artifact extension based on the [OCI image specification](https://github.com/opencontainers/image-spec/blob/main/spec.md#image-format-specification). It can be better compatible with the kubernetes ecosystem. | ||
* An OCI registry can store and manage AI/ML model artifacts, with model versions, metadata, and parameters retrievable and displayable. | ||
* A Data Scientist can package models together with their metadata (e.g., format, precision) and upload them to a registry, facilitating collaboration with MLOps Engineers while streamlining the deployment process to efficiently deliver models into production. | ||
* A model serving/deployment platform can read model metadata (e.g., format, precision) from a registry to understand the AI/ML model details, identify the required server runtime | ||
(as well as startup parameters, necessary resources, etc.), and serve the model in Kubernetes by [mounting it directly as a volume source](https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/) | ||
without needing to pre-download it in an init-container or bundle it within the server runtime container. | ||
|
||
## Overview | ||
|
||
The model specification is defined through the artifact extension based on the OCI image specification, and extend model features through `artifactType` and `annotations`. Model storage and distribution can be optimized based on artifact extension. | ||
The model specification follows the [OCI image format specification](https://github.com/opencontainers/image-spec/blob/main/spec.md#image-format-specification) and leverages its [guidelines for artifact usage](https://github.com/opencontainers/image-spec/blob/main/manifest.md#guidelines-for-artifact-usage) for packaging AI/ML models along with their associated metadata and configurations. | ||
|
||
![manifest](./img/manifest.svg) | ||
|
||
## Workflow | ||
## Understanding the Specification | ||
|
||
The model specification running workflow is divided into two stages: `BUILD & PUSH` and `PULL & SERVE`. | ||
The model specification follows the [guidelines for artifacts usage](https://github.com/opencontainers/image-spec/blob/main/manifest.md#guidelines-for-artifact-usage) from the [OCI image format specification](https://github.com/opencontainers/image-spec/blob/main/spec.md#image-format-specification). Specifically, it utilizes the `artifactType` and `annotations` properties to characterize and enrich model artifacts. | ||
|
||
### BUILD & PUSH | ||
### Image Manifest Properties | ||
|
||
Use tools(ORAS, Ollama, etc.) to build required resources in the model repository into artifact based on the model specification. Note that the model layer MUST NOT be compressed, because the files of model weight has been compressed. If the model layer is compressed, the container runtime will cost long time to decompress the model layer. Therefore, it's recommended to use the `application/vnd.oci.image.layer.v1.tar` format for the model layer to avoid compression | ||
- **`artifactType`** _string_ | ||
|
||
Next push the artifact to the OCI registry(Harbor, Docker Hub, etc.), and use the functionalities of the OCI registry to manage the model artifact. | ||
This REQUIRED property MUST be `application/vnd.cnai.model.manifest.v1+json`. | ||
|
||
![build-push](./img/build-and-push.png) | ||
- **`layers`** _array of objects_ | ||
|
||
### PULL & SERVE | ||
- **`mediaType`** _string_ | ||
|
||
The container runtime(containerd, CRI-O, etc) pulls the model artifact from the OCI registry, and mounts the model artifact as a read-only volume. Therefore, distributed model can use the P2P technology(Dragonfly, Kraken, etc) to reduce the pressure on the registry and preheat the model artifact into each node. If the model artifact is already present on the node, the container runtime can reuse the model artifact to mount different containers in the same node. | ||
This REQUIRED property MUST be one of the [OCI Image Media Types](https://github.com/opencontainers/image-spec/blob/main/media-types.md) designated for [layers](https://github.com/opencontainers/image-spec/blob/main/layer.md). | ||
Otherwise, it will not be compatible with the container runtime. | ||
|
||
![pull-serve](./img/pull-and-serve.png) | ||
|
||
## Understanding the Specification | ||
- **`artifactType`** _string_ | ||
|
||
The model specification is based on the [OCI image specification](https://github.com/opencontainers/image-spec/blob/main/spec.md) and focuses on defining the artifact extension according to the [Artifacts Guidance](https://github.com/opencontainers/image-spec/blob/main/artifacts-guidance.md). | ||
Implementations MUST support at least the following media types for this REQUIRED property: | ||
|
||
### Image Manifest Extension Properties | ||
- `application/vnd.cnai.model.layer.v1.tar`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) that contains the model weight file. If the model has multiple weight files, they SHOULD be packaged into separate layers. | ||
- `application/vnd.cnai.model.layer.v1.tar+gzip`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) compressed with [gzip](https://datatracker.ietf.org/doc/html/rfc1952) that contains the model weight file. | ||
If the model has multiple weight files, they SHOULD be packaged in separate layers. | ||
|
||
It is recommended to package weight files without compression to avoid unnecessary overhead of decompression by the container runtime as model weight files are typically already compressed. | ||
- `application/vnd.cnai.model.doc.v1.tar`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) that includes documentation files like `README.md`, `LICENSE`, etc. | ||
- `application/vnd.cnai.model.config.v1.tar`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) that includes additional metadata and configuration files such as `config.json`, `tokenizer.json`, `generation_config.json`, etc. | ||
|
||
- **`artifactType`** _string_ | ||
- **`annotations`** _string-string map_ | ||
|
||
This REQUIRED property MUST contain the media type `application/vnd.cnai.model.manifest.v1+json`. | ||
This OPTIONAL property contains arbitrary metadata for the layer. For metadata specific to models, implementations SHOULD use the predefined annotation keys as outlined in the [Layer Annotation Keys](./annotations.md#layer-annotation-keys). | ||
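
The property rules listed above can be checked mechanically. The following Python sketch is illustrative only and is not part of the specification: the function name `validate_manifest` is hypothetical, the set of accepted OCI layer media types is an assumption (the three distributable types from the OCI media types list), and rejecting unknown layer `artifactType` values is stricter than the "at least" wording in the text. The example manifest omits fields such as `config`, `digest`, and `size` that a real manifest would carry.

```python
# Media types quoted from the property descriptions above.
MANIFEST_ARTIFACT_TYPE = "application/vnd.cnai.model.manifest.v1+json"
MODEL_LAYER_TYPES = {
    "application/vnd.cnai.model.layer.v1.tar",
    "application/vnd.cnai.model.layer.v1.tar+gzip",
    "application/vnd.cnai.model.doc.v1.tar",
    "application/vnd.cnai.model.config.v1.tar",
}
# Assumption: only the distributable OCI layer media types are accepted here.
OCI_LAYER_MEDIA_TYPES = {
    "application/vnd.oci.image.layer.v1.tar",
    "application/vnd.oci.image.layer.v1.tar+gzip",
    "application/vnd.oci.image.layer.v1.tar+zstd",
}


def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no rule was violated."""
    problems = []
    if manifest.get("artifactType") != MANIFEST_ARTIFACT_TYPE:
        problems.append("manifest artifactType must be " + MANIFEST_ARTIFACT_TYPE)
    for index, layer in enumerate(manifest.get("layers", [])):
        if layer.get("mediaType") not in OCI_LAYER_MEDIA_TYPES:
            problems.append(f"layers[{index}].mediaType is not an OCI layer media type")
        if layer.get("artifactType") not in MODEL_LAYER_TYPES:
            problems.append(f"layers[{index}].artifactType is not a known model layer type")
        # annotations is OPTIONAL, but when present it must be a string-string map.
        annotations = layer.get("annotations", {})
        if not all(isinstance(k, str) and isinstance(v, str) for k, v in annotations.items()):
            problems.append(f"layers[{index}].annotations must map strings to strings")
    return problems


# Hypothetical, deliberately incomplete manifest used only to exercise the checks.
example = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "artifactType": MANIFEST_ARTIFACT_TYPE,
    "layers": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar",
            "artifactType": "application/vnd.cnai.model.layer.v1.tar",
        },
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar",
            "artifactType": "application/vnd.cnai.model.doc.v1.tar",
        },
    ],
}
print(validate_manifest(example))
```

A real implementation would likely treat unknown layer types as warnings rather than errors, since the text only requires support for "at least" the listed media types.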
|
||
- **`layers`** _array of objects_ | ||
## Workflow | ||
|
||
- **`mediaType`** _string_ | ||
As the model specification conforms to the [OCI image specification](https://github.com/opencontainers/image-spec/blob/main/layer.md), it naturally aligns with the standard [OCI distribution workflow](https://github.com/opencontainers/distribution-spec/blob/main/spec.md). | ||
|
||
`mediaType` MUST follow the [OCI image specification](https://github.com/opencontainers/image-spec/blob/main/layer.md), because the model needs to be mounted | ||
by the container runtime as [read only volumes based on the OCI Artifacts in Kubernetes 1.31+](https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/). | ||
Container runtimes can only pull OCI artifact that follow the OCI image specification. | ||
This section outlines the typical workflow for a model OCI artifact, which consists of two main stages: `BUILD & PUSH` and `PULL & MOUNT`. | ||
|
||
- **`artifactType`** _string_ | ||
### BUILD & PUSH | ||
|
||
Implementations MUST support at least the following media types: | ||
Build tools can package required resources into an OCI artifact following the model specification. | ||
|
||
- `application/vnd.cnai.model.layer.v1.tar`: The layer is a tarball that contains the model weight file. If the model has multiple weight files, | ||
need to package them in separate layers. | ||
- `application/vnd.cnai.model.layer.v1.tar+gzip`: The layer is a tarball that contains the model weight file and is compressed by gzip. | ||
If the model has multiple weight files, need to package them in separate layers. But recommended package model weight files without compressed to | ||
avoid the container runtime decompressing the model layer. Because the model weight files have been compressed, the container runtime will | ||
cost long time to decompress the model layer. | ||
- `application/vnd.cnai.model.doc.v1.tar`: The layer is a tarball that contains the model documentation file, such as README.md, LICENSE, etc. | ||
- `application/vnd.cnai.model.config.v1.tar`: The layer is a tarball that contains the model configuration file, | ||
such as config.json, tokenizer.json, generation_config.json, etc. | ||
The generated artifact can then be pushed to an OCI registry (e.g., Harbor, DockerHub) for storage and management. | ||
|
||
- **`annotations`** _string-string map_ | ||
![build-push](./img/build-and-push.png) | ||
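
As a rough illustration of the build step, the sketch below packages a single weight file into an uncompressed tar layer and derives a descriptor for it, using only the Python standard library. It is a minimal sketch, not the behavior of any particular build tool: the helper name `build_weight_layer` and the file name `model.safetensors` are hypothetical, and pairing the OCI layer `mediaType` with the model layer `artifactType` follows the property descriptions in this document. Pushing the resulting blob to a registry is left to an OCI client and is not shown.

```python
import hashlib
import io
import tarfile


def build_weight_layer(file_name: str, data: bytes) -> tuple[bytes, dict]:
    """Wrap one weight file in an uncompressed tar archive and describe the blob."""
    buffer = io.BytesIO()
    # mode="w" writes a plain tar archive with no gzip compression, matching the
    # recommendation to leave already-compressed weight files uncompressed.
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name=file_name)
        info.size = len(data)
        archive.addfile(info, io.BytesIO(data))
    blob = buffer.getvalue()
    descriptor = {
        "mediaType": "application/vnd.oci.image.layer.v1.tar",
        "artifactType": "application/vnd.cnai.model.layer.v1.tar",
        # OCI descriptors identify content by digest and size of the stored blob.
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }
    return blob, descriptor


# Placeholder bytes stand in for real model weights.
blob, descriptor = build_weight_layer("model.safetensors", b"\x00" * 16)
print(descriptor["digest"], descriptor["size"])
```

For a model with several weight files, the same helper would be called once per file so that each file ends up in its own layer.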
|
||
This OPTIONAL property contains arbitrary metadata for the layer. For model specification, SHOULD set the pre-defined annotation keys, refer to the [Layer Annotation Keys](./annotations.md#layer-annotation-keys). | ||
### PULL & MOUNT | ||
|
||
- **`annotations`** _string-string map_ | ||
Once the model artifact is stored in an OCI registry, the container runtime (e.g., containerd, CRI-O) can pull it from the OCI registry and mount it as a read-only volume during the model serving process, if required. | ||
|
||
This OPTIONAL property contains arbitrary metadata for the image manifest. For model specification, SHOULD set the pre-defined annotation keys, refer to the [Manifest Annotation Keys](./annotations.md#manifest-annotation-keys). | ||
![pull-serve](./img/pull-and-serve.png) |
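
The mount step can be pictured with the Kubernetes image volume source referenced earlier in this document. The Python sketch below only generates a Pod manifest as JSON, which `kubectl` accepts alongside YAML; it does not talk to a cluster. The helper name `model_pod`, the registry host `registry.example.com`, and both image references are hypothetical. It also assumes a cluster and container runtime where image volumes are enabled (in Kubernetes 1.31 the feature was alpha and gated).

```python
import json


def model_pod(name: str, server_image: str, model_reference: str,
              mount_path: str = "/models") -> dict:
    """Build a Pod manifest that mounts a model artifact as a read-only image volume."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": "server",
                    "image": server_image,
                    "volumeMounts": [
                        {"name": "model", "mountPath": mount_path, "readOnly": True}
                    ],
                }
            ],
            "volumes": [
                {
                    "name": "model",
                    # The image volume source pulls the OCI artifact by reference.
                    "image": {"reference": model_reference, "pullPolicy": "IfNotPresent"},
                }
            ],
        },
    }


pod = model_pod(
    "model-server",
    "registry.example.com/serving/runtime:latest",  # hypothetical serving image
    "registry.example.com/models/demo:v1",          # hypothetical model artifact
)
print(json.dumps(pod, indent=2))
```

With this layout the serving container reads the model files from the mount path, so no init-container download or bundling into the runtime image is needed.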