forked from opendatahub-io/kserve
Commit
Update KServe 2023 Roadmap (kserve#2526)
* Update ROADMAP.md
  Signed-off-by: Dan Sun <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Alexa Griffith <[email protected]>
  Signed-off-by: Dan Sun <[email protected]>
* Address comments
  Signed-off-by: Dan Sun <[email protected]>

Signed-off-by: Dan Sun <[email protected]>
Co-authored-by: Alexa Griffith <[email protected]>
1 parent 47a4e36, commit d7778ef
Showing 1 changed file with 62 additions and 25 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,25 +1,62 @@ | ||
# KServe Roadmap | ||
## 2021 Q4/2022 Q1 | ||
|
||
### Kubernetes Deployment | ||
Objective: "Enable raw kubernetes deployment as alternative mode" | ||
* Support existing ML frameworks, transformer/explainer, logger and batching | ||
* Make Istio/KNative optional and unlock KNative limitations | ||
* Allow multiple volumes mounted | ||
* Allow TCP/UDP | ||
|
||
### Inference Graph | ||
Objective: "Enable model serving pipelines with flexible routing graph" | ||
* Inference Router | ||
* Model Experimentation. | ||
* Ensembling. | ||
* Multi Arm Bandit. | ||
* Pipeline | ||
Proposal: https://docs.google.com/document/d/1rV8kI_40oiv8jMhY_LwkkyKLdOwSI1Qda-Dc6Dgjz1g | ||
|
||
### ModelMesh | ||
Objective: "Unifying interface for SingleModel and ModelMesh deployment" | ||
* Ability to perform inference using Predict v2 API with REST/gRPC | ||
* Unify the storage support for single and ModelMesh | ||
* InferenceService controller to utilize ServingRuntime | ||
* Single install for KServe which includes SingleModel and ModelMesh Serving | ||
# KServe 2023 Roadmap | ||
|
||
## Objective: "Graduate core inference capability to stable/GA" | ||
- Promote `InferenceService` and `ClusterServingRuntime`/`ServingRuntime` CRD from v1beta1 to v1 | ||
* Improve `InferenceService` CRD for REST/gRPC protocol interface | ||
* Unify model storage spec and implementation between KServe and ModelMesh | ||
* Add Status to `ServingRuntime` for both ModelMesh and KServe, surface `ServingRuntime` validation errors and deployment status | ||
* Deprecate `TrainedModel` CRD and use `InferenceService` annotation to allow dynamic model updates as an alternative to the storage initializer | |
* Collocate transformer and predictor in the pod to reduce sidecar resources and networking latency | ||
* Stabilize `RawDeployment` mode with comprehensive testing for supported features | |
|
||
- All model formats to support v2 inference protocol including custom serving runtime | ||
* TorchServe to support v2 gRPC inference protocol | ||
* Support batching for v2 inference protocol | ||
* Transformer and Explainer v2 inference protocol interoperability | ||
* Improve codec for v2 inference protocol | ||
|
||
Reference: [Control plane issues](https://github.com/kserve/kserve/issues?q=is%3Aissue+is%3Aopen+label%3Akserve%2Fcontrol-plane), [Data plane issues](https://github.com/kserve/kserve/issues?q=is%3Aissue+is%3Aopen+label%3Akfserving%2Fdata-plane), [Serving Runtime issues](https://github.com/kserve/kserve/issues?q=is%3Aissue+is%3Aopen+label%3Akserve%2Fservingruntime). | |
|
||
## Objective: "Graduate KServe Python SDK to 1.0" | |
|
||
- Improve KServe Python SDK dependency management with Poetry | ||
- Create standardized model packaging API | |
- Improve KServe model server observability with metrics and distributed tracing | |
- Support batch inference | ||
|
||
Reference: [Python SDK issues](https://github.com/kserve/kserve/issues?q=is%3Aissue+is%3Aopen+label%3Akserve%2Fsdk), [Storage issues](https://github.com/kserve/kserve/issues?q=is%3Aissue+is%3Aopen+label%3Akfserving%2Fstorage) | |
|
||
## Objective: "Graduate ModelMesh to beta" | ||
- Support TorchServe ServingRuntime | ||
- Add PVC support and unify storage implementation with KServe | ||
- Add optional ingress for ModelMesh deployments | ||
- Etcd secret security for multi-namespace mode | ||
- Add estimated model size field | ||
|
||
Reference: [ModelMesh issues](https://github.com/kserve/modelmesh-serving/issues?page=1&q=is%3Aissue+is%3Aopen) | ||
|
||
## Objective: "Graduate InferenceGraph to beta" | ||
- Improve `InferenceGraph` spec for replica and concurrency control | ||
- Allow setting resource limits per `InferenceGraph` | ||
- Support distributed tracing | ||
- Support gRPC for `InferenceGraph` | ||
- Standalone `Transformer` support for `InferenceGraph` | ||
- Support traffic mirroring node | ||
- Support `RawDeployment` mode for `InferenceGraph` | ||
|
||
Reference: [InferenceGraph issues](https://github.com/kserve/kserve/issues?q=is%3Aissue+is%3Aopen+label%3Akserve%2Finference_graph) | ||
|
||
## Objective: "Secure InferenceService" | ||
- Document KServe ServiceMesh setup with mTLS | ||
- Support programmatic authentication token | ||
- Implement per service level auth | ||
- Add support for SPIFFE/SPIRE identity integration with `InferenceService` | ||
|
||
Reference: [Auth related issues](https://github.com/kserve/kserve/issues?q=is%3Aissue+is%3Aopen+label%3Akserve%2Fauth) | ||
|
||
## Objective: "KServe 1.0 documentation" | ||
- Add ModelMesh docs and explain the use cases for classic KServe and ModelMesh | ||
- Unify the data plane v1 and v2 page formats | ||
- Improve v2 data plane docs to explain what changed from v1 and why | |
- Clean up the examples in kserve repo and unify them with the website's by creating one source of truth for example documentation | ||
- Update any out-of-date documentation and make sure the website as a whole is consistent and cohesive |
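
The roadmap refers several times to the v1 and v2 data plane (inference) protocols without showing how they differ. As a rough illustration only, the sketch below builds the request path and JSON body for each protocol as they are commonly documented: v1 posts `instances` to `:predict`, while v2 posts named, typed tensors to `/infer`. The helper names, the model name `sklearn-iris`, and the tensor name `input-0` are hypothetical and chosen for this example; they are not taken from the commit.

```python
import json


def v1_predict_request(model_name, instances):
    """Return (path, body) for a v1 data plane predict call."""
    return f"/v1/models/{model_name}:predict", {"instances": instances}


def v2_infer_request(model_name, tensor_name, rows, datatype="FP32"):
    """Return (path, body) for a v2 infer call.

    `rows` is assumed to be a rectangular 2-D list; v2 carries the
    shape explicitly and the data as a flattened, row-major list.
    """
    shape = [len(rows), len(rows[0]) if rows else 0]
    flat = [value for row in rows for value in row]
    body = {
        "inputs": [
            {
                "name": tensor_name,  # hypothetical tensor name
                "shape": shape,
                "datatype": datatype,
                "data": flat,
            }
        ]
    }
    return f"/v2/models/{model_name}/infer", body


if __name__ == "__main__":
    rows = [[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]]
    requests_to_show = (
        v1_predict_request("sklearn-iris", rows),
        v2_infer_request("sklearn-iris", "input-0", rows),
    )
    for path, body in requests_to_show:
        print(path)
        print(json.dumps(body))
```

The explicit `shape` and `datatype` fields are what let a v2 server decode a request without model-specific assumptions, which is presumably why the roadmap groups codec, batching, and transformer/explainer interoperability work under the v2 protocol.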