-
Notifications
You must be signed in to change notification settings - Fork 471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds InferenceExtension to GatewayParameters #10601
base: main
Are you sure you want to change the base?
Conversation
// InferenceExtension defines the desired state of the Gateway API inference extension. | ||
// For additional details, see: https://gateway-api-inference-extension.sigs.k8s.io/. | ||
// | ||
// InferenceExtension can only be specified when GatewayParameters is referenced by a Gateway. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: The inferenceExtension
field is only supported when the GatewayParameters is referenced by a Gateway until #10594 is resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we specify on a gateway class, why can't a gateway "inherit" it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yuval-k thanks for the review and feedback. Since multiple Gateways can reference a GatewayClass the EndpointPickerExtension API would need to associate a Gateway to an InferencePool. This can be achieved using a list map keyed on the Gateway namespaced name. For example:
type InferenceExtension struct {
// +listType=map
// +listMapKey=gatewayNsName
// +kubebuilder:validation:MaxItems=8
EndpointPickers []EndpointPickerExtension `json:"endpointPickers"`
}
Each EndpointPickerExtension must now define gatewayNsName: <GW_NS>/<GW_NAME>
. This is acceptable when GatewayParameters is referenced by a GatewayClass but is overly verbose when GatewayParameters is referenced by a Gateway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to associate Gateway to an InferencePool? i thought InferencePool represents the upstream?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An extension is associated to an InferencePool:
apiVersion: apps/v1
kind: Deployment
metadata:
name: inference-gateway-ext-proc
...
containers:
- name: inference-gateway-ext-proc
image: us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/epp:main
args:
- -poolName
- "test-infer-pool"
InferencePool does represent the upstream. The implementation can reconcile HTTPRoutes with an InferencePool backendRef and use the connection configuration from the InferencePool to configure the parentRefs Gateways. I was trying to be explicit with the initial implementation, e.g. associate an extension with a Gateway. I'll work through the implementation without the additional verbosity and see if I hit any roadblocks.
@@ -612,6 +620,52 @@ type AiExtension struct { | |||
Stats *AiExtensionStats `json:"stats,omitempty"` | |||
} | |||
|
|||
// InferenceExtension defines the desired state of the Gateway API inference extension. | |||
// For additional details, see: https://gateway-api-inference-extension.sigs.k8s.io/. | |||
type InferenceExtension struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: Additional extensions will be introduced in future releases of Gateway API inference extension.
Do we need to add this field to the default GW params object in install/helm/kgateway? |
// EndpointPickers defines a list of EndpointPicker extensions. | ||
// | ||
// +kubebuilder:validation:MaxItems=8 | ||
EndpointPickers []EndpointPickerExtension `json:"endpointPickers"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
omitempty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
omitempty
should only be used with optional fields and this field is required (xref).
// this GatewayParameters. | ||
// | ||
// +kubebuilder:validation:Required | ||
PoolRef InferencePoolObjRef `json:"poolRef"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be a pointer?
also, omitempty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pointers should only be used for optional fields (xref).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wanted to clarify for myself. Optional fields in our apis have to have:
- a pointer type in the Go definition or have a built-in nil value (so maps and lists are fine, ie.
Priorities []Priority
json:"priorities,omitempty"` is okay). - omitempty set
How is the They have the +optional comment tag in Go.
used? https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#optional-vs-required Do we want to start adding the +optional
comment tag to non-required fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And then for required fields, do we want to use // +kubebuilder:validation:Required
or just:
- Make the field not have the omitempty
- Make it not a pointer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's good to have our APIs follow k8s API conventions and use +optional for optional fields.
+kubebuilder:validation:Required can be used for required fields but just to be explicit since this is default behavior.
// +kubebuilder:default="inference.networking.x-k8s.io" | ||
// +kubebuilder:validation:MaxLength=253 | ||
// +kubebuilder:validation:Pattern=`^$|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$` | ||
Group string `json:"group,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this accept arbitrary gvks, or only InferencePools?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only InferencePools but it's idiomatic to use Group/Kind (with defaults) for xReference APIs. This approach also provides the option (if needed) to use a kgateway CRD to represent the Endpoint Picker extension.
Known issue that's being tracked in #10568. |
// this GatewayParameters. | ||
// | ||
// +kubebuilder:validation:Required | ||
PoolRef InferencePoolObjRef `json:"poolRef"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought InferencePool is meant to be used as a Backend on an http route? why do we need it in the gateway params?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The EndpointPicker
represents a single instance of the Endpoint Picker extension. The Gateway uses the PoolRef
to discover the extension's configuration, e.g. how to connect to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not sure why this is on the gateway params? what's the difference between the PoolRef here and the one on the HttpRoute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #10601 (comment).
Yes, but the Helm install will be a separate PR. I wanted this PR to focus purely on the GatewayParameter changes required to introduce Inference Extension. |
5ed7c30
to
110b220
Compare
@yuval-k @jenshu @timflannagan I rebased and responded to your comments, PTAL. |
Marking as "draft" as I work through the initial implementation based on #10601 (comment). |
110b220
to
593313a
Compare
593313a
to
0c5bfa7
Compare
Signed-off-by: Daneyon Hansen <[email protected]>
0c5bfa7
to
698b433
Compare
/hold since #10676 does not require any input to enable inference extension functionality. Instead, the controller and deployer are enabled if the InferencePool CRD exists. |
Co-authored-by: Nathan Fudenberg <[email protected]> Co-authored-by: changelog-bot <changelog-bot>
Description
Introduces initial Gateway API Inference Extension support to GatewayParameters by adding a list of EndpointPicker extensions. Each EndpointPicker extension references an InferencePool that can be used as a
backendRef
for managed HTTPRoutes.Note: Until #10594 is resolved, the
inferenceExtension
field is only supported when GatewayParameters is referenced by a Gateway.Supports #10411
API changes
Adds InferenceExtension to KubernetesProxyConfig.
Code changes
All code changes are from
make generate-all
CI changes
N/A
Docs changes
The
generate-crd-reference-docs
target is broken so this PR doe snot include CRD doc changes. @artberger does this target now live in a separate repo?Context
See #10411 for additional context.
Interesting decisions
N/A
Testing steps
make go-test
Notes for reviewers
Checklist: