Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds InferenceExtension to GatewayParameters #10601

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

danehans
Copy link
Contributor

Description

Introduces initial Gateway API Inference Extension support to GatewayParameters by adding a list of EndpointPicker extensions. Each EndpointPicker extension references an InferencePool that can be used as a backendRef for managed HTTPRoutes.

Note: Until #10594 is resolved, the inferenceExtension field is only supported when GatewayParameters is referenced by a Gateway.

Supports #10411

API changes

Adds InferenceExtension to KubernetesProxyConfig.

Code changes

All code changes are from make generate-all

CI changes

N/A

Docs changes

The generate-crd-reference-docs target is broken so this PR doe snot include CRD doc changes. @artberger does this target now live in a separate repo?

Context

See #10411 for additional context.

Interesting decisions

N/A

Testing steps

make go-test

Notes for reviewers

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works

// InferenceExtension defines the desired state of the Gateway API inference extension.
// For additional details, see: https://gateway-api-inference-extension.sigs.k8s.io/.
//
// InferenceExtension can only be specified when GatewayParameters is referenced by a Gateway.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: The inferenceExtension field is only supported when the GatewayParameters is referenced by a Gateway until #10594 is resolved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we specify on a gateway class, why can't a gateway "inherit" it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuval-k thanks for the review and feedback. Since multiple Gateways can reference a GatewayClass the EndpointPickerExtension API would need to associate a Gateway to an InferencePool. This can be achieved using a list map keyed on the Gateway namespaced name. For example:

type InferenceExtension struct {
        // +listType=map
        // +listMapKey=gatewayNsName
	// +kubebuilder:validation:MaxItems=8
	EndpointPickers []EndpointPickerExtension `json:"endpointPickers"`
}

Each EndpointPickerExtension must now define gatewayNsName: <GW_NS>/<GW_NAME>. This is acceptable when GatewayParameters is referenced by a GatewayClass but is overly verbose when GatewayParameters is referenced by a Gateway.

Copy link
Contributor

@yuval-k yuval-k Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to associate Gateway to an InferencePool? i thought InferencePool represents the upstream?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An extension is associated to an InferencePool:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-gateway-ext-proc
...
      containers:
      - name: inference-gateway-ext-proc
        image: us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/epp:main
        args:
        - -poolName
        - "test-infer-pool"

InferencePool does represent the upstream. The implementation can reconcile HTTPRoutes with an InferencePool backendRef and use the connection configuration from the InferencePool to configure the parentRefs Gateways. I was trying to be explicit with the initial implementation, e.g. associate an extension with a Gateway. I'll work through the implementation without the additional verbosity and see if I hit any roadblocks.

@@ -612,6 +620,52 @@ type AiExtension struct {
Stats *AiExtensionStats `json:"stats,omitempty"`
}

// InferenceExtension defines the desired state of the Gateway API inference extension.
// For additional details, see: https://gateway-api-inference-extension.sigs.k8s.io/.
type InferenceExtension struct {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: Additional extensions will be introduced in future releases of Gateway API inference extension.

@timflannagan
Copy link
Member

Do we need to add this field to the default GW params object in install/helm/kgateway?

// EndpointPickers defines a list of EndpointPicker extensions.
//
// +kubebuilder:validation:MaxItems=8
EndpointPickers []EndpointPickerExtension `json:"endpointPickers"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

omitempty?

Copy link
Contributor Author

@danehans danehans Feb 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

omitempty should only be used with optional fields and this field is required (xref).

// this GatewayParameters.
//
// +kubebuilder:validation:Required
PoolRef InferencePoolObjRef `json:"poolRef"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be a pointer?

also, omitempty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pointers should only be used for optional fields (xref).

Copy link
Contributor

@npolshakova npolshakova Feb 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanted to clarify for myself. Optional fields in our apis have to have:

  • a pointer type in the Go definition or have a built-in nil value (so maps and lists are fine, ie. Priorities []Priority json:"priorities,omitempty"` is okay).
  • omitempty set

How is the They have the +optional comment tag in Go. used? https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#optional-vs-required Do we want to start adding the +optional comment tag to non-required fields?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then for required fields, do we want to use // +kubebuilder:validation:Required or just:

  • Make the field not have the omitempty
  • Make it not a pointer

Copy link
Contributor Author

@danehans danehans Feb 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's good to have our APIs follow k8s API conventions and use +optional for optional fields.

+kubebuilder:validation:Required can be used for required fields but just to be explicit since this is default behavior.

// +kubebuilder:default="inference.networking.x-k8s.io"
// +kubebuilder:validation:MaxLength=253
// +kubebuilder:validation:Pattern=`^$|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`
Group string `json:"group,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this accept arbitrary gvks, or only InferencePools?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only InferencePools but it's idiomatic to use Group/Kind (with defaults) for xReference APIs. This approach also provides the option (if needed) to use a kgateway CRD to represent the Endpoint Picker extension.

@timflannagan
Copy link
Member

The generate-crd-reference-docs target is broken so this PR doe snot include CRD doc changes. @artberger does this target now live in a separate repo?

Known issue that's being tracked in #10568.

// this GatewayParameters.
//
// +kubebuilder:validation:Required
PoolRef InferencePoolObjRef `json:"poolRef"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought InferencePool is meant to be used as a Backend on an http route? why do we need it in the gateway params?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The EndpointPicker represents a single instance of the Endpoint Picker extension. The Gateway uses the PoolRef to discover the extension's configuration, e.g. how to connect to it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not sure why this is on the gateway params? what's the difference between the PoolRef here and the one on the HttpRoute?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danehans
Copy link
Contributor Author

Do we need to add this field to the default GW params object in install/helm/kgateway?

Yes, but the Helm install will be a separate PR. I wanted this PR to focus purely on the GatewayParameter changes required to introduce Inference Extension.

@danehans
Copy link
Contributor Author

@yuval-k @jenshu @timflannagan I rebased and responded to your comments, PTAL.

@danehans
Copy link
Contributor Author

Marking as "draft" as I work through the initial implementation based on #10601 (comment).

@danehans danehans marked this pull request as draft February 12, 2025 17:18
@danehans danehans marked this pull request as ready for review February 18, 2025 18:41
@danehans
Copy link
Contributor Author

@yuval-k commit 593313a refactors the API to enable/disable the EPP extension.

@danehans
Copy link
Contributor Author

/hold since #10676 does not require any input to enable inference extension functionality. Instead, the controller and deployer are enabled if the InferencePool CRD exists.

@danehans danehans marked this pull request as draft February 22, 2025 19:34
stevenctl pushed a commit to stevenctl/gloo that referenced this pull request Feb 24, 2025
Co-authored-by: Nathan Fudenberg <[email protected]>
Co-authored-by: changelog-bot <changelog-bot>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants