-
Notifications
You must be signed in to change notification settings - Fork 471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial Inference Extension Plugin #10684
base: main
Are you sure you want to change the base?
Conversation
Note: To pass CI, this PR includes a few identical commits as #10676 for go mod bump, code generations, etc. |
bd4d69b
to
b981485
Compare
}, krtOpts.ToOptions("EndpointPickerUpstreams")...) | ||
|
||
// Create the endpoints collection | ||
inputs := krtcollections.NewInfPoolEndpointsInputs(krtOpts, infPoolUpstream, pods) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think you can move this collection to this package; the original was there because of how it worked in ggv2
out.CircuitBreakers = &envoy_config_cluster_v3.CircuitBreakers{ | ||
Thresholds: []*envoy_config_cluster_v3.CircuitBreakers_Thresholds{ | ||
{ | ||
MaxConnections: wrapperspb.UInt32(40000), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is 40000 coming from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UseHttpHeader: true, | ||
HttpHeaderName: "x-gateway-destination-endpoint", | ||
} | ||
anyLbConfig, err := anypb.New(lbConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use our utils.MessageToAny
to do data-plane anypb serialization, as they set deterministic to true
.
we should make sure other places in the code do that too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created #10693 to make this change in other areas of the code and will update in the endpointpicker plugin.
poolCfg.OurPool = func(pool *infextv1a1.InferencePool) bool { | ||
// List HTTPRoutes in the same namespace. | ||
var routes apiv1.HTTPRouteList | ||
if err := poolCfg.Mgr.GetClient().List(ctx, &routes, client.InNamespace(pool.Namespace)); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems that our onwership of inference pool is dynamic - i.e. if an httproute we own routes to it?
is this temporary until we pass in a controller field to the pool spec?
if this is the way we want to solve this going forward, let's make this function O(1) by using indexing (as i imagine would be called frequently)
For(&infextv1a1.InferencePool{}, builder.WithPredicates( | ||
predicate.NewPredicateFuncs(func(object client.Object) bool { | ||
if pool, ok := object.(*infextv1a1.InferencePool); ok { | ||
return c.poolCfg.OurPool(pool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OurPool is doing uses the client - i'm not sure you should do that (or any other I/O) from a predicate function. i believe there is no guarantee that the client semantics are kept when predicates are called.
Namespace: pool.Namespace, | ||
Name: pool.Name, | ||
}, | ||
Obj: pool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think you need to include ObjIR
here
4fd238f
to
a8af675
Compare
Signed-off-by: Daneyon Hansen <[email protected]>
Signed-off-by: Daneyon Hansen <[email protected]>
a8af675
to
3539224
Compare
Signed-off-by: Daneyon Hansen <[email protected]>
3539224
to
215d19c
Compare
Description
Adds initial support for an inference extension endpoint picker plugin. The plugin will:
API changes
N/A
Code changes
CI changes
N/A
Docs changes
Godocs added throughout. User docs will be added in a future PR.
Context
Supports #10411
Interesting decisions
To keep the PR small, this is the first of multiple PRs to implement the endpoint picker plugin. This PR _does not create the ext-prc cluster nor does it add the ext-proc filter to the listeners filter chain.
Testing steps
Unit tests were added. e2e tests are still required and not included here due to the size of the PR.
Notes for reviewers
Checklist: