Initial support for inference extension deployer #10676

Open
wants to merge 1 commit into main
@danehans danehans commented Feb 21, 2025

Description

Adds initial support for an InferencePool controller and deployer. This controller and deployer are independent of the Gateway deployer. For now, both deployers share the GatewayConfig type, since refactoring this type would further increase the size of this PR.

API changes

N/A

Code changes

Adds controller, deployer, and Helm packages.

CI changes

N/A

Docs changes

Godocs added throughout. User docs will be added in a future PR.

Context

Supports #10411

Interesting decisions

This PR implements a separate deployer because an InferencePool (and its supporting infra resources) does not depend on a Gateway. Instead, a deployed Gateway uses config from an InferencePool to learn how to connect to it, which failure mode to use, etc.
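
To illustrate the direction of that dependency, here is a minimal Go sketch of how a Gateway-side component might derive its connection settings from an InferencePool's config. Everything in it is an assumption made for illustration and not the code added in this PR: the `InferencePoolConfig` and `ExtensionTarget` types, the `ResolveTarget` function, the cluster-local address format, the example port, and the choice to default to fail-close when no failure mode is set.

```go
package main

import (
	"errors"
	"fmt"
)

// FailureMode is a hypothetical enum describing what a Gateway does when the
// pool's extension service cannot be reached.
type FailureMode string

const (
	FailOpen  FailureMode = "FailOpen"
	FailClose FailureMode = "FailClose"
)

// InferencePoolConfig is a hypothetical, trimmed-down stand-in for the config
// an InferencePool exposes. It is not the real API type.
type InferencePoolConfig struct {
	Name             string
	Namespace        string
	ExtensionService string // Service fronting the pool's extension (assumed field)
	ExtensionPort    int
	FailureMode      FailureMode
}

// ExtensionTarget is what the Gateway side needs in order to connect.
type ExtensionTarget struct {
	Address  string
	FailOpen bool
}

// ResolveTarget reads only from the pool config. The pool never references a
// Gateway, which is why the pool can be deployed independently.
func ResolveTarget(pool InferencePoolConfig) (ExtensionTarget, error) {
	if pool.Namespace == "" || pool.ExtensionService == "" {
		return ExtensionTarget{}, errors.New("pool must set namespace and extension service")
	}
	if pool.ExtensionPort < 1 || pool.ExtensionPort > 65535 {
		return ExtensionTarget{}, fmt.Errorf("invalid extension port %d", pool.ExtensionPort)
	}

	mode := pool.FailureMode
	if mode == "" {
		mode = FailClose // assumed default for this sketch
	}
	if mode != FailOpen && mode != FailClose {
		return ExtensionTarget{}, fmt.Errorf("unknown failure mode %q", mode)
	}

	return ExtensionTarget{
		Address:  fmt.Sprintf("%s.%s.svc:%d", pool.ExtensionService, pool.Namespace, pool.ExtensionPort),
		FailOpen: mode == FailOpen,
	}, nil
}

func main() {
	pool := InferencePoolConfig{
		Name:             "example-pool",
		Namespace:        "inference",
		ExtensionService: "example-pool-epp",
		ExtensionPort:    9002, // illustrative port
	}
	target, err := ResolveTarget(pool)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("connect to %s (fail open: %t)\n", target.Address, target.FailOpen)
}
```

Because `ResolveTarget` takes only the pool config as input, the pool's infra can be deployed first and any number of Gateways can attach to it later.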

Testing steps

Unit and integration tests were added. E2E tests are still required but are not included here because of the size of the PR.

Notes for reviewers

Refer to the upstream docs for additional context.

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works

@danehans

I don't understand why this PR required a bunch of code generation.

@danehans danehans force-pushed the issue_10411_epp_deployer branch 2 times, most recently from e316016 to ede2a66 Compare February 25, 2025 16:09
@danehans danehans force-pushed the issue_10411_epp_deployer branch from ede2a66 to 4ac2962 Compare February 25, 2025 16:38