[feature-request] Update vLLM library in LMI containers to v0.6.0 #4240

Open
CoolFish88 opened this issue Sep 16, 2024 · 2 comments

@CoolFish88

Concise Description:

vLLM v0.6.0 reports up to 2.7x higher throughput and up to 5x lower latency than the previous version (v0.5.3).

DLC image/dockerfile:
763104351884.dkr.ecr.us-west-2.amazonaws.com/djl-inference:0.29.0-lmi11.0.0-cu124
763104351884.dkr.ecr.us-west-2.amazonaws.com/djl-inference:0.29.0-neuronx-sdk2.19.1

Is your feature request related to a problem? Please describe.
Improve the performance of LMI containers.

Describe the solution you'd like
Update the vLLM library in the LMI containers to v0.6.0.

@siddvenk
Contributor

siddvenk commented Oct 1, 2024

We are planning a release that will include vLLM 0.6.2 within the next two weeks. In the meantime, you can provide a requirements.txt that pins vllm==0.6.x to pick up a later version of vLLM. If you go this route, you should also set the OPTION_ROLLING_BATCH=vllm environment variable to force the use of vLLM.
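
A minimal sketch of that workaround in Python, under stated assumptions: the requirements.txt is placed in the model directory that the container loads, the pinned version 0.6.2 is only an example of "0.6.x", and the helper name `write_vllm_override` is hypothetical. The returned dictionary is meant to be passed as container environment variables by whatever deployment tooling you use.

```python
import tempfile
from pathlib import Path


def write_vllm_override(model_dir: Path, vllm_version: str) -> dict:
    """Write a requirements.txt pinning vllm and return the env vars to set.

    Assumption: the container installs requirements.txt found in the
    model directory at startup. The function name is hypothetical.
    """
    model_dir.mkdir(parents=True, exist_ok=True)
    # Pin the exact vLLM version to install on top of the bundled one.
    (model_dir / "requirements.txt").write_text(f"vllm=={vllm_version}\n")
    # Force the vLLM rolling-batch backend, as suggested in the comment above.
    return {"OPTION_ROLLING_BATCH": "vllm"}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # "0.6.2" is an example value, not a tested recommendation.
        env = write_vllm_override(Path(tmp) / "model", "0.6.2")
        print((Path(tmp) / "model" / "requirements.txt").read_text(), end="")
        print(env)
```

Upgrading vLLM this way can pull in dependency versions that differ from what the container was built with, so it is worth testing the resulting environment before relying on it.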

@n0thing233

n0thing233 commented Feb 11, 2025

@siddvenk,
are the Dockerfiles or the image build process for the LMI containers open source?
I would appreciate it if you could share them.

If not, I think I'll need to build an image on top of the LMI image. Is that correct? Something like this Dockerfile:

FROM 763104351884.dkr.ecr.us-west-2.amazonaws.com/djl-inference:0.31.0-lmi13.0.0-cu124

RUN pip install --upgrade vllm==0.7.2
