[BUG] Qwen 1.5B Inference Crash on Vertex AI Platform #144

Open
rothn opened this issue Feb 25, 2025 · 0 comments

I'm trying to deploy Qwen 1.5B on Vertex AI Endpoints. The deployment crashes for Qwen 1.5B, while Qwen 7B deploys perfectly fine. Both models were trained with the same HuggingFace TRL configuration (other than the base model). Note that training and local inference work fine for both 1.5B and 7B. The container I'm using is us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-cu124.2-4.ubuntu2204.py311. My requirements.txt for the training / local-inference setup is as follows:

accelerate==1.4.0
deepspeed==0.16.3
importlib-metadata==8.6.1
transformers==4.49.0
trl @ git+https://github.com/huggingface/[email protected]
protobuf==5.29.3
sentencepiece==0.2.0

Logs from the container referenced above:
aiplatform_endpoints_crash.log

Container environment variables:

serving_container_environment_variables={
    "NUM_SHARD": "1",
    "MAX_INPUT_TOKENS": "512",
    "MAX_TOTAL_TOKENS": "1024",
    "MAX_BATCH_PREFILL_TOKENS": "1512",
    "CUDA_LAUNCH_BLOCKING": "1",  # Debug for Qwen 1.5B
    "TORCH_USE_CUDA_DSA": "1",    # Debug for Qwen 1.5B
}
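To rule out the token limits themselves, here is a minimal sketch of a sanity check on these values. The helper `check_tgi_env` is hypothetical (it is not part of text-generation-inference or the Vertex AI SDK). It assumes the launcher expects the input limit to be below the total limit and the batch prefill limit to be at least the input limit, and that every environment value is passed as a string.

```python
# Hypothetical helper for illustration; not part of TGI or the Vertex AI SDK.
def check_tgi_env(env):
    """Return a list of human-readable problems found in the env mapping."""
    problems = []

    # Assumption: container environment variable values must be strings.
    for key, value in env.items():
        if not isinstance(value, str):
            problems.append(f"{key}: expected str, got {type(value).__name__}")

    try:
        max_input = int(env["MAX_INPUT_TOKENS"])
        max_total = int(env["MAX_TOTAL_TOKENS"])
        max_prefill = int(env["MAX_BATCH_PREFILL_TOKENS"])
    except (KeyError, ValueError, TypeError) as exc:
        problems.append(f"could not read token limits: {exc!r}")
        return problems

    # Assumed constraints between the three limits.
    if max_input >= max_total:
        problems.append("MAX_INPUT_TOKENS must be < MAX_TOTAL_TOKENS")
    if max_prefill < max_input:
        problems.append("MAX_BATCH_PREFILL_TOKENS must be >= MAX_INPUT_TOKENS")
    return problems


env = {
    "NUM_SHARD": "1",
    "MAX_INPUT_TOKENS": "512",
    "MAX_TOTAL_TOKENS": "1024",
    "MAX_BATCH_PREFILL_TOKENS": "1512",
    "CUDA_LAUNCH_BLOCKING": "1",
    "TORCH_USE_CUDA_DSA": "1",
}

for problem in check_tgi_env(env):
    print(problem)
```

With the values above the check reports nothing, so under these assumptions the limits are not the cause of the crash.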

I wonder if there's some sort of version mismatch here between the training and serving containers, or perhaps 2.4.0 is just too old/buggy, since the latest release of text-generation-inference appears to be 3.1.0. Is there a newer container I can try?
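To check for a version mismatch, a small script like the one below could be run in both the training environment and the serving container and the outputs compared. This is only a sketch: the package list is my guess at what matters (the names from my requirements.txt plus torch), and `installed_versions` is a hypothetical helper built on the standard library's `importlib.metadata`.

```python
from importlib import metadata

# Assumed list of packages worth comparing; adjust as needed.
PACKAGES = [
    "torch",
    "transformers",
    "accelerate",
    "deepspeed",
    "trl",
    "protobuf",
    "sentencepiece",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or 'not installed'."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}=={version}")
```

Diffing the two outputs would show which packages differ between training and serving.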
