Question: mPLUG-Owl3 multi-GPU inference #233

Open
myg133 opened this issue Aug 16, 2024 · 2 comments
Comments

@myg133

myg133 commented Aug 16, 2024

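I am deploying inference on a server with 4 x T4 (16 GB) GPUs. Running gradio_demo.py fails with an out-of-memory error, so the demo will not start.

I made the following code changes:
line 57:
model = AutoModel.from_pretrained(model_path, attn_implementation='sdpa', trust_remote_code=True, torch_dtype=torch.bfloat16, device_map="auto")
line 58:
# model = model.to(device=device)

With these changes the demo starts, but asking a question raises an error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:3 and cuda:0!

Is there a way to run inference across multiple GPUs? Or can GPUs with less than 16 GB of memory not be used?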

@LukeForeverYoung
Collaborator

We do not have a development machine with multiple GPUs, so this scenario has not been fully tested. I suspect that the issue may be due to the visual features and hyper attention layers that can be assigned to different devices by the 'auto' device mapping. If this is the case, manually cloning the visual features to the same device as the current layers might resolve the issue.
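
To check this suspicion, it may help to look at where `device_map="auto"` placed each submodule. Models loaded through `from_pretrained` with a `device_map` expose the resulting placement as `model.hf_device_map`, a dict from module name to device. The sketch below groups such a dict by device. `group_by_device` is a hypothetical helper, and the module names in `example_map` are made up for illustration, not taken from mPLUG-Owl3.

```python
from collections import defaultdict
from typing import Dict, List, Union

# Devices in a device map are GPU indices (int) or names such as "cpu".
Device = Union[int, str]


def group_by_device(device_map: Dict[str, Device]) -> Dict[Device, List[str]]:
    """Group module names by the device they were assigned to."""
    grouped: Dict[Device, List[str]] = defaultdict(list)
    for name, device in device_map.items():
        grouped[device].append(name)
    return dict(grouped)


# Illustrative map only: these module names are hypothetical placeholders.
example_map = {
    "vision_model": 0,
    "language_model.model.layers.0": 0,
    "language_model.model.layers.1": 3,
}

for device, names in group_by_device(example_map).items():
    print(device, names)
```

On a loaded model, `group_by_device(model.hf_device_map)` would show whether the visual modules and the language layers ended up on different GPUs, which is the situation described above.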

@rookiez7

> We do not have a development machine with multiple GPUs, so this scenario has not been fully tested. I suspect that the issue may be due to the visual features and hyper attention layers that can be assigned to different devices by the 'auto' device mapping. If this is the case, manually cloning the visual features to the same device as the current layers might resolve the issue.

Because of limited GPU memory, the model cannot run on a single GPU, so multiple GPUs are required. Could you give us an example of multi-GPU inference?
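
The thread does not contain a confirmed multi-GPU recipe. As a starting point for the suggestion quoted above, here is a minimal sketch of moving a tensor to whatever device a given layer lives on. `to_module_device` is a hypothetical helper, not part of mPLUG-Owl3. Where it would need to be called in the model code (for example, just before a hyper attention layer consumes the visual features) is an untested assumption.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # torch is only needed for the type hints below
    import torch


def to_module_device(
    features: "torch.Tensor", module: "torch.nn.Module"
) -> "torch.Tensor":
    """Return `features` on the same device as `module`'s parameters.

    Hypothetical helper: relies only on `module.parameters()`,
    `parameter.device`, and `tensor.to(device)`.
    """
    first_param = next(module.parameters(), None)
    if first_param is None:
        # Module has no parameters, so there is no device to match.
        return features
    target = first_param.device
    if features.device == target:
        # Already on the right device; avoid a needless copy.
        return features
    return features.to(target)
```

Inside a layer's forward pass, a call such as `image_embeds = to_module_device(image_embeds, self)` (with `image_embeds` standing in for whatever name the model uses) would copy the features to that layer's GPU only when the two differ.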
