Support LLaVA-UHD #6153

Closed
choyakawa opened this issue Mar 19, 2024 · 2 comments
Labels
enhancement New feature or request stale

Comments

@choyakawa

https://github.com/thunlp/LLaVA-UHD

This method appears to be on par with or better than LLaVA-NeXT (LLaVA 1.6), and its authors have open-sourced the training code for reproduction.

LLM-generated comparison from Gemini 1.5 Pro:

| Benchmark | LLaVA-UHD-13B | LLaVA-NeXT-7B | LLaVA-NeXT-13B | LLaVA-NeXT-34B | LLaVA 1.5-13B |
| --- | --- | --- | --- | --- | --- |
| VQAv2 | 81.7 | 81.8 (Vicuna) / 82.2 (Mistral) | 82.8 | 83.7 | 80 |
| GQA | 65.2 | 64.2 (Vicuna) / 64.8 (Mistral) | 65.4 | 67.1 | 63.3 |
| TextVQA | 67.7 | 64.9 (Vicuna) / 65.7 (Mistral) | 67.1 | 69.5 | 61.3 |
| ScienceQA | 72 | 70.1 (Vicuna) / 72.8 (Mistral) | 73.6 | 81.8 | 71.6 |
| VizWiz | 56.1 | 57.6 (Vicuna) / 60.0 (Mistral) | 60.5 | 63.8 | 53.6 |
| MMMU (val) | 36.4 | 35.8 (Vicuna) / 35.3 (Mistral) | 36.2 | 51.1 | 36.4 |
| MMMU (test) | 33.6 | - | - | 44.7 | 33.6 |
| MME | 1535 | 1519 (Vicuna) / 1498 (Mistral) | 1575 | 1631 | 1531 |
| POPE | 89.1 | 86.5 (Vicuna) / 86.7 (Mistral) | 86.2 | 87.7 | 85.9 |

Observations:

  • LLaVA-UHD-13B matches or outperforms LLaVA 1.5-13B on every benchmark listed; the two tie on MMMU, and the largest gain is on TextVQA (67.7 vs. 61.3).
  • The LLaVA-NeXT 7B and 13B models perform comparably to LLaVA-UHD-13B on most benchmarks, with small differences depending on the base LLM (Vicuna or Mistral).
  • LLaVA-NeXT-34B stands out with significantly higher scores on ScienceQA (81.8) and MMMU (51.1 val, 44.7 test).
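
The first observation can be checked directly against the table. Below is a minimal sketch that does so; the scores are transcribed from the table in this issue and have not been independently verified, and the names `scores` and `compare` are made up for illustration.

```python
# Scores transcribed from the table in this issue (not independently verified).
# Each entry maps a benchmark to (LLaVA-UHD-13B, LLaVA 1.5-13B).
scores = {
    "VQAv2": (81.7, 80.0),
    "GQA": (65.2, 63.3),
    "TextVQA": (67.7, 61.3),
    "ScienceQA": (72.0, 71.6),
    "VizWiz": (56.1, 53.6),
    "MMMU (val)": (36.4, 36.4),
    "MMMU (test)": (33.6, 33.6),
    "MME": (1535.0, 1531.0),
    "POPE": (89.1, 85.9),
}


def compare(table):
    """Return the per-benchmark difference (UHD minus baseline), rounded to 0.1."""
    return {name: round(uhd - base, 1) for name, (uhd, base) in table.items()}


deltas = compare(scores)
wins = [name for name, d in deltas.items() if d > 0]
ties = [name for name, d in deltas.items() if d == 0]
losses = [name for name, d in deltas.items() if d < 0]

for name, d in deltas.items():
    print(f"{name:12s} {d:+.1f}")
print(f"wins={len(wins)} ties={len(ties)} losses={len(losses)}")
```

Note that MME is reported on a different scale from the percentage-based benchmarks, so its delta is not directly comparable to the others.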

Originally posted by @choyakawa in thunlp/LLaVA-UHD#1 (comment)

@choyakawa choyakawa added the enhancement New feature or request label Mar 19, 2024
@choyakawa
Author

Moreover, the model can be trained efficiently in academic settings: within 23 hours on 8 A100 GPUs (vs. 26 hours for LLaVA-1.5).
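
To put those numbers in terms of compute budget, here is a rough back-of-the-envelope sketch. It assumes the 26-hour LLaVA-1.5 figure was also measured on 8 A100 GPUs, which the comment implies but does not state outright.

```python
# Figures quoted in the comment above; the same-hardware baseline is an assumption.
NUM_GPUS = 8
uhd_hours = 23       # LLaVA-UHD wall-clock training time
llava15_hours = 26   # LLaVA-1.5 wall-clock training time (assumed same 8x A100 setup)

uhd_gpu_hours = NUM_GPUS * uhd_hours          # total A100-hours for LLaVA-UHD
llava15_gpu_hours = NUM_GPUS * llava15_hours  # total A100-hours for LLaVA-1.5
saving = 1 - uhd_gpu_hours / llava15_gpu_hours

print(f"LLaVA-UHD: {uhd_gpu_hours} GPU-hours")
print(f"LLaVA-1.5: {llava15_gpu_hours} GPU-hours")
print(f"Relative saving: {saving:.1%}")
```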

@github-actions github-actions bot added the stale label Apr 19, 2024
Contributor

github-actions bot commented May 3, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as completed May 3, 2024