wip: add HW requirements calculator #1216

Draft · wants to merge 1 commit into main
Conversation

ngxson (Member) commented on Feb 21, 2025

To be discussed with @Vaibhavs10

Demo:

mem {
  "name": "hexgrad/Kokoro-82M",
  "memory": {
    "minimumGigabytes": 1.991212226,
    "recommendedGigabytes": 2.1903334486
  }
}
mem {
  "name": "microsoft/OmniParser-v2.0",
  "memory": {
    "minimumGigabytes": 1.664,
    "recommendedGigabytes": 1.8304
  }
}
mem {
  "name": "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
  "memory": {
    "minimumGigabytes": 5.4170575452,
    "recommendedGigabytes": 5.95876329972
  }
}
mem {
  "name": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
  "memory": {
    "minimumGigabytes": 20.424667624799998,
    "recommendedGigabytes": 22.467134387279998
  }
}
mem {
  "name": "NousResearch/DeepHermes-3-Llama-3-8B-Preview",
  "memory": {
    "minimumGigabytes": 20.4246676512,
    "recommendedGigabytes": 22.46713441632
  }
}
mem {
  "name": "unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit",
  "memory": {
    "minimumGigabytes": 8.309023701600001,
    "recommendedGigabytes": 9.139926071760001
  }
}

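One detail visible in these numbers: recommendedGigabytes is consistently minimumGigabytes × 1.1, i.e. a flat 10% headroom on top of the estimated minimum. A minimal TypeScript sketch of the output shape and that margin (hypothetical type and function names, not necessarily the PR's actual ones):

```ts
// Hypothetical sketch of the output shape shown above; names are
// illustrative, not taken from the PR.
interface HardwareRequirements {
  name: string;
  memory: {
    minimumGigabytes: number;
    recommendedGigabytes: number;
  };
}

// In the demo output, recommendedGigabytes is always minimumGigabytes * 1.1,
// i.e. a flat 10% safety margin on top of the estimated minimum.
function withHeadroom(name: string, minimumGigabytes: number): HardwareRequirements {
  return {
    name,
    memory: {
      minimumGigabytes,
      recommendedGigabytes: minimumGigabytes * 1.1,
    },
  };
}

console.log("mem", withHeadroom("hexgrad/Kokoro-82M", 1.991212226));
// -> recommendedGigabytes ≈ 2.19
```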
Vaibhavs10 (Member) left a comment

Nice! I like this. Some quick questions:

  1. How do we fetch the user's hardware from their Local Apps page?
  2. Should we maybe start with just LLMs?

Quite excited to make this work!

cc: @julien-c for thoughts too

ngxson (Member, Author) commented on Feb 27, 2025

  • How do we fetch the user's hardware from their Local Apps page?

On moon-landing, the data is exposed via FrontData.ts, so adding a check for whether a model is compatible with a given user's hardware should be simple.

The idea is that getHardwareRequirements provides a reference for how much RAM is needed; it doesn't need any notion of the user's hardware. Then, on the moon-landing frontend, we implement this check against the user's hardware (see the sketch after this comment).

  • Should we maybe start with just LLMs?

Yes, absolutely agree!

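A minimal sketch of what that moon-landing-side check could look like (all names here are hypothetical; the real FrontData.ts shape may differ):

```ts
// Hypothetical frontend-side compatibility check; names are illustrative,
// not the actual FrontData.ts / moon-landing API.
interface UserHardware {
  totalMemoryGigabytes: number; // RAM or VRAM reported on the Local Apps page
}

interface HardwareRequirements {
  name: string;
  memory: { minimumGigabytes: number; recommendedGigabytes: number };
}

type Compatibility = "incompatible" | "tight" | "comfortable";

function checkCompatibility(req: HardwareRequirements, hw: UserHardware): Compatibility {
  if (hw.totalMemoryGigabytes < req.memory.minimumGigabytes) return "incompatible";
  if (hw.totalMemoryGigabytes < req.memory.recommendedGigabytes) return "tight";
  return "comfortable";
}
```

The library stays hardware-agnostic; only this thin frontend layer needs to know what the user actually has.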
julien-c (Member) left a comment

So, we would focus on only RAM/VRAM in the beginning, correct? I agree with this approach, but I would maybe start with just GGUF, and later extend to other formats.

/**
 * The context size in tokens. Defaults to 2048.
 */
contextSize?: number;
Member
If possible, contextSize should be taken from tasks.ModelData, so it's clearer that the data comes from normalized parsing of HF model repos.

Member
And tbh I'm wondering whether this kind of method should live in the tasks or gguf modules rather than here.

Member Author

IMO contextSize should be a number that most users will actually use in practice, for example 2048 or 4096.

Setting it to the model's max contextSize may not be useful, as most users never have enough VRAM to run the full context length anyway (especially since 128k contexts are more and more common nowadays). See the sketch below for a sense of scale.

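For a sense of scale, here's a back-of-the-envelope KV-cache estimate using the standard formula (a generic sketch; the config numbers assume a Llama-3-8B-style architecture with GQA and are not taken from this PR):

```ts
// KV cache bytes = 2 (K and V) * layers * contextSize * kvHeads * headDim * bytesPerElement
function kvCacheGigabytes(
  layers: number,
  kvHeads: number,
  headDim: number,
  contextSize: number,
  bytesPerElement = 2 // f16
): number {
  return (2 * layers * contextSize * kvHeads * headDim * bytesPerElement) / 1e9;
}

// Llama-3-8B-style config: 32 layers, 8 KV heads (GQA), head dim 128.
kvCacheGigabytes(32, 8, 128, 2048);   // ≈ 0.27 GB at a 2048-token context
kvCacheGigabytes(32, 8, 128, 131072); // ≈ 17.2 GB at the full 128k context
```

At full context, the KV cache alone would dwarf the weights for many long-context models, which is why a practical default matters.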
Member

OK, I see your point. cc @gary149 wdyt?
Maybe it's then another selector in the UI, with a user-set context size?

Member

We could set an educated default of 8k though (Claude was 8k-only for a long time).
