wip: add HW requirements calculator #1216
base: main
Conversation
Nice! I like this - some quick questions:
- How do we fetch the user's hardware from their Local Apps page?
- Should we maybe start with just LLMs to begin with?
Quite excited to make this work!
cc: @julien-c for thoughts too
On moon-landing, the data is exposed via […]. The idea is that this […]
Yes, absolutely agree!
So, we would focus on only RAM/VRAM in the beginning, correct? I agree with this approach, but I would maybe start with just GGUF, and later extend to other formats.
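For a sense of scale on the GGUF side: weight memory is roughly the parameter count times the bits-per-weight of the quantization type. A minimal TypeScript sketch of that calculation — the function name and the bits-per-weight table are illustrative approximations, not this PR's code:

```ts
// Rough approximation of bits per weight for common GGUF quant types,
// including quantization block overhead. Figures are ballpark estimates.
const APPROX_BITS_PER_WEIGHT: Record<string, number> = {
	F32: 32,
	F16: 16,
	Q8_0: 8.5,
	Q6_K: 6.6,
	Q5_K_M: 5.7,
	Q4_K_M: 4.8,
	Q4_0: 4.6,
};

// Estimate the memory the model weights alone would occupy, in GB.
function estimateWeightMemoryGB(paramCount: number, quantType: string): number {
	const bits = APPROX_BITS_PER_WEIGHT[quantType] ?? 16;
	return (paramCount * bits) / 8 / 1e9;
}

// e.g. a 7B model in Q4_K_M: 7e9 * 4.8 / 8 / 1e9 ≈ 4.2 GB of weights
```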
/**
 * The context size in tokens, defaults to 2048.
 */
contextSize?: number;
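For readability, here is that fragment in a plausible surrounding shape — the interface name and placement are assumptions, not the PR's actual code:

```ts
// Hypothetical options shape for the calculator; only contextSize is
// taken from the diff above, the rest is an assumption.
interface HardwareRequirementsOptions {
	/**
	 * The context size in tokens, defaults to 2048.
	 */
	contextSize?: number;
}
```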
If possible, contextSize should be taken from tasks.ModelData, so it's clearer the data is coming from normalized parsing of HF model repos.
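If it moved there, a consumer might look something like the sketch below. Where the normalized context length would live on ModelData (e.g. from the GGUF llama.context_length metadata key) is an assumption here, hence the loose cast:

```ts
import type { ModelData } from "@huggingface/tasks";

// Sketch only: resolve a context size from a user override, then from the
// repo's parsed metadata, then a default. The `gguf.context_length` location
// is an assumed shape, not necessarily the real ModelData interface.
function resolveContextSize(model: ModelData, userContextSize?: number): number {
	const fromRepo: number | undefined = (model as any).gguf?.context_length;
	return userContextSize ?? fromRepo ?? 2048;
}
```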
And tbh I'm wondering if this kind of method might better live in the tasks or gguf modules rather than here.
IMO the contextSize should be a number that most users will actually use in practice, for example 2048 or 4096.
Setting it to the model's max contextSize may not be useful, as most users never have enough VRAM to run the full context length anyway (especially since 128k contexts are more and more common nowadays).
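To put numbers on that: KV-cache memory grows linearly with context length, so the default matters far more than the model's advertised maximum. A back-of-the-envelope sketch with illustrative Llama-7B-like shapes (none of this is from the PR):

```ts
// KV-cache size = 2 (K and V) × layers × KV heads × head dim × tokens × bytes/elem.
// Shapes below are illustrative defaults, roughly Llama-7B with an f16 cache.
function kvCacheBytes(
	contextSize: number,
	numLayers = 32,
	numKvHeads = 32,
	headDim = 128,
	bytesPerElem = 2
): number {
	return 2 * numLayers * numKvHeads * headDim * contextSize * bytesPerElem;
}

console.log(kvCacheBytes(4096) / 1e9);   // ≈ 2.1 GB — plausible on consumer GPUs
console.log(kvCacheBytes(131072) / 1e9); // ≈ 68.7 GB — full 128k context is out of reach for most
```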
OK, I see your point. cc @gary149 wdyt?
Maybe it's another selector in the UI then, with a user-set context size?
We could set an educated default of 8k though (Claude was 8k-only for a long time).
To be discussed with @Vaibhavs10.
Demo: