Improve readability of the quick tour. #501
Open: vxw3t8fhjsdkghvbdifuk wants to merge 8 commits into huggingface:main from vxw3t8fhjsdkghvbdifuk:patch-2
Changes shown from 6 of 8 commits:
- `d5feb23` Improve readability of the quick tour.
- `b3d878c` update based on feedback
- `6df1aad` delete superfluous edit of float16
- `dfd2a40` deleted , for no reason
- `6045d9d` Merge branch 'main' into patch-2
- `02b3e30` reorganize headers
- `6590cdb` fix nit
- `381beb2` closing bracket

All commits by vxw3t8fhjsdkghvbdifuk.
@@ -20,34 +20,51 @@ Lighteval can be used with a few different commands.
 - `tgi`: evaluate models on one or more GPUs using [🔗 Text Generation Inference](https://huggingface.co/docs/text-generation-inference/en/index)
 - `openai`: evaluate models on one or more GPUs using [🔗 OpenAI API](https://platform.openai.com/)

-## Accelerate
+## Basic usage

 ### Evaluate a model on a GPU

-To evaluate `GPT-2` on the Truthful QA benchmark, run:
+To evaluate `GPT-2` on the Truthful QA benchmark with [🤗 Accelerate](https://github.com/huggingface/accelerate), run:

 ```bash
 lighteval accelerate \
     "pretrained=gpt2" \
     "leaderboard|truthfulqa:mc|0|0"
 ```

-Here, `--tasks` refers to either a comma-separated list of supported tasks from
-the [tasks_list](available-tasks) in the format:
+Here, we first choose a backend (either `accelerate`, `nanotron`, or `vllm`), and then specify the model and task(s) to run.

-```bash
-{suite}|{task}|{num_few_shot}|{0 or 1 to automatically reduce `num_few_shot` if prompt is too long}
+The syntax for the model arguments is `key1=value1,key2=value2,etc.`.
+Valid key-value pairs correspond to the backend configuration and are detailed [below](#backend-configuration).
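The `key1=value1,key2=value2` model-argument string can be split with a few lines of Python. This is an illustrative sketch of the syntax only, not lighteval's actual parser:

```python
def parse_model_args(arg_string: str) -> dict:
    """Split a comma-separated key=value string into a dict.

    Sketch of the syntax described above; lighteval's real parser
    may handle quoting and value types differently.
    """
    args = {}
    for pair in arg_string.split(","):
        key, _, value = pair.partition("=")
        args[key.strip()] = value.strip()
    return args

print(parse_model_args("pretrained=gpt2,trust_remote_code=True"))
# {'pretrained': 'gpt2', 'trust_remote_code': 'True'}
```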

+The syntax for the task specification might be a bit hard to grasp at first. The format is as follows:
+
+```txt
+{suite}|{task}|{num_few_shot}|{0 for strict `num_few_shots`, or 1 to allow a reduction}
 ```
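As a worked illustration of the four-field format above (a hypothetical parser, not lighteval's own code):

```python
from dataclasses import dataclass


@dataclass
class TaskSpec:
    """One task specification, per the `{suite}|{task}|{num_few_shot}|{flag}` format."""
    suite: str
    task: str
    num_few_shot: int
    allow_few_shot_reduction: bool


def parse_task_spec(spec: str) -> TaskSpec:
    # Illustrative sketch only; note the task name itself may contain a colon.
    suite, task, num_few_shot, flag = spec.split("|")
    return TaskSpec(suite, task, int(num_few_shot), flag == "1")


print(parse_task_spec("leaderboard|truthfulqa:mc|0|0"))
```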

-or a file path like
-[examples/tasks/recommended_set.txt](https://github.com/huggingface/lighteval/blob/main/examples/tasks/recommended_set.txt)
-which specifies multiple task configurations.
+If the fourth value is set to 1, lighteval will check whether the prompt (including the few-shot examples) is too long for the context size of the task or the model.
+If so, the number of few-shot examples is automatically reduced.
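The reduction behaviour described above can be sketched as a simple loop that drops trailing few-shot examples until the prompt fits. This is a hypothetical sketch, not lighteval's implementation; the helper names and the token counter are assumptions:

```python
def fit_few_shot(examples, query, max_context_tokens, count_tokens):
    """Drop trailing few-shot examples until the full prompt fits.

    Hypothetical sketch of the behaviour described above; lighteval's
    actual logic and tokenization differ.
    """
    kept = list(examples)
    while kept:
        prompt = "\n\n".join(kept + [query])
        if count_tokens(prompt) <= max_context_tokens:
            break
        kept.pop()  # reduce the effective num_few_shot by one
    return kept


# toy token counter: one token per whitespace-separated word
toy_count = lambda text: len(text.split())
print(fit_few_shot(["a b c", "d e f"], "q r", 5, toy_count))
# ['a b c']
```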

-Tasks details can be found in the
+All officially supported tasks can be found at the [tasks_list](available-tasks).
+Moreover, community-provided tasks can be found in the
 [extended folder](https://github.com/huggingface/lighteval/tree/main/src/lighteval/tasks/extended) and the
> **Review comment** (on the lines above): Extended tasks are not community-provided but maintainer-provided; however, they are tasks which require added logic (such as an LLM as judge, or redefining new metrics like IFEval).
 [community](https://github.com/huggingface/lighteval/tree/main/community_tasks) folder.
 For more details on the implementation of the tasks, such as how prompts are constructed, or which metrics are used, you can have a look at the
 [file](https://github.com/huggingface/lighteval/blob/main/src/lighteval/tasks/default_tasks.py)
 implementing them.
-### Evaluate a model on one or more GPUs
+Running multiple tasks is supported, either with a comma-separated list or by specifying a file path.
+The file should be structured like [examples/tasks/recommended_set.txt](https://github.com/huggingface/lighteval/blob/main/examples/tasks/recommended_set.txt).
+When specifying a path to a file, it should start with `./`.
+
+```bash
+lighteval accelerate \
+    "pretrained=gpt2" \
+    ./path/to/lighteval/examples/tasks/recommended_set.txt
+# or, e.g., "leaderboard|truthfulqa:mc|0|0,leaderboard|gsm8k|3|1"
+```
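A task file like the one referenced above can be read with a short helper. This is a sketch under the assumption that the file holds one `{suite}|{task}|{num_few_shot}|{flag}` spec per line, with blank lines and `#` comments ignored; lighteval's own loader may behave differently:

```python
def load_task_specs(lines):
    """Collect task specs from an iterable of lines, skipping blanks and comments.

    Illustrative sketch only; the assumed file format is one task
    spec per line, as in recommended_set.txt.
    """
    specs = []
    for raw in lines:
        line = raw.strip()
        if line and not line.startswith("#"):
            specs.append(line)
    return specs


# usage sketch:
# with open("./examples/tasks/recommended_set.txt") as fh:
#     specs = load_task_specs(fh)
```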

+## Evaluate a model on one or more GPUs

 #### Data parallelism

@@ -86,13 +103,13 @@ This will automatically use accelerate to distribute the model across the GPUs.
 > `model_parallel=True` and using accelerate to distribute the data across the
 > GPUs.

-### Model Arguments
+## Backend configuration

 The `model-args` argument takes a string representing a comma-separated list of model
 arguments. The allowed arguments vary depending on the backend you use (vllm or
 accelerate).

-#### Accelerate
+### Accelerate

 - **pretrained** (str):
   HuggingFace Hub model ID name or the path to a pre-trained
@@ -128,7 +145,7 @@ accelerate).
 - **trust_remote_code** (bool): Whether to trust remote code during model
   loading.

-#### VLLM
+### VLLM

 - **pretrained** (str): HuggingFace Hub model ID name or the path to a pre-trained model to load.
 - **gpu_memory_utilisation** (float): The fraction of GPU memory to use.