
alexandrainst/alexandra_ai_eval


Evaluation of finetuned models.




Installation

To install the package, run the following command in your favorite terminal:

$ pip install aiai-eval

Quickstart

Benchmarking from the Command Line

The easiest way to benchmark pretrained models is via the command line interface. After having installed the package, you can benchmark your favorite model like so:

$ evaluate --model-id <model_id> --task <task>

Here model_id is the HuggingFace model ID, which can be found on the HuggingFace Hub, and task is the task you want to benchmark the model on, such as "ner" for named entity recognition. See all options by typing

$ evaluate --help

A specific model version can also be selected by appending '@' followed by the revision:

$ evaluate --model-id <model_id>@<commit>

This can be a branch name, a tag name, or a commit ID, and defaults to 'main', which is the latest version.
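
The '@' notation above can be resolved with a simple string split, falling back to 'main' when no revision is given. A minimal sketch of this parsing step (the helper name is hypothetical, not part of the package):

```python
def split_model_revision(spec: str) -> tuple[str, str]:
    """Split '<model_id>@<revision>' into its two parts.

    The revision can be a branch name, tag name, or commit ID;
    it falls back to 'main' when no '@' suffix is present.
    """
    model_id, _, revision = spec.partition("@")
    return model_id, revision or "main"
```

For example, split_model_revision("org/model@v1.0") yields ("org/model", "v1.0"), while split_model_revision("org/model") yields ("org/model", "main").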

Multiple models and tasks can be specified by repeating the corresponding arguments. Here is an example with two models:

$ evaluate --model-id <model_id1> --model-id <model_id2> --task ner
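
Repeating a flag to collect several values is a standard CLI pattern. A minimal illustration of how such flags accumulate into lists, sketched with Python's argparse (illustrative only, not the package's actual CLI code):

```python
import argparse

# action="append" collects each repeated flag occurrence into a list,
# so several models and tasks can be evaluated in a single run.
parser = argparse.ArgumentParser(prog="evaluate")
parser.add_argument("--model-id", action="append", dest="model_ids", required=True)
parser.add_argument("--task", action="append", dest="tasks", required=True)

args = parser.parse_args(
    ["--model-id", "model-one", "--model-id", "model-two", "--task", "ner"]
)
print(args.model_ids)  # ['model-one', 'model-two']
print(args.tasks)      # ['ner']
```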

Benchmarking from a Script

In a script, the syntax is similar to the command line interface. You simply initialise an object of the Evaluator class and call this evaluator with your favorite models and/or tasks:

>>> from aiai_eval import Evaluator
>>> evaluator = Evaluator()
>>> evaluator('<model_id>', '<task>')

Project structure

.
├── .flake8
├── .github
│   └── workflows
│       ├── ci.yaml
│       └── docs.yaml
├── .gitignore
├── .pre-commit-config.yaml
├── LICENSE
├── README.md
├── gfx
│   └── aiai-eval-logo.png
├── makefile
├── models
├── notebooks
├── poetry.toml
├── pyproject.toml
├── src
│   ├── aiai_eval
│   │   ├── __init__.py
│   │   ├── automatic_speech_recognition.py
│   │   ├── cli.py
│   │   ├── co2.py
│   │   ├── config.py
│   │   ├── country_codes.py
│   │   ├── evaluator.py
│   │   ├── exceptions.py
│   │   ├── hf_hub.py
│   │   ├── image_to_text.py
│   │   ├── named_entity_recognition.py
│   │   ├── question_answering.py
│   │   ├── scoring.py
│   │   ├── task.py
│   │   ├── task_configs.py
│   │   ├── task_factory.py
│   │   ├── text_classification.py
│   │   └── utils.py
│   └── scripts
│       ├── fix_dot_env_file.py
│       └── versioning.py
└── tests
    ├── __init__.py
    ├── conftest.py
    ├── test_cli.py
    ├── test_co2.py
    ├── test_config.py
    ├── test_country_codes.py
    ├── test_evaluator.py
    ├── test_exceptions.py
    ├── test_hf_hub.py
    ├── test_image_to_text.py
    ├── test_named_entity_recognition.py
    ├── test_question_answering.py
    ├── test_scoring.py
    ├── test_task.py
    ├── test_task_configs.py
    ├── test_task_factory.py
    ├── test_text_classification.py
    └── test_utils.py