define "evaluator" interface #9

Open
Tracked by #3
keighrim opened this issue May 11, 2023 · 0 comments

(subtask of #3)

We'd like to define a minimal but concrete behavior for the class of "evaluators". Some features are also discussed in clamsproject/aapb-annotations#2 (comment). At the very minimum, an "evaluator" should be able to:

  1. take a batch of gold files and a batch of predictions, and return a single HTML file with the evaluation result
  2. take multiple batches of gold files and multiple batches of predictions, and return a single HTML file with all the evaluation results and an aggregated result.
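
The two behaviors above could be sketched as an abstract base class. This is only an illustration under stated assumptions, not a proposed final design: the names `Evaluator`, `score`, `aggregate`, `evaluate`, and `evaluate_many` are hypothetical, a "batch" is assumed to be a local directory, scores are assumed to be a flat metric-to-number mapping, and the methods return the HTML as a string that the caller would write to a file.

```python
from abc import ABC, abstractmethod
from html import escape
from pathlib import Path
from typing import Dict, List, Sequence, Tuple

Scores = Dict[str, float]  # metric name -> value (assumed shape, not from the issue)


class Evaluator(ABC):
    """Hypothetical base class; none of these names come from the issue."""

    @abstractmethod
    def score(self, gold: Path, pred: Path) -> Scores:
        """Compare one gold batch with one prediction batch (task-specific)."""

    def aggregate(self, results: Sequence[Scores]) -> Scores:
        """Placeholder aggregation: unweighted mean of each metric."""
        return {k: sum(r[k] for r in results) / len(results) for k in results[0]}

    def evaluate(self, gold: Path, pred: Path) -> str:
        """Behavior 1: one gold batch and one prediction batch -> one HTML report."""
        return self._render([(pred.name, self.score(gold, pred))])

    def evaluate_many(self, pairs: Sequence[Tuple[Path, Path]]) -> str:
        """Behavior 2: many batch pairs -> one HTML report plus an aggregate."""
        if not pairs:
            raise ValueError("at least one (gold, pred) pair is required")
        results = [(pred.name, self.score(gold, pred)) for gold, pred in pairs]
        results.append(("aggregated", self.aggregate([s for _, s in results])))
        return self._render(results)

    @staticmethod
    def _render(results: List[Tuple[str, Scores]]) -> str:
        """Render each (title, scores) pair as a small HTML table."""
        parts = []
        for title, scores in results:
            rows = "".join(
                f"<tr><td>{escape(k)}</td><td>{v:.3f}</td></tr>"
                for k, v in scores.items()
            )
            parts.append(f"<h2>{escape(title)}</h2><table>{rows}</table>")
        return "<html><body>" + "".join(parts) + "</body></html>"
```

A concrete evaluator would then only need to implement `score` (and possibly override `aggregate`, since an unweighted mean is not appropriate for every metric).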

Gold files are freely accessible from the https://github.com/clamsproject/aapb-annotations repository, but prediction files almost always need to be generated on demand. In many cases (vision and audio apps), generating predictions takes hours, if not days, even for a small batch. Running CLAMS pipelines, waiting for the predictions (MMIF) to be generated, and finally obtaining those MMIF files should not be the responsibility of an evaluator. Instead, the evaluation "runner" or "invoker" should take charge of obtaining all gold and prediction files before an evaluator runs.
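
One way to picture this split of responsibilities is a small runner that only hands already-local files to an evaluator. The sketch below is hypothetical: `run_evaluation` and its parameters are made-up names, batches are assumed to be local directories, and how the gold and prediction files get onto disk (cloning the annotations repository, running pipelines) is deliberately left outside of it.

```python
from pathlib import Path
from typing import Callable, Sequence, Tuple

BatchPair = Tuple[Path, Path]  # (gold directory, prediction directory); assumed layout


def run_evaluation(
    pairs: Sequence[BatchPair],
    evaluate: Callable[[Sequence[BatchPair]], str],
    report_path: Path,
) -> Path:
    """Hypothetical runner: verify inputs exist, then call the evaluator.

    The evaluator (`evaluate`) never fetches or generates anything; it only
    receives local paths and returns the HTML report as a string.
    """
    missing = [str(p) for pair in pairs for p in pair if not p.is_dir()]
    if missing:
        # Fail before the evaluator runs rather than midway through a batch.
        raise FileNotFoundError("missing batches: " + ", ".join(missing))
    report_path.write_text(evaluate(pairs), encoding="utf-8")
    return report_path
```

Checking for every batch up front keeps long-running prediction jobs and download failures out of the evaluator's error handling.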

@clams-bot clams-bot added this to infra May 11, 2023
@github-project-automation github-project-automation bot moved this to Todo in infra May 11, 2023
@keighrim keighrim added this to the eval-v1 milestone Jan 29, 2024