The "Value" type in score is overly restrictive for dictionaries #1333
Comments
More generally, things crash when you pass a dictionary like that, I think! (Dictionaries being returned by scorers don't actually seem to be supported in inspect; things need to be jammed into metadata.)
I don't see an easy way around this, though @jjallaire may be able to point me at something.
This is surprising; in my local testing they are well supported. Here is a sample scorer that I'm running without any issues (that I can see):

    from typing import Union

    from inspect_ai.scorer import Score, Target, accuracy, scorer, stderr
    from inspect_ai.solver import TaskState


    @scorer(metrics={"*": [accuracy(), stderr()]})
    def dict_scorer():
        async def score(state: TaskState, target: Target):
            # Use the model's completion as the answer
            answer = state.output.completion

            # Calculate a score for each rubric
            rubrics = ["rube-1", "rube-2", "rube-3", "rube-4"]
            scores_by_rubric: dict[str, Union[str, int, float, bool, None]] = {}
            for rubric in rubrics:
                rubric_score = 0.0  # Placeholder score
                scores_by_rubric[rubric] = rubric_score

            # Return the per-rubric scores as a dictionary value
            return Score(
                value=scores_by_rubric,
                answer=answer,
                metadata={
                    "rubrics": rubrics,
                    "rationale": "Placeholder rationale",
                },
            )

        return score

There may be some difference in our approaches that I'm not accounting for, so if you have a repro for that, it is something we'd like to fix!
Interesting! Maybe I'm doing something else wrong - will keep testing and get back to you.
This came up for me when I was implementing simpleqa in inspect_evals. I needed to return a dictionary from the scorer, which was fine, but it resulted in some pretty clunky code in the metric.
Because |
The "Value" type in inspect is defined as:
inspect_ai/src/inspect_ai/scorer/_metric.py
Lines 43 to 47 in a31fcc6
Presumably with the intention to allow something like dict[str, float] etc. to be passed as a value. However, dict is invariant (not covariant) in its value type, hence dict[str, float] is not a valid instance of dict[str, str | int | float | bool | None]. To get things to type-check you have to do awkward things like this:
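As a hedged illustration of the invariance problem (this is not the snippet from the original issue, and it does not use inspect's real definitions), here is a minimal sketch. `Scalar`, `takes_dict`, and `takes_mapping` are hypothetical names introduced only for this example. It assumes a static checker such as mypy or pyright; at runtime all of the calls succeed, and only the type checker objects to the commented-out line.

```python
from typing import Dict, Mapping, Union, cast

# Hypothetical stand-in for the scalar part of a "Value"-like union.
Scalar = Union[str, int, float, bool, None]


def takes_dict(value: Dict[str, Scalar]) -> int:
    """Accepts a mutable dict; dict is invariant in its value type."""
    return len(value)


def takes_mapping(value: Mapping[str, Scalar]) -> int:
    """Accepts a read-only mapping; Mapping is covariant in its value type."""
    return len(value)


float_scores: Dict[str, float] = {"rube-1": 0.0, "rube-2": 1.0}

# A type checker rejects the call below, because Dict[str, float] is not a
# subtype of Dict[str, Scalar] (the callee could insert a str into the dict):
# takes_dict(float_scores)

# Workaround 1: annotate the dict with the wide value type up front.
wide_scores: Dict[str, Scalar] = {"rube-1": 0.0, "rube-2": 1.0}
n_wide = takes_dict(wide_scores)

# Workaround 2: cast at the call site (no runtime effect, silences the checker).
n_cast = takes_dict(cast(Dict[str, Scalar], float_scores))

# For comparison: a parameter typed as Mapping accepts Dict[str, float] directly.
n_mapping = takes_mapping(float_scores)
```

Whether a `Mapping`-based annotation would suit inspect's `Value` type is a design question for the maintainers; the sketch only shows why the `dict`-based annotation forces the wide annotation or the cast.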