If a test asks for images, I think we should make sure that the test case actually provides images/arrays, rather than requiring the LLM to convert the input into images first.
One example is the sum_images test: it asks for images, but our test function is actually checking lists.
This is just a sample; there may be more places where this is the case. Fixing these issues brings these benchmarks (currently below 50%) much more in line with our expectations.
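To illustrate the mismatch, here is a minimal sketch, not the actual benchmark code. It assumes images are represented as NumPy arrays; the bodies of `sum_images`, `naive_sum`, and `check` below are hypothetical stand-ins. A solution that is perfectly reasonable for array inputs silently concatenates when handed nested lists, whereas a test that passes arrays checks what the prompt actually asks for.

```python
import numpy as np


def naive_sum(image1, image2):
    # A plausible LLM answer: correct for arrays, wrong for nested lists.
    return image1 + image2


def sum_images(image1, image2):
    # Hypothetical reference solution: pixel-by-pixel sum of two images.
    return np.asarray(image1) + np.asarray(image2)


def check(candidate):
    # Assumed fix: hand the candidate real arrays instead of nested lists.
    image1 = np.asarray([[1, 2], [3, 4]])
    image2 = np.asarray([[5, 6], [7, 8]])
    expected = np.asarray([[6, 8], [10, 12]])
    assert np.array_equal(candidate(image1, image2), expected)


# With list inputs, `+` concatenates the rows instead of adding pixels.
list_result = naive_sum([[1, 2], [3, 4]], [[5, 6], [7, 8]])
assert list_result == [[1, 2], [3, 4], [5, 6], [7, 8]]

# With array inputs, both implementations pass the same check.
check(naive_sum)
check(sum_images)
```

With array inputs, the test no longer penalizes solutions that assume the input is already an image.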
Self-assigning this one.
ian-coccimiglio changed the title from "Test-cases asking for images but check lists" to "[Fixing Test-Cases] Functions with image inputs but checking lists" on Sep 9, 2024
This also applies to the mask_images test, as well as the mean squared error test.