Add a test case for a single dimension evaluation #123
By the way, I am currently using `assert False` to see the output, so the current pytest run is definitely not passing. However, I do not know how to check whether there is a reasoning part. Does anyone have an idea?
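One possible way to check for a reasoning part without forcing a failure is to parse the evaluator output and assert on it directly. The sketch below is only an illustration: the helper `has_reasoning` and the response shape `{"goal": {"reasoning": ..., "score": ...}}` are assumptions made for this example, not the project's actual format.

```python
import json


def has_reasoning(raw_response: str, dimension: str = "goal") -> bool:
    """Return True if `dimension` carries a non-empty reasoning string.

    Assumed (hypothetical) response shape:
        {"goal": {"reasoning": "...", "score": 8}}
    """
    parsed = json.loads(raw_response)
    if not isinstance(parsed, dict):
        return False
    entry = parsed.get(dimension)
    if not isinstance(entry, dict):
        return False
    reasoning = entry.get("reasoning")
    # Whitespace-only text does not count as reasoning.
    return isinstance(reasoning, str) and bool(reasoning.strip())


def test_goal_dimension_has_reasoning() -> None:
    # Stand-in for a real model response; the content is made up.
    good = json.dumps(
        {"goal": {"reasoning": "The agent reached the agreed price.", "score": 8}}
    )
    blank = json.dumps({"goal": {"reasoning": "   ", "score": 8}})
    score_only = json.dumps({"goal": 8})

    assert has_reasoning(good)
    assert not has_reasoning(blank)
    assert not has_reasoning(score_only)
```

To inspect the output while debugging, running pytest with `-s` shows `print` output without needing `assert False`.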
@XuhuiZhou Could you help check this? I think this is basically a prompting issue. Maybe changing the description of the goal dimension would make it work better?
Codecov Report
All modified and coverable lines are covered by tests ✅

@@            Coverage Diff             @@
##             main     #123      +/-   ##
==========================================
+ Coverage   60.03%   62.01%   +1.98%
==========================================
  Files          47       55       +8
  Lines        2402     2733     +331
==========================================
+ Hits         1442     1695     +253
- Misses        960     1038      +78
@bugsz Could you check if this fixes your problem?
📑 Description
I provide a test case for the issue mentioned in #89.
Specifically, this is done by adding a dummy evaluator with only one `goal` evaluation dimension, and adding a new option for the `response_format` in the evaluator.
Besides, I use the same format as in the real Sotopia simulation in testing, which keeps the test case aligned with the actual evaluation.
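As a rough illustration of the idea of a single-dimension evaluation, the sketch below parses a response that may contain only the `goal` dimension. It is not the code in this PR: `GoalOnlyEvaluation`, `parse_goal_only`, and the JSON shape `{"goal": {"reasoning": ..., "score": ...}}` are hypothetical names and assumptions chosen for the example.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class GoalOnlyEvaluation:
    """Hypothetical container for a single `goal` dimension."""

    reasoning: str
    score: int


def parse_goal_only(raw: str) -> GoalOnlyEvaluation:
    """Parse a response that must contain exactly one dimension, `goal`.

    Assumed (hypothetical) shape: {"goal": {"reasoning": "...", "score": 7}}
    """
    data = json.loads(raw)
    if not isinstance(data, dict) or set(data) != {"goal"}:
        raise ValueError("expected exactly one dimension named 'goal'")
    entry = data["goal"]
    if not isinstance(entry, dict):
        raise ValueError("'goal' must map to an object")
    reasoning = entry.get("reasoning")
    score = entry.get("score")
    if not isinstance(reasoning, str) or not reasoning.strip():
        raise ValueError("'goal' is missing a non-empty reasoning string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int):
        raise ValueError("'goal' score must be an integer")
    return GoalOnlyEvaluation(reasoning=reasoning.strip(), score=score)


if __name__ == "__main__":
    # Made-up response text, used only to exercise the parser.
    raw = json.dumps({"goal": {"reasoning": "Goal mostly achieved.", "score": 7}})
    print(parse_goal_only(raw))
```

Rejecting any extra dimensions keeps such a test focused on the one-dimension case rather than silently accepting a full multi-dimension response.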