Benchmarking o1-preview-2024-09-12 #135

Open · haesleinhuepf opened this issue Sep 12, 2024 · 3 comments

@haesleinhuepf (Owner) commented Sep 12, 2024

If anyone knows anyone who is tier 5 on OpenAI (@royerloic maybe?), they could benchmark the new o1 model. I am only tier 3 and have to wait...

https://x.com/OpenAI/status/1834278218888872042

https://platform.openai.com/docs/models/o1

@jkh1 (Collaborator) commented Sep 13, 2024

It seems that this model has hidden reasoning tokens that you still get billed for (see the note in the docs here: https://platform.openai.com/docs/guides/reasoning/how-reasoning-works), which may explain why it's reserved for tier 5 users 😃
This could become an expensive experiment.
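For context, those hidden reasoning tokens are billed as output tokens and are reported in the usage block of the response. A minimal sketch with the OpenAI Python SDK, assuming the completion_tokens_details.reasoning_tokens field described in the linked guide (the example prompt is arbitrary, and this is not code from this repository):

```python
# Sketch: inspect how many hidden reasoning tokens a single o1-preview call
# consumed. Field names follow the linked reasoning guide and may change.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

response = client.chat.completions.create(
    model="o1-preview-2024-09-12",
    messages=[{"role": "user", "content": "Write a Python function that segments nuclei in an image."}],
)

usage = response.usage
reasoning = usage.completion_tokens_details.reasoning_tokens
print(f"prompt tokens:     {usage.prompt_tokens}")
print(f"completion tokens: {usage.completion_tokens}")  # includes the hidden reasoning tokens
print(f"  of which reasoning (hidden, still billed): {reasoning}")
```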

@haesleinhuepf (Owner, Author)

Yeah, I know. I think Claude does something similar. That's why I'm curious whether it's really so much better than the models we have tested so far.

@haesleinhuepf (Owner, Author) commented Sep 24, 2024

Ok, I have access now. Just FYI: my first 12 prompts cost $6.68 on o1-preview, so running the entire benchmark would cost about $300. (updated)
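For transparency, the ~$300 figure is a simple linear extrapolation from those first 12 prompts. A quick sketch of the arithmetic; the total benchmark size used below is an assumption implied by the numbers above, not stated anywhere in this thread:

```python
# Back-of-the-envelope extrapolation of benchmark cost from the first 12 prompts.
observed_cost = 6.68                                  # USD for the first 12 prompts on o1-preview
observed_prompts = 12
cost_per_prompt = observed_cost / observed_prompts    # ~0.56 USD per prompt

# Assumption: total number of prompts in a full benchmark run; ~540 is roughly
# what the ~$300 estimate implies. Adjust to the real benchmark size.
total_prompts = 540
estimated_total = total_prompts * cost_per_prompt
print(f"~${cost_per_prompt:.2f} per prompt -> ~${estimated_total:.0f} for {total_prompts} prompts")
```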

haesleinhuepf mentioned this issue Sep 25, 2024