🚀 Describe the new functionality needed
A notebook showcasing the end-to-end flow for running evals on RAG agents.
This will bring together a lot of recent work: eval APIs, agent updates, RAG updates, ReAct agents, etc.
Start from a local file with some benchmarking data
Upload the file using /files
Make a dataset out of it
Make a benchmark (register this dataset via /benchmarks)
Run the RAG agent eval via the /eval APIs
This RAG agent should include all the core changes: no ad hoc RAG calling, tool_option, system_prompt, etc.
Change the agent prompt or give the agent an extra tool, then re-eval
[stretch] Use docling for indexing/chunking to showcase improvements
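As a rough illustration of the eval, change, re-eval loop described above, here is a minimal sketch in plain Python. It does not call the real /files, /benchmarks, or /eval APIs. Everything in it (`Row`, `make_agent`, `run_eval`, the `use_retrieval` flag, and the toy data) is a hypothetical in-memory stand-in, used only to show the shape of comparing a baseline agent configuration against a modified one on the same benchmark.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Row:
    """One benchmark row: a question and a string the answer must contain."""
    question: str
    expected: str


# Hypothetical agent type: takes a question, returns an answer string.
Agent = Callable[[str], str]


def make_agent(docs: dict[str, str], use_retrieval: bool) -> Agent:
    """Build a toy stand-in for a RAG agent (not the real agent API).

    With use_retrieval=True, the agent answers with the first document whose
    topic keyword appears in the question; otherwise it always gives up.
    """
    def answer(question: str) -> str:
        if not use_retrieval:
            return "I don't know"
        lowered = question.lower()
        for topic, text in docs.items():
            if topic in lowered:
                return text
        return "I don't know"

    return answer


def run_eval(agent: Agent, benchmark: list[Row]) -> float:
    """Return the fraction of rows whose expected string appears in the answer."""
    hits = sum(
        1 for row in benchmark
        if row.expected.lower() in agent(row.question).lower()
    )
    return hits / len(benchmark)


# Toy "uploaded file" contents and benchmark rows (assumed data, for illustration).
docs = {
    "files": "Upload the benchmarking file with /files.",
    "benchmarks": "Register the dataset via /benchmarks.",
}
benchmark = [
    Row("Which endpoint uploads files?", "/files"),
    Row("Which endpoint registers benchmarks?", "/benchmarks"),
]

# Evaluate a baseline agent, change its configuration, then re-evaluate.
baseline = run_eval(make_agent(docs, use_retrieval=False), benchmark)
improved = run_eval(make_agent(docs, use_retrieval=True), benchmark)
print(f"baseline={baseline:.2f} improved={improved:.2f}")
```

In the actual notebook, the stand-ins would be replaced by the real upload, dataset, benchmark, and eval calls. The point of the sketch is that the benchmark stays fixed while the agent configuration changes, so the two scores are directly comparable.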
💡 Why is this needed? What if we don't build it?
This showcases a first use case covering the end-to-end software development lifecycle (SDLC) of a RAG agent.
Other thoughts
No response