RAG Eval Notebook #1113

hardikjshah · 2025-02-14T20:43:51Z

🚀 Describe the new functionality needed

A notebook showcasing the end-to-end flow for running evals on RAG agents.
It will bring together a lot of recent work: eval APIs, agent updates, RAG updates, ReAct agents, etc.

- Start from a local file with some benchmarking data
- Upload the file using /files
- Make a dataset out of it
- Make a benchmark (register this dataset via /benchmarks)
- Run the RAG agent eval via the /eval APIs
  - This RAG agent should include all the core changes: no ad hoc RAG calling, tool_option, system_prompt, etc.
- Change the agent prompt or give the agent an extra tool, then re-eval
- [stretch] Use docling for indexing / chunking to showcase improvements
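
To make the shape of the loop concrete, here is a minimal, library-free sketch of the "load data, eval, change the agent, re-eval" cycle described above. It deliberately does not call the /files, /benchmarks, or /eval APIs; the column names (`input_query`, `expected_answer`), the `make_agent` stand-in, and the substring-match scoring are all assumptions made for illustration, not what the notebook will necessarily use.

```python
import csv
import io
from typing import Callable, Dict, List

# Hypothetical benchmark data; the notebook would read this from a local file.
BENCHMARK_CSV = """input_query,expected_answer
What is the capital of France?,Paris
Who wrote Hamlet?,Shakespeare
"""


def load_rows(csv_text: str) -> List[Dict[str, str]]:
    """Parse CSV text into dataset rows (one dict per benchmark example)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def make_agent(knowledge: Dict[str, str]) -> Callable[[str], str]:
    """Build a toy stand-in for a RAG agent: keyword lookup over `knowledge`."""

    def agent(query: str) -> str:
        for keyword, answer in knowledge.items():
            if keyword.lower() in query.lower():
                return answer
        return "I don't know."

    return agent


def evaluate(agent: Callable[[str], str], rows: List[Dict[str, str]]) -> float:
    """Return the fraction of rows whose expected answer appears in the reply."""
    if not rows:
        return 0.0
    hits = sum(
        1
        for row in rows
        if row["expected_answer"].lower() in agent(row["input_query"]).lower()
    )
    return hits / len(rows)


rows = load_rows(BENCHMARK_CSV)

# Baseline agent, then an "improved" agent standing in for a prompt or tool change.
baseline = make_agent({"France": "The capital is Paris."})
improved = make_agent(
    {
        "France": "The capital is Paris.",
        "Hamlet": "Hamlet was written by Shakespeare.",
    }
)

baseline_score = evaluate(baseline, rows)
improved_score = evaluate(improved, rows)
print(f"baseline={baseline_score:.2f} improved={improved_score:.2f}")
```

In the actual notebook, `load_rows` would presumably be replaced by the upload and dataset registration steps, and `evaluate` by a benchmark run through the eval APIs; the part worth keeping is the before/after comparison on the same rows.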

💡 Why is this needed? What if we don't build it?

This showcases a first use case covering the end-to-end software development lifecycle (SDLC) of a RAG agent.

Other thoughts

No response

hardikjshah added the "enhancement" (New feature or request) label Feb 14, 2025

hardikjshah added this to the v0.1.4 milestone Feb 14, 2025

yanxi0830 self-assigned this Feb 14, 2025

hardikjshah modified the milestones: v0.1.4, v0.1.5 Feb 21, 2025
