
Handling ValidationError in dspy when LLM output mismatches pydantic model #2148

Open
hucorz opened this issue Jan 19, 2025 · 2 comments

hucorz commented Jan 19, 2025

I’m encountering pydantic.ValidationError in dspy because the LLM output does not conform to the structure defined in my signature.

  1. How do you typically handle cases where LLM output does not match the expected model structure?

  2. Is there a way to bypass the ValidationError and return the raw LLM output for further inspection or processing?
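Pending a built-in option, one workaround for question 2 is to wrap the call and fall back to the raw text when structured parsing fails. The sketch below does not use dspy's API; `safe_parse` and the field names are hypothetical, and it stands in for whatever validation your signature performs:

```python
import json


def safe_parse(raw_output: str, required_fields: set[str]) -> dict:
    """Try to parse raw LLM output as JSON with the expected fields.

    On any failure, return the raw text untouched so the caller can
    inspect or post-process it instead of crashing on a ValidationError.
    """
    try:
        data = json.loads(raw_output)
        missing = required_fields - data.keys()
        if missing:
            raise ValueError(f"missing fields: {missing}")
        return {"parsed": data, "raw": raw_output}
    except (json.JSONDecodeError, ValueError):
        return {"parsed": None, "raw": raw_output}


# A well-formed response parses; a malformed one falls back to the raw text.
ok = safe_parse('{"answer": "42"}', {"answer"})
bad = safe_parse("Sure! The answer is 42.", {"answer"})
```

The same try/except pattern applies if you validate with a pydantic model directly: catch `pydantic.ValidationError` and return the raw completion.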

@AbhishekRP2002

Ideally, if the LLM output fails to conform to the pydantic class structure or JSON object signature, we should have a custom parser that takes the raw LLM output and parses it into the structured format using regex or other parsing logic.

LangChain follows this approach in its `parse_json_markdown` utility:

https://python.langchain.com/api_reference/_modules/langchain_core/utils/json.html#parse_json_markdown
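The idea behind that utility can be sketched in a few lines with only the standard library (this is a simplified illustration, not LangChain's actual implementation, which also repairs partial JSON):

```python
import json
import re


def parse_json_markdown(text: str):
    """Best-effort extraction of a JSON object from an LLM response:
    first try the whole string, then look inside ``` fences."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # LLMs often wrap JSON in a markdown code fence, optionally tagged "json".
    match = re.search(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    if match:
        return json.loads(match.group(1))
    raise ValueError("no JSON object found in model output")


result = parse_json_markdown('Here you go:\n```json\n{"score": 0.9}\n```')
```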

@chenmoneygithub
Collaborator

@hucorz Thanks for reporting the issue! Could you share reproducible code?

@AbhishekRP2002 Best-effort parsing could make sense here, but I would first like to understand what fraction of responses that fail the automatic parsing a custom parser actually recovers. Our belief is that, even if it is not the case right now, LLMs will soon produce reliably structured output given the right prompt, so a regex parser as a fallback won't be needed.
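The recovery ratio mentioned above could be measured with a small harness like the one below. This is a hypothetical sketch: `lenient_parse` stands in for whatever best-effort parser is under evaluation, and the sample responses would in practice come from logged traffic that failed strict parsing.

```python
import json


def lenient_parse(text: str) -> dict:
    """Hypothetical best-effort parser: strip markdown fences, then parse."""
    stripped = text.strip().removeprefix("```json").removesuffix("```").strip()
    return json.loads(stripped)


# Toy sample of responses that all fail strict json.loads parsing.
failures = [
    '```json\n{"a": 1}\n```',
    "not json at all",
    '```json\n{"b": 2}\n```',
]

recovered = 0
for resp in failures:
    try:
        lenient_parse(resp)
        recovered += 1
    except json.JSONDecodeError:
        pass

# Fraction of strict-parse failures that the fallback parser recovers.
ratio = recovered / len(failures)
```

A high ratio would argue for shipping the fallback; a low one would support relying on better prompting instead.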
