DeepSeek V3 Does Not Support Structured Output in LangChain with ChatOpenAI() #29282

Open
ksmooi opened this issue Jan 18, 2025 · 0 comments

Labels: 🤖:bug (Related to a bug, vulnerability, unexpected error with an existing feature) · investigate (Flagged for investigation.) · Ɑ: models (Related to LLMs or chat model modules)
ksmooi commented Jan 18, 2025

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

Steps to Reproduce:

  1. Install the required libraries:

    !pip install -qU langchain-openai
    !pip install -qU langchain-community
    !pip install -qU langchain-experimental
    !pip install -qU langgraph
  2. Initialize the model and define a Pydantic model for structured output:

    from langchain_openai import ChatOpenAI
    from kaggle_secrets import UserSecretsClient
    from pydantic import BaseModel, Field
    
    llm_api_key = UserSecretsClient().get_secret("api-key-deepseek")
    # openai_api_key / openai_api_base are legacy aliases for api_key / base_url
    model = ChatOpenAI(
        model="deepseek-chat",
        temperature=0,
        openai_api_key=llm_api_key,
        openai_api_base="https://api.deepseek.com",
    )
    
    # Define a Pydantic model for structured output
    class Person(BaseModel):
        name: str = Field(description="The name of the person")
        age: int = Field(description="The age of the person")
        email: str = Field(description="The email address of the person")
    
    # Query the model without structured output (this works)
    response = model.invoke("Extract the name, age, and email of John Doe, who is 30 years old and has the email [email protected].")
    print(response)
  3. Use with_structured_output() to enforce the Pydantic model and query the model with structured output:

    structured_model = model.with_structured_output(Person)
    
    # Query the model with structured output (this call fails)
    response = structured_model.invoke("Extract the name, age, and email of John Doe, who is 30 years old and has the email [email protected].")
    print(response)
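Note: recent langchain-openai releases default with_structured_output() to the "json_schema" method (stated here as an assumption; the default varies by version), so step 3 should be equivalent to this explicit form:

    # Explicit form of step 3 (sketch): assumes the default method is "json_schema",
    # which asks the API for response_format={"type": "json_schema", ...}.
    structured_model = model.with_structured_output(Person, method="json_schema")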

Error Message and Stack Trace (if applicable)

Actual Behavior:

The plain invoke() call in step 2 works and returns an unstructured answer:

content='Here is the extracted information:\n\n- **Name**: John Doe  \n- **Age**: 30  \n- **Email**: [email protected]' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 30, 'prompt_tokens': 33, 'total_tokens': 63, 'completion_tokens_details': None, 'prompt_tokens_details': None, 'prompt_cache_hit_tokens': 0, 'prompt_cache_miss_tokens': 33}, 'model_name': 'deepseek-chat', 'system_fingerprint': 'fp_3a5770e1b4', 'finish_reason': 'stop', 'logprobs': None} id='run-d078cad9-42a0-4be0-9e92-a593002a8606-0' usage_metadata={'input_tokens': 33, 'output_tokens': 30, 'total_tokens': 63, 'input_token_details': {}, 'output_token_details': {}}

The structured call in step 3 then raises an UnprocessableEntityError indicating that the response_format type json_schema is unavailable:

---------------------------------------------------------------------------
UnprocessableEntityError                  Traceback (most recent call last)
<ipython-input-14-83b3c0097ccc> in <cell line: 18>()
     16 
     17 # Query the model with structured output
---> 18 response = structured_model.invoke("Extract the name, age, and email of John Doe, who is 30 years old and has the email [email protected].")
     19 print(response)

/usr/local/lib/python3.10/dist-packages/langchain_core/runnables/base.py in invoke(self, input, config, **kwargs)
   3018                 context.run(_set_config_context, config)
   3019                 if i == 0:
-> 3020                     input = context.run(step.invoke, input, config, **kwargs)
   3021                 else:
   3022                     input = context.run(step.invoke, input, config)

/usr/local/lib/python3.10/dist-packages/langchain_core/runnables/base.py in invoke(self, input, config, **kwargs)
   5350         **kwargs: Optional[Any],
   5351     ) -> Output:
-> 5352         return self.bound.invoke(
   5353             input,
   5354             self._merge_configs(config),

/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py in invoke(self, input, config, stop, **kwargs)
    284         return cast(
    285             ChatGeneration,
--> 286             self.generate_prompt(
    287                 [self._convert_input(input)],
    288                 stop=stop,

/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py in generate_prompt(self, prompts, stop, callbacks, **kwargs)
    784     ) -> LLMResult:
    785         prompt_messages = [p.to_messages() for p in prompts]
--> 786         return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
    787 
    788     async def agenerate_prompt(

/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py in generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
    641                 if run_managers:
    642                     run_managers[i].on_llm_error(e, response=LLMResult(generations=[]))
--> 643                 raise e
    644         flattened_outputs = [
    645             LLMResult(generations=[res.generations], llm_output=res.llm_output)  # type: ignore[list-item]

/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py in generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
    631             try:
    632                 results.append(
--> 633                     self._generate_with_cache(
    634                         m,
    635                         stop=stop,

/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py in _generate_with_cache(self, messages, stop, run_manager, **kwargs)
    849         else:
    850             if inspect.signature(self._generate).parameters.get("run_manager"):
--> 851                 result = self._generate(
    852                     messages, stop=stop, run_manager=run_manager, **kwargs
    853                 )

/usr/local/lib/python3.10/dist-packages/langchain_openai/chat_models/base.py in _generate(self, messages, stop, run_manager, **kwargs)
    771             payload.pop("stream")
    772             try:
--> 773                 response = self.root_client.beta.chat.completions.parse(**payload)
    774             except openai.BadRequestError as e:
    775                 _handle_openai_bad_request(e)

/usr/local/lib/python3.10/dist-packages/openai/resources/beta/chat/completions.py in parse(self, messages, model, audio, response_format, frequency_penalty, function_call, functions, logit_bias, logprobs, max_completion_tokens, max_tokens, metadata, modalities, n, parallel_tool_calls, prediction, presence_penalty, reasoning_effort, seed, service_tier, stop, store, stream_options, temperature, tool_choice, tools, top_logprobs, top_p, user, extra_headers, extra_query, extra_body, timeout)
    158             )
    159 
--> 160         return self._post(
    161             "/chat/completions",
    162             body=maybe_transform(

/usr/local/lib/python3.10/dist-packages/openai/_base_client.py in post(self, path, cast_to, body, options, files, stream, stream_cls)
   1281             method="post", url=path, json_data=body, files=to_httpx_files(files), **options
   1282         )
-> 1283         return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
   1284 
   1285     def patch(

/usr/local/lib/python3.10/dist-packages/openai/_base_client.py in request(self, cast_to, options, remaining_retries, stream, stream_cls)
    958             retries_taken = 0
    959 
--> 960         return self._request(
    961             cast_to=cast_to,
    962             options=options,

/usr/local/lib/python3.10/dist-packages/openai/_base_client.py in _request(self, cast_to, options, retries_taken, stream, stream_cls)
   1062 
   1063             log.debug("Re-raising status error")
-> 1064             raise self._make_status_error_from_response(err.response) from None
   1065 
   1066         return self._process_response(

UnprocessableEntityError: Failed to deserialize the JSON body into the target type: response_format: response_format.type `json_schema` is unavailable now at line 1 column 626
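For context, the traceback shows langchain-openai calling root_client.beta.chat.completions.parse(), which submits a response_format of type json_schema. A minimal reproduction against the raw OpenAI client might look like the sketch below (the exact payload LangChain builds may differ; the schema shown is illustrative):

    from openai import OpenAI

    client = OpenAI(api_key="<DEEPSEEK_API_KEY>", base_url="https://api.deepseek.com")

    # Per the error above, DeepSeek rejects response_format type "json_schema"
    # with a 422, even though the same request shape is valid for api.openai.com.
    client.chat.completions.create(
        model="deepseek-chat",
        messages=[{"role": "user", "content": "Describe John Doe as JSON."}],
        response_format={
            "type": "json_schema",
            "json_schema": {
                "name": "Person",
                "schema": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "age": {"type": "integer"},
                        "email": {"type": "string"},
                    },
                    "required": ["name", "age", "email"],
                },
            },
        },
    )  # raises openai.UnprocessableEntityError when pointed at api.deepseek.com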

Description

When using ChatOpenAI() with DeepSeek V3 in LangChain, the with_structured_output() method fails to enforce structured output formats (e.g., Pydantic models). The API returns an error indicating that the response_format type json_schema is unavailable. This prevents the use of structured output functionality, which is critical for applications that require consistent, predictable data formats.
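A possible workaround until this is fixed or documented: ask LangChain to emulate structured output via tool calling rather than the json_schema response format. This is a sketch and assumes deepseek-chat supports OpenAI-style function calling:

    # Workaround sketch: method="function_calling" avoids response_format
    # {"type": "json_schema"} by extracting fields through a tool call instead.
    structured_model = model.with_structured_output(Person, method="function_calling")
    response = structured_model.invoke(
        "Extract the name, age, and email of John Doe, "
        "who is 30 years old and has the email [email protected]."
    )
    print(response)  # expected: Person(name='John Doe', age=30, email='[email protected]')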

Expected Behavior:

The model should return a structured output in the format defined by the Pydantic model:

Person(name="John Doe", age=30, email="[email protected]")
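An alternative sketch, assuming DeepSeek's documented JSON mode (response_format={"type": "json_object"}) works as advertised; note that json_mode requires the prompt itself to request JSON output:

    # Alternative workaround sketch: LangChain's method="json_mode" maps to
    # response_format={"type": "json_object"}; the prompt must mention JSON.
    structured_model = model.with_structured_output(Person, method="json_mode")
    response = structured_model.invoke(
        "Extract the name, age, and email of John Doe (30 years old, "
        "[email protected]) and respond with a JSON object."
    )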

System Info

Environment:

  • Python 3.10 (Kaggle Notebook)
  • Libraries: langchain-openai, langchain-community, langchain-experimental, langgraph
  • Model: deepseek-chat (via DeepSeek API)
langcarl bot added the investigate label on Jan 18, 2025
dosubot bot added the Ɑ: models and 🤖:bug labels on Jan 18, 2025