How can I send a PDF document, such as an invoice, to get its metadata? #199
I have used ChatRequest in the past to send an image and get the extracted text, and it works great. But when I try to send a PDF, I get an error because only images are allowed:

    Invalid MIME type. Only image types are supported

This is the request I am building:

    ChatRequest chatRequest = ChatRequest.builder()
            .model("gpt-4o")
            .messages(List.of(
                    ChatMessage.UserMessage.of(
                            List.of(
                                    ContentPart.ContentPartText.of("extract metadata from this invoice"),
                                    // Fails when mimeType is a PDF: only image MIME types are accepted
                                    ContentPart.ContentPartImageUrl.of(loadImageAsBase64(mimeType, content))
                            )
                    )
            ))
            .responseFormat(ResponseFormat.jsonSchema(ResponseFormat.JsonSchema.builder()
                    .name(InvoiceMetadata.class.getSimpleName())
                    .schemaClass(InvoiceMetadata.class)
                    .build()))
            .build();

Any idea?
Replies: 1 comment 1 reply
Hi @monkiki, thanks for using simple-openai!

You are referring to the Vision functionality, which is part of the Chat Completions API. According to the OpenAI documentation, Vision supports only the following file types: png, jpg, jpeg, webp, and gif. So you cannot send PDF files directly; however, you could convert them to image files using external tools.

Another option is to use the Assistants API with the File Search tool, which accepts PDF files and lets you ask questions about their content. I invite you to read the documentation about it and use simple-openai for that purpose.
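To make the file-type restriction concrete, here is a minimal sketch of a guard that could run before the request is built. It uses only the JDK and is not part of simple-openai: the class name `VisionImageInput` and its methods are hypothetical, and the base64 data URL format is an assumption about what a helper like `loadImageAsBase64(mimeType, content)` from the question would return. The allowed MIME types mirror the formats listed above (jpg and jpeg both map to `image/jpeg`).

```java
import java.util.Base64;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; it is not part of simple-openai.
final class VisionImageInput {

    // MIME types for the formats listed in the reply: png, jpg/jpeg, webp, gif.
    private static final Set<String> SUPPORTED_MIME_TYPES =
            Set.of("image/png", "image/jpeg", "image/webp", "image/gif");

    private VisionImageInput() {
    }

    // Returns true when the MIME type is one of the supported image types.
    static boolean isSupported(String mimeType) {
        return mimeType != null
                && SUPPORTED_MIME_TYPES.contains(mimeType.trim().toLowerCase(Locale.ROOT));
    }

    // Builds a base64 data URL, rejecting anything that is not a supported image.
    static String toDataUrl(String mimeType, byte[] content) {
        if (!isSupported(mimeType)) {
            throw new IllegalArgumentException(
                    "Unsupported MIME type for Vision: " + mimeType
                            + ". Convert the file to PNG, JPEG, WEBP, or GIF first.");
        }
        String normalized = mimeType.trim().toLowerCase(Locale.ROOT);
        return "data:" + normalized + ";base64,"
                + Base64.getEncoder().encodeToString(content);
    }

    public static void main(String[] args) {
        byte[] sample = {1, 2, 3};
        System.out.println(toDataUrl("image/png", sample));
        System.out.println(isSupported("application/pdf"));
    }
}
```

With a check like this, a PDF invoice is rejected locally with a clear message instead of failing at request time, which signals that it needs to be converted to images (one per page) or routed to the File Search option instead.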