When an image contains Japanese text, that text is translated into English in the output.
from markitdown import MarkItDown
from openai import OpenAI
client = OpenAI()
md = MarkItDown(llm_client=client, llm_model="gpt-4o")
result = md.convert("example.jpg")  # image containing Japanese text
print(result.text_content)  # output comes back in English
In some cases the entire output is in English, while in other cases only part of it (only the title) is in English.
Depending on the requirements of your RAG pipeline, this may not be desirable, so it would be better to be able to specify the output language, or to keep the original language found in the image.
Hi, I was looking into this issue and couldn't find anything related to OCR in the code. Based on my understanding, the library processes the image by passing it to the provided LLM along with a prompt. If no custom prompt is given, it defaults to:
"Write a detailed caption for this image."
Since the prompt is in English, the LLM likely assumes the response should also be in English. This might explain why captions are always generated in English, even if the image contains text in another language.
One way to address this would be to let users specify a preferred output language, or to check whether the LLM itself supports automatic language detection and leverage that where possible.
I’d love to work on this issue and implement a fix! Let me know if this approach makes sense or if you have any suggestions.
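As a rough sketch of the idea above, the default prompt could be extended with an explicit language rule and passed in as the custom prompt. The helper names `build_caption_prompt` and `convert_image` are hypothetical, and the `llm_prompt` keyword is an assumption about how the custom prompt reaches the image converter, so it should be checked against the installed version.

```python
from typing import Optional

# Default prompt quoted earlier in this thread.
DEFAULT_PROMPT = "Write a detailed caption for this image."


def build_caption_prompt(language: Optional[str] = None) -> str:
    """Return a caption prompt that pins the output language (hypothetical helper).

    With language=None, the model is asked to keep the language of any
    text found in the image instead of translating it.
    """
    if language is None:
        rule = (
            "If the image contains text, write the caption in the same "
            "language as that text and do not translate it."
        )
    else:
        rule = f"Write the caption in {language}."
    return f"{DEFAULT_PROMPT} {rule}"


def convert_image(path: str, language: Optional[str] = None) -> str:
    """Convert an image with a language-aware prompt (hypothetical wrapper)."""
    # Imported lazily so the prompt helper works without these packages installed.
    from markitdown import MarkItDown
    from openai import OpenAI

    md = MarkItDown(llm_client=OpenAI(), llm_model="gpt-4o")
    # Assumption: convert() forwards `llm_prompt` to the image converter.
    result = md.convert(path, llm_prompt=build_caption_prompt(language))
    return result.text_content
```

Calling `convert_image("example.jpg")` would ask the model to keep the original language, while `convert_image("example.jpg", language="Japanese")` would request Japanese explicitly. Whether the model follows the instruction reliably is something that would need testing.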