fitz - pdf to image conversion - some text characters are getting converted to junk #1627
Thanks for reporting this so well prepared. I also received the material via e-mail.
Unfortunately, I cannot do anything about this, because it is an upstream (MuPDF) issue. Your files are created in the wrong way: they use non-embedded fonts like Times Roman, but with `Identity` encoding instead of, e.g., `WinAnsiEncoding`. Fonts using `Identity-H` encoding must be embedded, however.
This problem is exhibited by any PDF viewer if you try to copy / paste those problematic text portions: it will copy garbage.
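If you want to check other files for the same combination, here is a minimal sketch. It assumes that `page.get_fonts()` returns tuples of the form `(xref, ext, type, basefont, name, encoding)` and that an `ext` value of `"n/a"` means no font file can be extracted, which I am treating as "not embedded". The helper names `is_suspect`, `suspect_fonts` and `scan_pdf` are made up for this example and are not part of PyMuPDF.

```python
def is_suspect(font_entry):
    """Flag a font entry that looks non-embedded but uses an Identity encoding.

    font_entry is assumed to be (xref, ext, type, basefont, name, encoding, ...).
    """
    ext = font_entry[1]
    encoding = font_entry[5] or ""  # encoding may be empty
    not_embedded = ext == "n/a"     # assumption: "n/a" means no extractable font file
    return not_embedded and encoding.startswith("Identity")


def suspect_fonts(font_entries):
    """Return only the entries for which is_suspect() is true."""
    return [f for f in font_entries if is_suspect(f)]


def scan_pdf(path):
    """Map page number -> suspect font entries for every page of the PDF."""
    import fitz  # PyMuPDF; imported here so the helpers above work without it

    report = {}
    doc = fitz.open(path)
    try:
        for page in doc:
            hits = suspect_fonts(page.get_fonts())
            if hits:
                report[page.number] = hits
    finally:
        doc.close()
    return report
```

Calling `scan_pdf("input.pdf")` (the filename is a placeholder) gives you a dictionary of the pages that reference such fonts. Pages listed there are the ones where rendering or text extraction may produce junk.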
One could argue that most viewers still render the page ok, so why does MuPDF rendering not do this?
This I cannot answer, so I must refer you to MuPDF's bug tracker https://bugs.ghostscri…