utf-8 codec decoding error #22
Comments
It certainly doesn't look like the contents of your barcode are valid UTF-8: a lot of it appears to be horribly mangled.
It's mainly a bug in how the ZXing command-line runner (which this Python module relies on as a wrapper script!) handles its output: namely, it mangles bytes that aren't validly encoded… and the encoding it targets depends on the operating system being used. On Windows, this mostly works, because the default encoding can reversibly encode and decode unknown bytes, but on Linux it's UTF-8, where certain byte sequences are invalid, and they get mangled by the ZXing Java command-line-runner to an extent that the Python wrapper can't recover them. See #17 (comment), specifically points (2) and (3) for more thoughts:
One possible fix for this is #19.
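The difference between the two platforms can be illustrated with a small Python sketch. It is only an analogy: it does not call ZXing, and it uses Latin-1 as a stand-in for a single-byte default encoding (an assumption; the actual Windows default and Java's handling of it may differ in detail). It shows that arbitrary bytes survive a round trip through a single-byte encoding, but not through UTF-8 once invalid sequences have been replaced.

```python
# Every possible byte value, standing in for arbitrary binary barcode content.
raw = bytes(range(256))

# Single-byte encoding (Latin-1 used here as an illustrative stand-in):
# each byte maps to exactly one code point, so the round trip is lossless.
via_latin1 = raw.decode("latin-1").encode("latin-1")

# UTF-8: bytes that do not form valid sequences are replaced with U+FFFD
# on decoding, so the original values cannot be recovered afterwards.
via_utf8 = raw.decode("utf-8", errors="replace").encode("utf-8")

print(via_latin1 == raw)  # True
print(via_utf8 == raw)    # False
```

Once the replacement has happened on the Java side, nothing the Python wrapper does to the resulting text can restore the original bytes.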
It's possible to use non-default encodings in PDF417 (and other 2D barcodes like QR or DataMatrix), but you're at the mercy of having encoding and decoding and wrapping software that understands how to do this exactly right… and most of it doesn't. I'll make the immodest claim that I understand and care about this more than almost anyone in the world, and have contributed code to the Java ZXing library to improve the correctness of its handling of nonstandard character sets (see zxing/zxing#1328, zxing/zxing#1330), but I haven't had time to improve the CommandLineRunner's output.
I'm coming here because of the same problem - trying to decode example barcodes from Deutsche Bahn (https://www.bahn.de/angebot/regio/barcode, there's a zip folder "Muster-Tickets nach UIC 918.9 (ZIP, 2 MB)"). The Aztec code contains no valid utf-8 - most of it is in zip format. So to process it, the library needs to return the bytes in the raw content to be useful here. Patching the
It would be useful to have an optional parameter that switches the raw result to bytes and turns off any parsing attempts.
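A minimal sketch of what such a switch could look like, assuming the raw bytes were available to the wrapper in the first place. The function `interpret_raw` and its `as_bytes` parameter are hypothetical names invented for this illustration; they are not part of this module's API. The zlib-compressed payload is only a stand-in for compressed ticket data.

```python
import zlib
from typing import Union


def interpret_raw(raw: bytes, as_bytes: bool = False) -> Union[bytes, str]:
    """Hypothetical helper: return barcode content as bytes or as text."""
    if as_bytes:
        # Caller asked for the untouched payload: no decoding, no parsing.
        return raw
    # Default behaviour: treat the content as UTF-8 text (may raise).
    return raw.decode("utf-8")


# Compressed data is binary and, in general, not valid UTF-8.
payload = zlib.compress(b"example ticket data" * 10)

# With the switch on, the bytes come back unchanged and can be decompressed.
assert zlib.decompress(interpret_raw(payload, as_bytes=True)).startswith(b"example")

# With the switch off, decoding the same payload as text fails.
try:
    interpret_raw(payload)
except UnicodeDecodeError:
    print("payload is not valid UTF-8; request bytes instead")
```

Keeping text decoding as the default would leave existing callers unaffected, while binary payloads such as compressed tickets could opt in to bytes.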
Hello,
I'm running a simple test, trying to read a PDF417 barcode. I'm getting this error:
Here is my code:
And here is the image to decode:
I used the zxing online decoder https://zxing.org/w/decode.jspx and it works fine, so I want to know if I'm doing something wrong or it is a bug.
Thanks,