fdb decodes message with a system encoding while it's encoded using server encoding #105

Open
nikto-b opened this issue Feb 3, 2022 · 4 comments


nikto-b commented Feb 3, 2022

How to reproduce:

  1. Run Firebird 2.0 under Windows 10 (default charset is CP1251)
  2. Run python3 under Linux (default charset is UTF-8) with fdb==2.0.2
  3. Run a procedure that raises an exception whose message contains Cyrillic characters
  4. See the error: 'utf-8' codec can't decode byte 0xf2 in position 0: invalid continuation byte

The stack trace points to fbcore.py:607.
A possible solution would be to use the charset option from the connect method here, but I have no idea how to do this.
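
The failure can be reproduced without a Firebird server. This is a minimal sketch, assuming the exception text is the hypothetical string "тест" sent as CP1251 bytes; the helper name try_decode is made up for illustration and is not part of fdb.

```python
# Hypothetical message text; any Cyrillic string behaves similarly.
raw = "тест".encode("cp1251")  # bytes as a CP1251 server would send them


def try_decode(data, encoding):
    """Return (text, None) on success or (None, error) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


text, err = try_decode(raw, "utf-8")   # what a UTF-8 client attempts
fixed, _ = try_decode(raw, "cp1251")   # decoding with the server's charset

print(err)    # the UnicodeDecodeError raised by the UTF-8 attempt
print(fixed)  # the original text, recovered with the right codec
```

The first byte of the CP1251 data is 0xf2, the same byte reported in the error above.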


kmateusz186 commented May 11, 2022

I ran into the same problem with a WIN1250 charset on the database connection. I solved it by creating a global variable, overwriting it in the connect method, and then using it in the exception_from_status method:
def exception_from_status(error, status, preamble=None):
    .......
    if PYTHON_MAJOR_VER == 3:
        msglist.append('- ' + (msg.value).decode(GLOBAL_VAR_NAME))
I don't know if this is the best solution, but it works.


iwkse commented Oct 13, 2022

I ran into a similar bug and solved it by adding the "replace" option to the decode function.

@PracticallyNothing

Apologies for bumping an old issue.

We've also had to deal with this problem in 2024.

We have both a Python backend and a Firebird 4 DB running on Linux. The database is encoded using cp1251/WIN1251 for legacy reasons, while the backend speaks UTF-8. All queries with text in WIN1251 are converted to UTF-8 without problems, since we've set the encoding for the database when creating the connection. However, any exceptions containing cyrillic characters raise decoding errors in Python.

We've held off on changing over to the new Python driver due to an issue with how BLOBs are handled and how that relates to the SQLAlchemy driver for Firebird.

I admit that we haven't tested whether this is actually the case, but having a look at the new driver's source code, it seems to also suffer from this issue, since it uses locale.getpreferredencoding() to determine how exceptions should be decoded.

The proposed solutions have some problems:

  • adding errors="replace" to .decode() risks losing the information contained in the exception
  • setting a global variable for the encoding doesn't work if you connect to databases with different encodings
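
Both problems can be illustrated in isolation. This is a rough sketch under assumptions: the message text "тест", the GLOBAL_ENCODING variable and the connect stand-in are all hypothetical and only mimic the workarounds described in this thread, not fdb's actual code.

```python
raw = "тест".encode("cp1251")  # hypothetical CP1251 exception text

# Problem 1: errors="replace" avoids the crash but discards the message.
lossy = raw.decode("utf-8", errors="replace")
print(set(lossy) == {"\ufffd"})  # only U+FFFD replacement characters remain

# Problem 2: a single global encoding is overwritten by the latest connection.
GLOBAL_ENCODING = None  # hypothetical module-level setting


def connect(charset):
    """Stand-in for a connect call that records the charset globally."""
    global GLOBAL_ENCODING
    GLOBAL_ENCODING = charset


connect("cp1251")  # first database
connect("cp1250")  # second database overwrites the setting

# A message from the first database is now decoded with the wrong charset.
wrong = raw.decode(GLOBAL_ENCODING)
print(wrong == "тест")
```

In the second case no exception is raised at all, so the garbled text can go unnoticed.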

The solution we've found works best for our case is to use the same encoding as the connection to the database, since it's more likely that the database will also use that encoding for its exceptions.

This means that, in fdb/fbcore.py, we have to:

  1. add a new parameter, encoding, to exception_from_status and use it to decode the exception message
591c591
< def exception_from_status(error, status, preamble=None):
---
> def exception_from_status(error, status, preamble=None, encoding=None):
607c607
<                 msglist.append('- ' + (msg.value).decode(sys_encoding))
---
>                 msglist.append("- " + (msg.value).decode(charset_map.get(encoding, encoding) or sys_encoding))
  2. find all the places where exception_from_status is called and provide a value for the new parameter
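
The decoding expression from the diff can be tried outside of fdb. This is a standalone sketch, not the patch itself: CHARSET_MAP is a small hypothetical stand-in for fdb's charset_map, and decode_status_message is a made-up helper that only mirrors the lookup and fallback order.

```python
import locale

# Hypothetical stand-in for fdb's charset_map (Firebird name -> Python codec).
CHARSET_MAP = {"WIN1250": "cp1250", "WIN1251": "cp1251", "UTF8": "utf_8"}


def decode_status_message(raw, encoding=None, sys_encoding=None):
    """Decode message bytes with the connection charset when one is given."""
    if sys_encoding is None:
        sys_encoding = locale.getpreferredencoding()
    # Translate a Firebird charset name if it is known, pass other names
    # through unchanged, and fall back to the system encoding for None.
    codec = CHARSET_MAP.get(encoding, encoding) or sys_encoding
    return raw.decode(codec)


message = decode_status_message("тест".encode("cp1251"), encoding="WIN1251")
print(message)
```

Callers that do not pass an encoding keep the previous behaviour, which is why the parameter defaults to None.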

We do have a patch file that fixes this issue and can be applied to fdb/fbcore.py. However, I'm reluctant to turn it into a pull request, since we don't have any tests we can provide, and we aren't sure we found every place where this issue occurs.

Contributor

pcisar commented Jul 5, 2024

Well, the core of this problem is that some error messages may be encoded in the OS encoding of the server (paths, filenames, etc.). In your case it happens to be the same as the database encoding, so your solution works fine for you, but it fails in other cases. Hence I'm reluctant to adopt this approach. I agree that this should be configurable, ideally at the connection level (both database and server). I'll see what I can do about that, but I'll fix it in firebird-driver first, as that's easier with its separate configuration scheme. Then I'll see if something can be done with FDB.
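
One way to picture a configurable approach is an ordered list of candidate encodings per connection. The sketch below is only an assumption about what such a setting could look like; decode_message and its candidates parameter are hypothetical and do not describe how firebird-driver or FDB are or will be configured.

```python
def decode_message(raw, candidates=("utf-8",)):
    """Try each candidate codec strictly, then fall back to escaped bytes."""
    for codec in candidates:
        try:
            return raw.decode(codec)
        except UnicodeDecodeError:
            continue
    # No candidate fit: keep undecodable bytes visible as \xNN escapes
    # instead of dropping them.
    return raw.decode("ascii", errors="backslashreplace")


# Hypothetical setup: UTF-8 first, then the database charset.
print(decode_message("тест".encode("cp1251"), candidates=("utf-8", "cp1251")))
```

Order matters here: single-byte codecs such as cp1251 accept almost any byte sequence, so they should come after stricter ones like UTF-8, and a wrong guess can still produce readable but incorrect text.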
