Source text doesn't equal text after encoding and decoding #9

Open
mikevolgo opened this issue Apr 24, 2024 · 3 comments

Comments

@mikevolgo

Hi,

I've found a case where the text after encoding and decoding is not equal to the source text. Example:

text = 'ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑܧ¿abcdefghijklmnopqrstuvwxyzäöñüà'
print(codec.decode(codec.encode(text)) == text)
False
print(codec.decode(codec.encode(text)))
ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑܧ¿abcdefghijklmnopqrstuvwxyzäöñüà@

We can see that an extra "@" symbol is appended after encoding and decoding.

@sacha-senchuk
Owner

Hi,

Thanks for the report.

Do you already have an idea why this might happen?

@sacha-senchuk
Owner

sacha-senchuk commented May 10, 2024

It seems like you have been using the GSM encoding.

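GSM 7-bit packing has a caveat that requires padding in certain situations: when the number of characters is 8n - 1 (your text has 63), the last octet is left with 7 spare zero bits, which decode as an extra "@":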

https://github.com/qotto/smspdudecoder/blob/master/smspdudecoder/codecs.py#L87

In your case, you should consider using the following code:

from smspdudecoder.codecs import GSM

text = 'ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑܧ¿abcdefghijklmnopqrstuvwxyzäöñüà'

assert GSM.decode(GSM.encode(text, with_padding=True), strip_padding=True) == text
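
To illustrate where the extra "@" comes from, here is a minimal, self-contained sketch of 7-bit packing. It does not use the library's code: `pack_septets` and `unpack_septets` are hypothetical helpers written for this example, and it works on raw septet values (0x41 for "A", 0x00 for "@", 0x0D for carriage return in the GSM default alphabet) rather than on text. It assumes the decoder does not know the original character count and that a trailing carriage return is used as padding.

```python
def pack_septets(septets):
    """Pack 7-bit values into octets, least significant bits first."""
    bits = 0    # bit accumulator
    nbits = 0   # number of valid bits currently in the accumulator
    out = bytearray()
    for s in septets:
        bits |= (s & 0x7F) << nbits
        nbits += 7
        while nbits >= 8:
            out.append(bits & 0xFF)
            bits >>= 8
            nbits -= 8
    if nbits:
        # Last partial octet: the unused high bits stay zero.
        out.append(bits & 0xFF)
    return bytes(out)


def unpack_septets(octets):
    """Unpack every complete 7-bit group found in the octets.

    Assumption: the original character count is unknown, so all
    complete septets are returned, including any made of spare bits.
    """
    bits = int.from_bytes(octets, "little")
    count = len(octets) * 8 // 7
    return [(bits >> (7 * i)) & 0x7F for i in range(count)]


AT = 0x00  # "@" in the GSM default alphabet
CR = 0x0D  # carriage return, used here as the padding character

septets = [0x41] * 63  # 63 x "A", i.e. 8n - 1 characters with n = 8

# Without padding: 63 * 7 = 441 bits fill 56 octets with 7 spare zero bits.
packed = pack_septets(septets)
unpacked = unpack_septets(packed)

# With padding: append CR before packing, drop it again after unpacking.
packed_padded = pack_septets(septets + [CR])
unpacked_padded = unpack_septets(packed_padded)
if unpacked_padded and unpacked_padded[-1] == CR:
    unpacked_padded = unpacked_padded[:-1]

print(len(unpacked), unpacked[-1] == AT)
print(unpacked_padded == septets)
```

Without padding, the 7 spare zero bits are read back as a 64th septet with value 0x00, which is "@". With the carriage return appended, the spare bits carry a known padding character that can be stripped after decoding.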

I probably need to create a new version of the package where padding is enabled by default, to be in line with the GSM specifications.

@sacha-senchuk
Owner

Hello, a quick update here.

This will be taken care of in the upcoming v3 of the library.
