Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Python] Render enums as Python IntEnum #8145

Merged
merged 1 commit into from
Jun 3, 2024

Conversation

fliiiix
Copy link
Contributor

@fliiiix fliiiix commented Nov 4, 2023

This allows enums to be type check with mypy.
They will still behave like ints ->

IntEnum is the same as Enum,
but its members are also integers and can be used anywhere
that an integer can be used.
If any integer operation is performed with an IntEnum member,
the resulting value loses its enumeration status.
https://docs.python.org/3/library/enum.html#enum.IntEnum

I think that should be fair enough since an enum is always a int:

Only integer types are allowed, i.e. byte, ubyte, short ushort, int, uint, long and ulong.

@github-actions github-actions bot added python c++ codegen Involving generating code from schema labels Nov 4, 2023
@fliiiix
Copy link
Contributor Author

fliiiix commented Nov 4, 2023

I got a bit lost on how ImportMapEntry works but as far as i can see its not required to be passed along, but if required i can refactor to pass along the import statement.

@akb825
Copy link
Contributor

akb825 commented Nov 7, 2023

To avoid compatibility issues this should probably only be enabled when --python-typing is set on the command line.

@fliiiix
Copy link
Contributor Author

fliiiix commented Nov 7, 2023

I thought about that, but this while helping with type checking is in my opinion the correct way to generate an enum in python,
the cpp code does also generate an enum type

enum Color {
  Color_Red = 0,
  Color_Green = 1,
  Color_Blue = 2,
  Color_MIN = Color_Red,
  Color_MAX = Color_Blue
};

and i could not come up with anyway this would break existing code

But if there good arguments against doing this by default im open to move this also behind the --python-typing flag.

@akb825
Copy link
Contributor

akb825 commented Nov 8, 2023

and i could not come up with anyway this would break existing code

Flatbuffers currently supports Python 2 (at least it's still advertised in setup.py), while Python enums were added in 3.4. You could question whether Python 2 support should be dropped given that it's EOL, which would be a question for the maintainers, but I'd imagine that breaking changes shouldn't be added until that officially happens.

the cpp code does also generate an enum type

It's worth noting that the default enums for C++ are the old C-style enums. With the --scoped-enum command-line option it creates the newer C++11 enum classes. In a lot of ways that's similar to the existing Python enum as constants in an object compared to using IntEnum.

@fliiiix
Copy link
Contributor Author

fliiiix commented Nov 10, 2023

@akb825 Fair enough, i missed the Python 2 support its behind the flag for now.

@akb825
Copy link
Contributor

akb825 commented Nov 11, 2023

Thanks! Don't forget to also update the generated code in the tests directory.

Another thought is that if you want to get the full benefit of type checking, you will probably also want to adjust the type annotations in the generated code (when --python-typing is set) for functions that take and return enum values to use the actual specific enum types rather than int. I'm not sure how difficult that would be given the current internal model to know the real type and manage the imports, and it may require some adjustments to the serialization code to assign the enum values appropriately.

Also, full disclosure: I'm not a maintainer and won't be able to approve this PR. The maintainers haven't been very active lately, so it may take some time to get it approved. (I myself noticed this PR when checking to see if there's been any activity lately since I'm waiting on getting a bugfix approved from late May...)

@fliiiix fliiiix force-pushed the feature/python-enum-typed branch from 91de368 to 841f9be Compare December 11, 2023 18:53
@fliiiix
Copy link
Contributor Author

fliiiix commented Jan 20, 2024

@dbaileychess what is missing to get this merged?

@fliiiix fliiiix force-pushed the feature/python-enum-typed branch from 841f9be to 1656ded Compare February 8, 2024 15:33
@fliiiix
Copy link
Contributor Author

fliiiix commented Feb 8, 2024

Rebased to latest master, anything else required to get this merged? @dbaileychess

@colinRawlings
Copy link

@dbaileychess Would be wonderful to see this go in

@fliiiix fliiiix force-pushed the feature/python-enum-typed branch from 48fb850 to dfd66a6 Compare March 28, 2024 07:33
@fliiiix fliiiix force-pushed the feature/python-enum-typed branch from d64ce3c to 8eb4508 Compare May 20, 2024 20:25
@fliiiix fliiiix force-pushed the feature/python-enum-typed branch 2 times, most recently from f1fcd60 to 5cb1e2f Compare June 1, 2024 15:09
@fliiiix
Copy link
Contributor Author

fliiiix commented Jun 1, 2024

Updated and rebased on @anton-bobukh changes - any chance this can be merged?

@anton-bobukh
Copy link
Collaborator

I've recently separated the generation of .pyi and .py files, and for this, you would need to update the stub generator: https://github.com/google/flatbuffers/blob/master/src/idl_gen_python.cpp#L591-L607.

Also, as pointed out above, enum is not a thing in Python2, and we should probably only use enum.IntEnum when the --python-version flag is set to Python3 only.

From what I know about enum.IntEnum, unlike C++ enums, it does not allow arbitrary values which will affect the backward-forward compatibility of the schema/generated code. For example, if the client adds a new enum value to the schema and sends it over to the server before the server is updated. In this case, we still want the server code to be able to handle the unknown value, not raise a ValueError.
One way to solve this would be to only update the stubs and keep the .py files unchanged, let the generated functions to still use ints.

@fliiiix
Copy link
Contributor Author

fliiiix commented Jun 1, 2024

for this, you would need to update the stub generator

good point i kinda assumed that should not be affected but that sounds wrong and i will look into it

and we should probably only use enum.IntEnum when the --python-version flag is set to Python3 only

It's behind the --python-typing. I don't understand the difference between this flag and the new --python-version flag
https://github.com/google/flatbuffers/pull/8145/files#diff-7c5e528085818515d85558785f93cc109a78c0c683d6fc2555fad55c09b5ee3eR679

At least my understanding is none of the typing stuff will work in Python 2.
For example combining --python-typing and selecting python2 will always give you invalid python2 code.

In this case, we still want the server code to be able to handle the unknown value, not raise a ValueError.

Can you explain to me how that currently works?
As far as I can tell this problem exists in exactly the same way with the old code
there is no way to add a value which does not exist

from enum import IntEnum

class Foo():
    Value1 = 1

class Bar(IntEnum):
    Value2 = 2

foo = Foo(3)
bar = Bar(3)

One way to solve this would be to only update the stubs and keep the .py files unchanged, let the generated functions to still use ints.

I will look into that 👍

@anton-bobukh
Copy link
Collaborator

My plan was to remove any type annotation from the generated .py file and fully rely on .pyi file for type checking, so --python-version controls the generation of .py files (e.g. doing class Fo(object): vs class Foo:), while -python-typing controls the generation of .pyi files.

I just realized that you don't change the way enums are read. Currently, we read them as integers (e.g.

def B(self): return self._tab.Get(flatbuffers.number_types.Int8Flags, self._tab.Pos + flatbuffers.number_types.UOffsetTFlags.py_type(8))
), and backward-forward compatibility will only be an issue if we wrap those integers into the corresponding enum class. Doing so will raise a ValueError for unknown values (e.g. when the a new enum value is added and the client is updated to send that new value before the server is updated to handle it).

This allows enums to be type check with mypy.
They will still behave like ints ->
> IntEnum is the same as Enum,
> but its members are also integers and can be used anywhere
> that an integer can be used.
> If any integer operation is performed with an IntEnum member,
> the resulting value loses its enumeration status.
https://docs.python.org/3/library/enum.html#enum.IntEnum

Only if the --python-typing flag is set.
@fliiiix fliiiix force-pushed the feature/python-enum-typed branch from 5cb1e2f to fc4efd6 Compare June 2, 2024 09:58
@fliiiix
Copy link
Contributor Author

fliiiix commented Jun 2, 2024

I just realized that you don't change the way enums are read.

I switched it now to put all this info in the .pyi files so i touch the enums even less.
This seems to be a relative simple and clean solution and in line with your plan, what do you think about this proposal?

@anton-bobukh anton-bobukh merged commit dafd2f1 into google:master Jun 3, 2024
49 checks passed
@bagring
Copy link

bagring commented Aug 8, 2024

Hi there,
please, can anyone explain to me, what is the point of annotating a type as (Int)Enum while the real type of the data remains "object"? Isn't it just a deliberate lie? The output type has remained "object" for looong time because of compatibility with Python 2. Since this commit, the new --python-version flag allows to forget about Python 2 for good.... if you just use this flag for the code generator as well (not just for the naming as the linked commit does).
I would like to tell the generator "yes, I do not bother about EOL products, hit me with the features of the Python3". The --python-version seemed to be a good candidate for this... until I discovered that it is currently used just to escape keywords.

EDIT: Type of the enum is actually "object".

@fliiiix
Copy link
Contributor Author

fliiiix commented Aug 9, 2024

Fair point with --python-version set to 3 and higher i guess we could render it as IntEnum instead of object. Putting this info in the typeshed for the type checker was good enough for my use case.

jochenparm pushed a commit to jochenparm/flatbuffers that referenced this pull request Oct 29, 2024
This allows enums to be type check with mypy.
They will still behave like ints ->
> IntEnum is the same as Enum,
> but its members are also integers and can be used anywhere
> that an integer can be used.
> If any integer operation is performed with an IntEnum member,
> the resulting value loses its enumeration status.
https://docs.python.org/3/library/enum.html#enum.IntEnum

Only if the --python-typing flag is set.
jochenparm pushed a commit to jochenparm/flatbuffers that referenced this pull request Oct 29, 2024
This allows enums to be type check with mypy.
They will still behave like ints ->
> IntEnum is the same as Enum,
> but its members are also integers and can be used anywhere
> that an integer can be used.
> If any integer operation is performed with an IntEnum member,
> the resulting value loses its enumeration status.
https://docs.python.org/3/library/enum.html#enum.IntEnum

Only if the --python-typing flag is set.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ codegen Involving generating code from schema python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants