Error when importing dask_geopandas.expr on Python 3.10 when Dask query planning is disabled #311

Open
maximlt opened this issue Sep 10, 2024 · 7 comments

Comments

@maximlt

maximlt commented Sep 10, 2024

The test runs of datashader have been failing recently on Python 3.10 and the issue can be narrowed down to this piece of code:

import dask
dask.config.set({'dataframe.query-planning': False})

import dask_geopandas.expr

Traceback:

Traceback (most recent call last):
  File "/Users/mliquet/dev/datashader/.mltmess/debuglayout.py", line 3, in <module>
    import dask_geopandas.expr
  File "/Users/mliquet/dev/datashader/.pixi/envs/test-310/lib/python3.10/site-packages/dask_geopandas/expr.py", line 587, in <module>
    class GeoSeries(_Frame, dd.Series):
TypeError: multiple bases have instance lay-out conflict

I just wanted to report it in case someone else stumbles upon this odd corner case. We're going to modify datashader slightly to handle it directly. Feel free to close the issue!

@TomAugspurger
Contributor

TomAugspurger commented Sep 12, 2024

Thanks for the report.

I'm... a bit confused about how that error is being hit. I'm not able to reproduce it with just this, and the traceback doesn't make a ton of sense:

import dask
dask.config.set({'dataframe.query-planning': False})

import dask_geopandas.core
import dask_geopandas.expr

relatedly, is datashader compatible with dask-expr? I'm guessing we'll require that soon-ish, given the limited maintenance bandwidth here.

@maximlt
Author

maximlt commented Sep 13, 2024

Have you tried with Python 3.10? I'm able to reproduce this with this set of dependencies.

conda list

# Name                    Version                   Build  Channel
aws-c-auth                0.7.29               hd3c7522_1    conda-forge
aws-c-cal                 0.7.4                h41dd001_1    conda-forge
aws-c-common              0.9.28               hd74edd7_0    conda-forge
aws-c-compression         0.2.19               h41dd001_1    conda-forge
aws-c-event-stream        0.4.3                hb2a355e_1    conda-forge
aws-c-http                0.8.8                hf5a2c8c_2    conda-forge
aws-c-io                  0.14.18              hc3cb426_9    conda-forge
aws-c-mqtt                0.10.4              hb9beb3e_19    conda-forge
aws-c-s3                  0.6.5                h439c227_2    conda-forge
aws-c-sdkutils            0.1.19               h41dd001_3    conda-forge
aws-checksums             0.1.18              h41dd001_11    conda-forge
aws-crt-cpp               0.28.2               h4756f83_4    conda-forge
aws-sdk-cpp               1.11.379             h67f4a54_9    conda-forge
azure-core-cpp            1.13.0               hd01fc5c_0    conda-forge
azure-identity-cpp        1.8.0                h13ea094_2    conda-forge
azure-storage-blobs-cpp   12.12.0              hfde595f_0    conda-forge
azure-storage-common-cpp  12.7.0               hcf3b6fd_1    conda-forge
azure-storage-files-datalake-cpp 12.11.0              h082e32e_1    conda-forge
bzip2                     1.0.8                h99b78c6_7    conda-forge
c-ares                    1.33.1               hd74edd7_0    conda-forge
ca-certificates           2024.8.30            hf0a4a13_0    conda-forge
click                     8.1.7           unix_pyh707e725_0    conda-forge
cloudpickle               3.0.0              pyhd8ed1ab_0    conda-forge
dask-core                 2024.8.2           pyhd8ed1ab_0    conda-forge
dask-expr                 1.1.13             pyhd8ed1ab_0    conda-forge
dask-geopandas            0.4.1              pyhd8ed1ab_0    conda-forge
fsspec                    2024.9.0           pyhff2d567_0    conda-forge
geopandas-base            1.0.1              pyha770c72_0    conda-forge
geos                      3.12.2               h00cdb27_1    conda-forge
gflags                    2.2.2             hc88da5d_1004    conda-forge
glog                      0.7.1                heb240a5_0    conda-forge
icu                       75.1                 hfee45f7_0    conda-forge
importlib-metadata        8.5.0              pyha770c72_0    conda-forge
importlib_metadata        8.5.0                hd8ed1ab_0    conda-forge
krb5                      1.21.3               h237132a_0    conda-forge
libabseil                 20240116.2      cxx17_h00cdb27_1    conda-forge
libarrow                  17.0.0          h20538ec_13_cpu    conda-forge
libarrow-acero            17.0.0          hf9b8971_13_cpu    conda-forge
libarrow-dataset          17.0.0          hf9b8971_13_cpu    conda-forge
libarrow-substrait        17.0.0          hbf8b706_13_cpu    conda-forge
libblas                   3.9.0           23_osxarm64_openblas    conda-forge
libbrotlicommon           1.1.0                hd74edd7_2    conda-forge
libbrotlidec              1.1.0                hd74edd7_2    conda-forge
libbrotlienc              1.1.0                hd74edd7_2    conda-forge
libcblas                  3.9.0           23_osxarm64_openblas    conda-forge
libcrc32c                 1.1.2                hbdafb3b_0    conda-forge
libcurl                   8.10.0               h13a7ad3_0    conda-forge
libcxx                    18.1.8               h3ed4263_7    conda-forge
libedit                   3.1.20191231         hc8eb9b7_2    conda-forge
libev                     4.33                 h93a5062_2    conda-forge
libevent                  2.1.12               h2757513_1    conda-forge
libffi                    3.4.2                h3422bc3_5    conda-forge
libgfortran               5.0.0           13_2_0_hd922786_3    conda-forge
libgfortran5              13.2.0               hf226fd6_3    conda-forge
libgoogle-cloud           2.28.0               hfe08963_0    conda-forge
libgoogle-cloud-storage   2.28.0               h1466eeb_0    conda-forge
libgrpc                   1.62.2               h9c18a4f_0    conda-forge
libiconv                  1.17                 h0d3ecfb_2    conda-forge
liblapack                 3.9.0           23_osxarm64_openblas    conda-forge
libnghttp2                1.58.0               ha4dd798_1    conda-forge
libopenblas               0.3.27          openmp_h517c56d_1    conda-forge
libparquet                17.0.0          hf0ba9ef_13_cpu    conda-forge
libprotobuf               4.25.3               hbfab5d5_0    conda-forge
libre2-11                 2023.09.01           h7b2c953_2    conda-forge
libsqlite                 3.46.1               hc14010f_0    conda-forge
libssh2                   1.11.0               h7a5bd25_0    conda-forge
libthrift                 0.20.0               h64651cc_1    conda-forge
libutf8proc               2.8.0                h1a8c8d9_0    conda-forge
libxml2                   2.12.7               h01dff8b_4    conda-forge
libzlib                   1.3.1                hfb2fe0b_1    conda-forge
llvm-openmp               18.1.8               hde57baf_1    conda-forge
locket                    1.0.0              pyhd8ed1ab_0    conda-forge
lz4-c                     1.9.4                hb7217d7_0    conda-forge
ncurses                   6.5                  h7bae524_1    conda-forge
numpy                     2.1.1           py310hcaf17df_0    conda-forge
openssl                   3.3.2                h8359307_0    conda-forge
orc                       2.0.2                h75dedd0_0    conda-forge
packaging                 24.1               pyhd8ed1ab_0    conda-forge
pandas                    2.2.2           py310h2216879_1    conda-forge
partd                     1.4.2              pyhd8ed1ab_0    conda-forge
pip                       24.2               pyh8b19718_1    conda-forge
pyarrow                   17.0.0          py310h24597f5_1    conda-forge
pyarrow-core              17.0.0          py310hf3d4daf_1_cpu    conda-forge
python                    3.10.14         h2469fbe_0_cpython    conda-forge
python-dateutil           2.9.0              pyhd8ed1ab_0    conda-forge
python-tzdata             2024.1             pyhd8ed1ab_0    conda-forge
python_abi                3.10                    5_cp310    conda-forge
pytz                      2024.2             pyhd8ed1ab_0    conda-forge
pyyaml                    6.0.2           py310h493c2e1_1    conda-forge
re2                       2023.09.01           h4cba328_2    conda-forge
readline                  8.2                  h92ec313_1    conda-forge
setuptools                73.0.1             pyhd8ed1ab_0    conda-forge
shapely                   2.0.6           py310hf43f220_1    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
snappy                    1.2.1                hd02b534_0    conda-forge
tk                        8.6.13               h5083fa2_1    conda-forge
toolz                     0.12.1             pyhd8ed1ab_0    conda-forge
tzdata                    2024a                h8827d51_1    conda-forge
wheel                     0.44.0             pyhd8ed1ab_0    conda-forge
xz                        5.2.6                h57fd34a_0    conda-forge
yaml                      0.2.5                h3422bc3_2    conda-forge
zipp                      3.20.1             pyhd8ed1ab_0    conda-forge
zstd                      1.5.6                hb46c0d2_0    conda-forge


> relatedly, is datashader compatible with dask-expr?

Yes, I believe compatibility with dask-expr has been achieved with datashader 0.16.1 and 0.16.2 (https://datashader.org/releases.html).

@jorisvandenbossche
Member

I can reproduce it with an environment similar to the one you have above (mamba create -n test-dask-geopandas-bug python=3.10 dask=2024.8.2 dask-geopandas pyarrow ipython). So the Python version may indeed be relevant (my dev environment with 3.11 does not reproduce it).

@jorisvandenbossche
Member

Searching online for the error, some comments point to this issue appearing when __slots__ is added to a shared base class (see e.g. the comment on the answer in https://stackoverflow.com/questions/48136025/typeerror-multiple-bases-have-instance-lay-out-conflict).

And indeed, it seems that slots (at least non-empty slots) were introduced in DaskMethodsMixin (which both of the classes we inherit from also have as a base class) in dask/dask#11348. So that might be related.
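
For reference, a minimal sketch of how non-empty __slots__ can trigger this kind of error in general. This is not the dask / dask-geopandas class hierarchy (all class names below are made up for illustration), and it does not reproduce the Python 3.10-specific behaviour discussed here. It only shows the generic mechanism on CPython: two bases that each reserve their own slot storage cannot be combined, while a diamond over a base with empty slots can.

```python
class SlotsA:
    __slots__ = ("a",)  # reserves per-instance storage for "a"


class SlotsB:
    __slots__ = ("b",)  # reserves per-instance storage for "b"


class SharedBase:
    __slots__ = ()  # empty slots: no extra per-instance storage


class Left(SharedBase):
    pass


class Right(SharedBase):
    pass


def try_combine(*bases):
    """Return a new class built from `bases`, or the TypeError message on failure."""
    try:
        return type("Combined", bases, {})
    except TypeError as exc:
        return str(exc)


# Two unrelated bases with non-empty slots: class creation fails.
conflict = try_combine(SlotsA, SlotsB)

# A diamond over a base with empty slots: class creation succeeds.
compatible = try_combine(Left, Right)

print(conflict)
print(compatible)
```

On CPython the first case fails with the same "multiple bases have instance lay-out conflict" message as in the traceback above.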

Downgrading dask to 2024.7 makes the error go away. However, after upgrading dask in my dev env I still cannot reproduce the error there, so it's not just the dask version that is in play. Upgrading Python to 3.11 in the test env also makes the error go away, so it seems specific to Python 3.10?

@jorisvandenbossche
Member

This only seems to happen on Python 3.10, and only when explicitly importing dask_geopandas.expr while query planning is disabled (a normal import of dask_geopandas will not import that module in this case). dask-cudf has the same code organization, and I am not sure how to fix this, so it's maybe not too important?

@maximlt what was the use case in datashader to explicitly import dask_geopandas.expr?

@maximlt
Author

maximlt commented Sep 25, 2024

Thanks for digging into that :)

> it's maybe not too important?

That's also my conclusion.

It's part of an isinstance check https://github.com/holoviz/datashader/blob/e9e64c5c2b845fe2d815c221437b5399f5e2e2e1/datashader/core.py#L1292
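
For readers hitting the same error, a defensive version of such a check could look roughly like the sketch below. It is not datashader's actual code: `collect_classes` and `is_instance_of_any` are hypothetical helpers, and the assumption that both dask_geopandas.core and dask_geopandas.expr expose a GeoDataFrame class is taken from the discussion in this thread. Catching TypeError alongside ImportError is what lets a failing import of dask_geopandas.expr be skipped in the Python 3.10 case reported above.

```python
import importlib


def collect_classes(module_names, class_name):
    """Return a tuple of `class_name` from every module that imports cleanly."""
    found = []
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except (ImportError, TypeError):
            # ImportError: the package or submodule is not available.
            # TypeError: the module body itself failed, e.g. with
            # "multiple bases have instance lay-out conflict".
            continue
        cls = getattr(module, class_name, None)
        if isinstance(cls, type):
            found.append(cls)
    return tuple(found)


def is_instance_of_any(obj, classes):
    """isinstance() that is explicitly False for an empty tuple of classes."""
    return bool(classes) and isinstance(obj, classes)


# Assumed module and class names; any module that fails to import is skipped.
GEO_FRAME_TYPES = collect_classes(
    ["dask_geopandas.core", "dask_geopandas.expr"], "GeoDataFrame"
)
```

A caller would then use `is_instance_of_any(source, GEO_FRAME_TYPES)` in place of a direct isinstance check.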

@jorisvandenbossche
Member

> It's part of an isinstance check https://github.com/holoviz/datashader/blob/e9e64c5c2b845fe2d815c221437b5399f5e2e2e1/datashader/core.py#L1292

Would it be sufficient for datashader to just do from dask_geopandas import GeoDataFrame, instead of importing from both dask_geopandas.core and dask_geopandas.expr? That will then automatically give you one of the two depending on the dask setting (of course, that might not work if the user changes that setting after this code has run).
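
A rough sketch of that suggestion, assuming (as described above) that the top-level dask_geopandas package exposes whichever GeoDataFrame matches the dask setting at import time. The helper names are hypothetical, and the ImportError guard is only there so the check degrades gracefully when dask-geopandas is not installed.

```python
def geodataframe_types():
    """Return a tuple holding dask_geopandas.GeoDataFrame, or () if unavailable."""
    try:
        # The top-level package picks the implementation for us.
        from dask_geopandas import GeoDataFrame
    except ImportError:
        return ()
    return (GeoDataFrame,)


def is_dask_geodataframe(obj):
    """True if `obj` is a dask-geopandas GeoDataFrame, False otherwise."""
    # isinstance() with an empty tuple is simply False.
    return isinstance(obj, geodataframe_types())
```

Because Python caches imported modules, the class is fixed by the first import, so the caveat about the setting being changed later still applies.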
