Categorical order in hierarchical axis not respected #6452

flying-sheep · 2024-11-15T12:48:15Z

ALL software version info

Using Python 3.12.7 environment at /home/phil/.local/share/hatch/env/virtual/dask millions of cells/cMqPOBoB/dask millions of cells
Package            Version
------------------ -----------
asttokens          2.4.1
bleach             6.2.0
bokeh              3.6.1
certifi            2024.8.30
charset-normalizer 3.4.0
colorcet           3.1.0
comm               0.2.2
contourpy          1.3.1
cycler             0.12.1
debugpy            1.8.8
decorator          5.1.1
executing          2.1.0
fonttools          4.55.0
holoviews          1.20.0
idna               3.10
ipykernel          6.29.5
ipython            8.29.0
ipywidgets         8.1.5
jedi               0.19.2
jinja2             3.1.4
jupyter-client     8.6.3
jupyter-core       5.7.2
jupyterlab-widgets 3.0.13
kiwisolver         1.4.7
linkify-it-py      2.0.3
markdown           3.7
markdown-it-py     3.0.0
markupsafe         3.0.2
matplotlib         3.9.2
matplotlib-inline  0.1.7
mdit-py-plugins    0.4.2
mdurl              0.1.2
mizani             0.13.0
nest-asyncio       1.6.0
numpy              2.1.3
packaging          24.2
pandas             2.2.3
panel              1.5.4
param              2.1.1
parso              0.8.4
patsy              1.0.1
pexpect            4.9.0
pillow             11.0.0
platformdirs       4.3.6
plotnine           0.14.1
prompt-toolkit     3.0.48
psutil             6.1.0
ptyprocess         0.7.0
pure-eval          0.2.3
pygments           2.18.0
pyparsing          3.2.0
python-dateutil    2.9.0.post0
pytz               2024.2
pyviz-comms        3.0.3
pyyaml             6.0.2
pyzmq              26.2.0
requests           2.32.3
scipy              1.14.1
six                1.16.0
stack-data         0.6.3
statsmodels        0.14.4
tornado            6.4.1
tqdm               4.67.0
traitlets          5.14.3
typing-extensions  4.12.2
tzdata             2024.2
uc-micro-py        1.0.3
urllib3            2.2.3
wcwidth            0.2.13
webencodings       0.5.1
widgetsnbextension 4.0.13
xyzservices        2024.9.0

Description of expected behavior and the observed behavior

When using multiple kdims, the category order is ignored.

Complete, minimal, self-contained example code that reproduces the issue

import pandas as pd
import holoviews as hv
hv.extension('bokeh')

cells_dtype = pd.CategoricalDtype(pd.array(["~1M", "~10M", "~100M"], dtype="string"), ordered=True)

df = pd.DataFrame(dict(
    cells=cells_dtype.categories.astype(cells_dtype),
    time=pd.array([2.99, 18.5, 835.2]),
    function=pd.array(["read", "read", "read"]),
))

hv.Bars(df, ["function", "cells"], ["time"])

Stack traceback and/or browser JavaScript console output

>>> df["cells"]
0      ~1M
1     ~10M
2    ~100M
Name: cells, dtype: category
Categories (3, string): [~1M < ~10M < ~100M]

Screenshots or screencasts of the bug in action

I may be interested in making a pull request to address this

hoxbro · 2024-11-15T19:11:42Z

Sounds reasonable. I think the correct place to implement this is around here:

holoviews/holoviews/element/util.py

Lines 147 to 164 in c227510

    
for group in grouped:
    vals = group.dimension_values(ydim, False)
    if len(vals) == 1:
        orderings[vals[0]] = [vals[0]]
    else:
        for i in range(len(vals)-1):
            p1, p2 = vals[i:i+2]
            orderings[p1] = [p2]
    if sort:
        if vals.dtype.kind in ('i', 'f'):
            sort = (np.diff(vals)>=0).all()
        else:
            sort = np.array_equal(np.sort(vals), vals)
if sort or one_to_one(orderings, ycoords):
    ycoords = np.sort(ycoords)
elif not is_cyclic(orderings):
    coords = list(itertools.chain(*sort_topologically(orderings)))
    ycoords = coords if len(coords) == len(ycoords) else np.sort(ycoords)

flying-sheep · 2024-11-18T08:42:02Z

You don’t use typing so I’m going to ask for a bit more info before I try my hand at this.

Does dimension_values return a pandas array/series/index complete with ExtensionDtypes or a plain numpy array where all pandas information is destroyed?

Don’t get me wrong: If it’s the latter, I understand, it’s better to be independent of pandas to support alternatives like polars, but that’d mean that the information about categories and their order has to be passed down somewhere.
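To make the concern concrete, here is a minimal sketch using only pandas and NumPy (no HoloViews involved) of what is lost once a categorical column becomes a plain array. It does not show what dimension_values actually returns; it only illustrates the "plain numpy array" case from the question, where a subsequent np.sort falls back to lexicographic string order.

```python
import numpy as np
import pandas as pd

cells_dtype = pd.CategoricalDtype(["~1M", "~10M", "~100M"], ordered=True)
cells = pd.Series(["~1M", "~10M", "~100M"], dtype=cells_dtype)

# The pandas Series still carries the declared category order.
declared_order = list(cells.cat.categories)
print(declared_order)  # ['~1M', '~10M', '~100M']

# A plain NumPy array keeps only the labels, not their order.
plain = np.asarray(cells)

# Sorting the labels is therefore lexicographic, which reverses them here.
lexicographic = list(np.sort(plain))
print(lexicographic)  # ['~100M', '~10M', '~1M']
```

So if the values arrive as a plain array, the category order would indeed have to come from somewhere else, such as the column's dtype.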

philippjfr · 2024-11-18T14:32:21Z

> You don’t use typing so I’m going to ask for a bit more info before I try my hand at this.

That's mostly an artifact of the code predating the introduction of typing. I wish we'd change that, but it's obviously a huge lift.

> Does dimension_values return a pandas array/series/index complete with ExtensionDtypes or a plain numpy array where all pandas information is destroyed?

We could use dataset.interface.dtype(dataset, dimension) to check if it's a category or extension dtype and then look up the category order on that.
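A rough sketch of that idea, written against pandas and NumPy only so it runs without HoloViews. The helper names category_order and order_coords are hypothetical and not part of HoloViews. In the real code the dtype argument would presumably come from dataset.interface.dtype(dataset, dimension) as suggested above; here it is passed in directly. Whether unordered categoricals should also follow their category order is a design choice this sketch leaves open by treating both kinds the same.

```python
import numpy as np
import pandas as pd


def category_order(dtype):
    """Return the declared categories of a pandas categorical dtype, else None."""
    if isinstance(dtype, pd.CategoricalDtype):
        return list(dtype.categories)
    return None


def order_coords(coords, dtype):
    """Order labels by category position; fall back to np.sort otherwise."""
    categories = category_order(dtype)
    if categories is None:
        return list(np.sort(np.asarray(coords)))
    present = set(coords)
    # Keep only categories that actually occur, in their declared order.
    return [c for c in categories if c in present]


cells_dtype = pd.CategoricalDtype(["~1M", "~10M", "~100M"], ordered=True)
coords = np.array(["~100M", "~1M", "~10M"], dtype=object)

by_category = order_coords(coords, cells_dtype)
by_fallback = order_coords(coords, coords.dtype)
print(by_category)  # ['~1M', '~10M', '~100M']
print(by_fallback)  # ['~100M', '~10M', '~1M']
```

A check like this could sit in front of the np.sort(ycoords) calls in the snippet quoted earlier in the thread, so that non-categorical dimensions keep their current behavior.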

hoxbro added the "type: enhancement" label (minor feature or improvement to an existing feature) on Nov 15, 2024
