plotly express scatterplots not displaying correctly in jupyter #4154

Open
OlovJ opened this issue Apr 7, 2023 · 16 comments
Labels
bug something broken P3 backlog

Comments

@OlovJ

OlovJ commented Apr 7, 2023

I have an issue with larger scatterplots not displaying correctly. I am using Jupyter on an M1 Mac, and the issue occurs both from VS Code and from JupyterLab (or I might be doing something wrong).

With 1000 points the plot displays as I expect, but as soon as I go over 1000 points the x-values are no longer drawn correctly.

import pandas as pd
import plotly
import plotly.express as px
import plotly.graph_objects as go
import random
import datetime

df = pd.DataFrame(columns=['metric', 'value', 'time'])
for i in range(1001):
    df = pd.concat([df, pd.DataFrame({'metric': random.choice(['a', 'b', 'c', 'd', 'e']), 
                                      'value': random.random()*20, 
                                      'time': datetime.datetime.now() + datetime.timedelta(seconds=random.randint(1, 300))}, index=[0])],
                                      ignore_index=True)
print(df.info())
print("Plotly version", plotly.__version__)
print("Pandas version", pd.__version__)
fig1 = px.scatter(df.head(1000), x="time", y="value", title="Durations", color='metric')
fig2 = px.scatter(df, x="time", y="value", title="Durations", color='metric')
fig1.show()
fig2.show()

Will have the output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1001 entries, 0 to 1000
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   metric  1001 non-null   object        
 1   value   1001 non-null   float64       
 2   time    1001 non-null   datetime64[ns]
dtypes: datetime64[ns](1), float64(1), object(1)
memory usage: 23.6+ KB
None
Plotly version 5.14.1
Pandas version 2.0.0

and the plots
1000 points
image

and 1001 points
image

Another useful detail is that the hover data is triggered in the correct place, so when I hover over empty space I get the correct popup.
image

If I build the figure with go figures instead, it works as expected:

fig = go.Figure()
for metric, group in df.groupby("metric"):
    fig.add_trace(go.Scatter(x=group['time'], y=group['value'], mode='markers', name=metric))
fig.show()

image

@OlovJ
Author

OlovJ commented Apr 7, 2023

I tried it on a Windows computer and it works there, with the same versions of plotly and pandas and the same Python version (3.11.3).

@OlovJ
Author

OlovJ commented Apr 7, 2023

Also, the same plot drawn as a line chart works perfectly, so for me it only seems to happen with scatter plots over 1000 points on Arm.

@hammas9

hammas9 commented Apr 17, 2023

I am experiencing the same issue. It works great on my Windows laptop, but I see similar vertical lines on my Mac. If you hover your mouse over the spaces between the vertical lines, the annotations of the points that were supposed to be there (but are invisible for some reason) pop up.

@AaronStiff
Contributor

@OlovJ out of curiosity does the same issue occur with pandas==1.5.3?

@StefanKaiser-TomTom

I can confirm the issue with pandas==2.1.1 and plotly==5.17.0

@j-at-ch

j-at-ch commented Feb 23, 2024

I'm hitting the same issue with both plotly 5.9.0 and 5.19.0 on macOS.

@j-at-ch

j-at-ch commented Feb 23, 2024

Just dug into the code and I've found the cause (thanks @OlovJ for investigating the 1000 point threshold - it helped to find the culprit).

The issue here seems to be the renderer - see this line in the source code.

Switching the renderer to "svg" solves this issue for me.

px.scatter(df, x="time", y="value", render_mode="svg")

@Coding-with-Adam
Contributor

@j-at-ch Thank you for investigating this issue. But I'm actually not able to reproduce this error from the initial code posted.

Can you please share the exact code you used to reproduce the error?

@j-at-ch

j-at-ch commented Mar 4, 2024

@Coding-with-Adam I've posted a MWE .ipynb sheet on Google Colab here. Can you confirm that you see the rendering issue there?

The issue seems to be how the system handles render_mode="auto", so it isn't possible to reproduce from the code alone. I think the unexpected behaviour occurs when:

  • using render_mode="webgl" and
  • more than a certain number of points in a scatter plot

I haven't found any documentation of this behaviour. On macOS, using PyCharm to run Jupyter notebooks, the default behaviour seems to be "webgl", so scatters with more than a certain number of points always show this undesirable binning effect.

@Coding-with-Adam
Contributor

Thanks @j-at-ch .
The Google Colab code you shared looks good on my Firefox browser. No issue there either. I'm not sure why I'm not able to reproduce this error.

@j-at-ch

j-at-ch commented Mar 9, 2024

Thanks for the follow-up @Coding-with-Adam!

I've updated a few things in the Google Colab notebook so that there are plots with:

  • an explicit render_mode='auto'
  • an explicit render_mode='webgl'
  • an explicit render_mode='svg'.

Would you be able to check the updated plots and report whether there are any differences, please?

At least then we'll be able to tell whether the behaviour some of us are observing is due to renderer options (and how our individual systems handle the auto option).

@Coding-with-Adam
Contributor

hi @j-at-ch
Thanks for looking further into this. All figures look exactly the same to me.

image

@j-at-ch

j-at-ch commented Mar 13, 2024

Here's what it looks like to me @Coding-with-Adam:

Figure 1: render_mode='auto'

Figure 2: render_mode='webgl'

Figure 3: render_mode='svg'

Maybe this is a hardware-related issue for 'webgl'? I'm using macOS with an M1 Pro chip.

@Coding-with-Adam
Contributor

Thanks for sharing the images @j-at-ch .

@alexcjohnson
Any idea what might be causing this bug in webgl? Or, any idea how we can test it to dig deeper?

@alexcjohnson
Collaborator

We've seen hardware-dependent precision issues in WebGL a few times... and it makes sense that it would show up mostly on date axes, where the zero is all the way back at 1970 so the difference of a few minutes in recent years is in a fairly deep digit.
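To get a feel for the size of the effect, here is a small standard-library sketch. It assumes, purely for illustration, that dates are stored as milliseconds since 1970 and that the affected hardware ends up holding them as 32-bit floats; neither detail is confirmed in this thread, and to_float32 is a helper made up for the example.

```python
import struct
from datetime import datetime, timezone


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Milliseconds since 1970-01-01 for the day the issue was opened (assumed unit).
start_ms = datetime(2023, 4, 7, tzinfo=timezone.utc).timestamp() * 1000

# One point per second across a 300-second window, like the reproduction.
exact = [start_ms + 1000.0 * s for s in range(300)]
rounded = [to_float32(t) for t in exact]

distinct = sorted(set(rounded))
gap_ms = distinct[1] - distinct[0]

print("distinct x positions after rounding:", len(distinct))
print("gap between neighbouring positions (ms):", gap_ms)
```

Under these assumptions neighbouring representable values are 131072 ms (about 131 seconds) apart, so 300 different timestamps collapse onto only a few x positions. That would be consistent with the vertical lines in the screenshots, while the hover data, computed at full precision, still lands in the right place.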

There's probably some ugly way around it for now (I seem to recall a command-line switch to use higher precision?) but it'll keep popping up unless we do something deeper like rescaling all the data around a zero that's on or near the actual axis range before we send it to WebGL. That's a pretty big project, but it would ensure WebGL gets the same precision as SVG.
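The rescaling idea can be sketched in a few lines. This is not how plotly.js does or would implement it; it only illustrates, under the assumption of millisecond timestamps and 32-bit floats on the GPU side, why subtracting an origin near the axis range before reducing precision keeps the points apart. The names to_float32 and rescale_then_round are hypothetical.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def rescale_then_round(values_ms):
    """Subtract an origin near the data in 64-bit, then round the offsets to 32-bit."""
    origin = min(values_ms)  # hypothetical choice: the smallest value on the axis
    offsets = [to_float32(v - origin) for v in values_ms]
    return origin, offsets


# 2023-04-07 00:00 UTC in milliseconds since 1970, then one point per second.
start_ms = 1_680_825_600_000.0
exact = [start_ms + 1000.0 * s for s in range(300)]

naive = [to_float32(v) for v in exact]           # round the raw timestamps
origin, offsets = rescale_then_round(exact)      # round the small offsets instead
restored = [origin + o for o in offsets]         # add the origin back in 64-bit

print("distinct values, naive:   ", len(set(naive)))
print("distinct values, rescaled:", len(set(offsets)))
print("round trip exact:", restored == exact)
```

The offsets stay below 300000, where 32-bit floats still resolve far finer than one millisecond, so all 300 points remain distinct, whereas the raw timestamps collapse to a few values.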

@j-at-ch
Copy link

j-at-ch commented Mar 20, 2024

Thanks for the diagnosis @alexcjohnson - helpful context! Manually setting render_mode='svg' should work for my use-cases for now. Just need to remember to set it!

@gvwilson gvwilson self-assigned this Jul 11, 2024
@gvwilson gvwilson removed their assignment Aug 2, 2024
@gvwilson gvwilson added the P3 backlog label Aug 12, 2024
@gvwilson gvwilson changed the title from "Issue with plotly express scatterplots" to "plotly express scatterplots not displaying correctly in jupyter" Aug 12, 2024
@gvwilson gvwilson added the bug something broken label Aug 12, 2024
Projects
None yet
Development

No branches or pull requests

8 participants