Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shade missing values when rasterizing Points #1330

Open
philipc2 opened this issue May 14, 2024 · 2 comments
Open

Shade missing values when rasterizing Points #1330

philipc2 opened this issue May 14, 2024 · 2 comments

Comments

@philipc2
Copy link

I am one of the developers of the UXarray package and we use Datashader and Holoviews for visualizing unstructured grids. The majority of our data is from climate model outputs, meaning that the data we are visualizing is mapped to the surface of a sphere and is projected for 2D visualization.

We support both Polygon and Point rasters, and one issue with the Point rasterization is that regions with lower point densities (such as the north and south poles when projected to 2D), lead to there being missing values near the poles. This can be seen when comparing the Polygon and Point rasters (look at the top and bottom)

image

image

While this is acceptable at high resolutions (the above grid is about 84 million points), the issue becomes much more noticable at lower resolutions

image

I'm wondering whether it would be possible to shade the missing values using some form of interpolation, such as nearest neighbor.

@philipc2
Copy link
Author

Here is a link to a sample dataset with some points to play with

gdf = gp.read_parquet("filename.parquet")

def plot(x_range=[-135,-65], y_range=[21,59]):
    cvs = ds.Canvas(plot_width=300, plot_height=300, x_range=x_range, y_range=y_range)
    agg = cvs.points(points, geometry='geometry', agg=ds.mean('temperature_500hPa'))
    return ds.tf.shade(agg, name=str((x_range,y_range)))

ds.tf.Images(plot((-110, -90), (35, 45)),
             plot((-105, -95), (37.5, 42.5)),
             plot((-102.5, -97.5), (39.25, 41.25)),
             plot((-101, -99), (39, 41)))
image

The ideal behavior would be to have no NaN values and for them to be shaded using some method (like nearest neighbor)

@hoxbro
Copy link
Member

hoxbro commented May 14, 2024

An example of how this can be done is:

import geopandas as gpd
import datashader as ds
import numpy as np
from scipy.interpolate import griddata


gdf = gpd.read_parquet("point_gdf.parquet")


def plot(x_range=[-135, -65], y_range=[21, 59]):
    cvs = ds.Canvas(plot_width=300, plot_height=300, x_range=x_range, y_range=y_range)
    agg = cvs.points(gdf, geometry="geometry", agg=ds.mean("temperature_500hPa"))
    return ds.tf.shade(agg, name=str((x_range, y_range)))


def plot_inter(x_range=[-135, -65], y_range=[21, 59]):
    img = plot(x_range, y_range)
    array = img.data
    non_zero_coords = np.argwhere(array != 0)
    non_zero_values = array[array != 0]

    # Get the coordinates of the zero elements
    zero_coords = np.argwhere(array == 0)
    interpolated_values = griddata(non_zero_coords, non_zero_values, zero_coords, method="nearest")

    # Replace the zeros with the interpolated values
    for (x, y), value in zip(zero_coords, interpolated_values):
        array[x, y] = value

    img.data = array
    img.name = img.name + " [interpolated]"
    return img


ds.tf.Images(
    plot_inter((-110, -90), (35, 45)),
    plot_inter((-105, -95), (37.5, 42.5)),
    plot_inter((-102.5, -97.5), (39.25, 41.25)),
    plot_inter((-101, -99), (39, 41)),
)

image

EDIT: was to quick with the code example, have update the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants