Implementations: pyspark and ray #90

martindurant · 2025-01-16T15:30:59Z

Remaining issues:

pyspark picks up the wrong Python on Windows (environment variables may need to be set)
time dtype inference requires a new release of awkward; affects pyspark only
ray partitioning isn't deterministic?
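
For the Windows interpreter issue, one possible workaround is to point Spark at the running interpreter before any session is created. This is only a sketch: `PYSPARK_PYTHON` and `PYSPARK_DRIVER_PYTHON` are the standard Spark environment variables, but the helper name `pin_pyspark_python` is hypothetical and it has not been confirmed that this resolves the failure seen here.

```python
import os
import sys


def pin_pyspark_python(env=None):
    """Set the Spark worker and driver interpreters to the current Python.

    Hypothetical helper, not part of this PR. It must run before the
    SparkSession (and its JVM) is started, or it has no effect.
    """
    env = os.environ if env is None else env
    # Worker processes launched by Spark use this interpreter.
    env["PYSPARK_PYTHON"] = sys.executable
    # The driver side uses this one; keeping both equal avoids version mismatches.
    env["PYSPARK_DRIVER_PYTHON"] = sys.executable
    return env


# Apply to the real environment of this process.
pin_pyspark_python()
```

In CI the same two variables could instead be set in the job's environment, which avoids touching test code.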


martindurant · 2025-02-01T18:24:48Z

(tests will pass following a new release of awkward; see scikit-hep/awkward#3392)

martindurant · 2025-02-01T18:26:31Z

Needs documentation updates. Refactoring so that all the lazy backends share the same code will be done in a follow-up (note how similar the spark and ray implementations are, that we also wish to do polars-lazy, and that the dask-dataframe version now appears less complete by comparison).

Also, we should add at least one numba test; should we have the same test suite for all backends?
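
A rough sketch of how one shared suite could be driven, assuming each backend can be detected by a top-level module name. The list `BACKEND_MODULES` and the helper `installed` are hypothetical names, not existing code in this repository.

```python
import importlib.util

# Hypothetical: one importable top-level module per backend mentioned in this PR.
BACKEND_MODULES = ["dask", "polars", "cudf", "pyspark", "ray"]


def installed(names):
    """Return the subset of top-level module names that can be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is not None]


# With pytest, the same test body could then run once per available backend:
#
#   @pytest.mark.parametrize("backend", installed(BACKEND_MODULES))
#   def test_roundtrip(backend):
#       ...
print(installed(BACKEND_MODULES))
```

Backends that are not installed simply drop out of the parameter list, so the same file could run unchanged in each CI environment.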

Martin Durant and others added 11 commits November 19, 2024 15:21

cudf updates

a680533

upstream API changes...

Merge branch 'implementations' of https://github.com/martindurant/akimbo into implementations

a3b4df3

Add daft and spark

dfd3eda

remove bad merge

e20a9f7

Complete spark tests

31351c0

use conda for sprk

5c3a4ef

simplify

01db23c

some dep limits

67027cc

ensure shell

5ed570b

CI deps

a16e317

reorder pytest arguments

0424178

martindurant changed the title ~~Implementations: pyspark and daft~~ Implementations: pyspark and ray Jan 30, 2025

martindurant added 2 commits January 31, 2025 18:51

Mostly Ray

00375e4

Finish ray

5378cb8

martindurant marked this pull request as ready for review February 1, 2025 18:24

martindurant added 4 commits February 3, 2025 12:19

Add numba tests as examples

7f75ffd

Add minimal API docs

3c485fc

overriudes

27ac62f

Skip dt tests on WIN where locale may be missing

22794bc

martindurant merged commit 6b83113 into intake:main Feb 6, 2025
2 of 9 checks passed

martindurant deleted the implementations branch February 6, 2025 15:46
