migrate to awkward-dataframe #51

Closed
martindurant opened this issue Apr 19, 2024 · 1 comment

Comments

@martindurant
Member

This captures my conversation with @jpivarski.

Since we have an integration with polars (#41), a coming integration with cuDF (#50), and maybe even daft (for distribution on Ray, where at least map-partition or map-row operations would be simple), the scope of this library has changed and the library should be renamed.

We might require some logic at import to determine which accessors to register/attach depending on what is installed (or require explicit imports of relevant sub-modules).
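A minimal sketch of what such import-time detection could look like, using only the standard library. The sub-module names (`awkward_dataframe.pandas` and so on) and both function names are hypothetical, chosen for illustration; nothing here is the library's actual layout.

```python
import importlib
import importlib.util

# Hypothetical mapping: backend package -> sub-module that would attach its accessor.
BACKEND_SUBMODULES = {
    "pandas": "awkward_dataframe.pandas",
    "polars": "awkward_dataframe.polars",
    "cudf": "awkward_dataframe.cudf",
}


def installed_backends(candidates=BACKEND_SUBMODULES):
    """Return the backend names whose top-level package can be found."""
    return [
        name for name in candidates
        if importlib.util.find_spec(name) is not None
    ]


def register_accessors(candidates=BACKEND_SUBMODULES, importer=importlib.import_module):
    """Import the accessor sub-module of every installed backend.

    Importing the sub-module is assumed to register the accessor as a side effect.
    """
    registered = []
    for name in installed_backends(candidates):
        importer(candidates[name])
        registered.append(name)
    return registered


# Which of the candidate backends are present in this environment?
print(installed_backends())
```

The alternative mentioned above, explicit imports of the relevant sub-modules, would drop `register_accessors` entirely and leave the choice to the user, at the cost of one extra import line per backend.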

This also suggests that the effort put into the "awkward" dtype and extension array for pandas was not necessary, and that the only important piece is the accessor; after all, any pandas column has a to-arrow method, which is implemented for all builtin types (including actual arrow columns) and probably for any extension types we might care about. This change would also make the code for each integration very similar.

All this would amount to awkward being the identical nested/var-length API across several dataframe types, bringing fast numba vectorised functionality too. It would also bring the possibility of attaching interesting "behaviours" (like IP addresses, discussed before, or vectors, or images...) to these dataframe libraries.

Finally, it would be nice to attach the accessor to dataframes rather than just series, corresponding to the current pandas to/from columns logic, since a table is exactly the same thing as a record-array.

@martindurant
Member Author

I should also add that when we first considered making this library broader, the name "aktuate" was mentioned as being more attention-grabbing. Critically, it includes the starting characters "ak" used for all accessors. There are some companies with this name, but no software that I could find. Other possible names were also suggested at the time.
