Is your feature request related to a problem or challenge?
Introduction
This ticket is my weekly-ish summary of interesting things happening in DataFusion. Note this is not a complete list (it is what I remember / can find). Please leave comments on this ticket about things that I may have missed or you think should get wider attention by the community. Follow on to #13970
We are victims of our own success. At the time of writing there are over 50 PRs in various states of review; check out the list. The more help reviewing, the better 🙏
Looking to get more involved? Please help review code! 🎣
DataFusion has a long history of community members contributing in all aspects of the project. Reviewing PRs is an especially great way to get introduced to the project, help the community and grow your own knowledge -- researching and understanding the code enough to review PRs also often inspires additional ideas for improvements.
We have docs about reviews. TLDR is: look for test coverage, if the change is understandable and well documented, and if the code can be improved. When you think the PR looks good to merge, try @ mentioning one of the committers.
Help wanted
I would love to see the community offer additional help testing and triaging bugs, helping to make DataFusion a more stable foundation for building systems.
Please feel free to leave your own comments on this ticket if you are looking for help.
Reminder: find new content (and please post some!) on the Concepts, Readings, Events page
Community Highlights
DISCUSSION: January 2025 DataFusion Meetup in Amsterdam / CIDR 2025 #12988
Releases!
Performance
DataFusion's core value proposition is great performance without having to re-implement it yourself
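One of the items in this section (#14119) rewrites `starts_with` into a `LIKE` expression. As a rough illustration of the idea only, and not the code from that PR, a prefix check can be expressed as a `LIKE` pattern by escaping the wildcard characters in the prefix and appending `%`. The helper name `prefix_to_like_pattern` is hypothetical, and the sketch assumes backslash is the `LIKE` escape character:

```rust
/// Hypothetical helper (not a DataFusion API): build a LIKE pattern that
/// matches exactly the strings starting with `prefix`.
/// Assumption: `\` is the LIKE escape character.
fn prefix_to_like_pattern(prefix: &str) -> String {
    let mut pattern = String::with_capacity(prefix.len() + 1);
    for ch in prefix.chars() {
        // `%` and `_` are LIKE wildcards and `\` is the escape character,
        // so each must be escaped to be matched literally.
        if matches!(ch, '%' | '_' | '\\') {
            pattern.push('\\');
        }
        pattern.push(ch);
    }
    // A trailing `%` matches any (possibly empty) suffix.
    pattern.push('%');
    pattern
}

fn main() {
    // starts_with(col, 'data')    ~  col LIKE 'data%'
    println!("{}", prefix_to_like_pattern("data"));
    // starts_with(col, '50%_off') ~  col LIKE '50\%\_off%'
    println!("{}", prefix_to_like_pattern("50%_off"));
}
```

A rewrite of this kind is useful when the engine already has well-optimized paths for `LIKE` with a constant prefix pattern.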
44.0.0 #13983 (45 is even better)
`simplify` for the `starts_with` function to convert it into a LIKE expression. #14119
`array_agg` faster 🚀 perf(array-agg): add fast path for array agg for `merge_batch` #14299
`median` by implementing special `GroupsAccumulator` #13681 🚀
`median` with no grouping by 2X #14399
`struct` and `named_struct` using `invoke_with_args` #14276
Quality
Testing
`--complete`: fix: run sqllogictest with complete #14254
Bug Fixes
DataFusion is in the "we are finding all the corner case bugs now" phase of its life and people are now bashing them down
`count` to `UInt64` in `array_repeat` function to ensure consistent integer type handling #14236
`NullHandling` enum to `Signature` #14289
`null` input in `map_keys/values` #14401
Docs
Build time
Cleanups 🧹
Now that we have a large, useful codebase, it is also important to keep it neat and tidy, so we spend a non-trivial amount of time there too.
`EnforceDistribution` into `datafusion-physical-optimizer` crate #14190, move projection pushdown optimization logic to ExecutionPlan trait #14235, etc
`array_has`: Make scalar and array handling for array_has consistent #13683
Features
We can have nice things! (Error messages)
@eliaperantoni added support for source code locations for errors: Add related source code locations to errors #13664, and has organized a project to add more support: [EPIC] Attach `Diagnostic` to more errors #14429
We started publishing the `datafusion-sqllogictest` crate to help testing in `iceberg-rust`: Publishing `datafusion_sqllogictest` as a crate. #14229 (thanks to @liurenjie1024 for the great idea)
@jayzhan211 unified advanced UDF argument handling: Introduce `return_type_from_args` for ScalarFunction. #14094
@gatesn added support for `SUM` statistics: Add `ColumnStatistics::Sum` #14074
@erenavsarogullari added Support arrays_overlap function (alias of `array_has_any`) #14217
@timsaucer made FFI hopefully more usable with async code: FFI support for versions and alternate tokio runtimes #13937
@davisp made insert work in FFI: Add `TableProvider::insert_into` into FFI Bindings #14391
Coming soon: Extension Types
Misc
`<=>` in Support spaceship operator (`<=>`) support (alias for `IS NOT DISTINCT FROM`) #14187
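For readers unfamiliar with the spaceship operator item (#14187): `<=>` is described there as an alias for `IS NOT DISTINCT FROM`, i.e. a null-safe equality. The sketch below models SQL values as `Option<i64>` (with `None` standing in for NULL) to contrast it with plain `=`. The function names are illustrative only and are not DataFusion APIs:

```rust
/// SQL `=`: the result is NULL (`None`) if either side is NULL.
fn sql_eq(a: Option<i64>, b: Option<i64>) -> Option<bool> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x == y),
        _ => None,
    }
}

/// `IS NOT DISTINCT FROM` (`<=>`): never NULL; two NULLs compare as equal,
/// and a NULL never equals a non-NULL value.
fn is_not_distinct_from(a: Option<i64>, b: Option<i64>) -> bool {
    // Rust's `Option` equality already has exactly these semantics.
    a == b
}

fn main() {
    let cases = [
        (Some(1), Some(1)),
        (Some(1), Some(2)),
        (Some(1), None),
        (None, None),
    ];
    for (a, b) in cases {
        println!(
            "{:?} = {:?} -> {:?}; {:?} <=> {:?} -> {}",
            a, b, sql_eq(a, b),
            a, b, is_not_distinct_from(a, b)
        );
    }
}
```

The practical difference shows up in joins and filters on nullable columns, where `=` silently drops rows with NULL on either side while `<=>` treats two NULLs as a match.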
Community
Upcoming meetups: