-
Notifications
You must be signed in to change notification settings - Fork 175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PERF] Improve hash table probe side decisions for Swordfish #3327
[PERF] Improve hash table probe side decisions for Swordfish #3327
Conversation
d399c3d
to
acce197
Compare
acce197
to
4d6d6d6
Compare
CodSpeed Performance ReportMerging #3327 will degrade performances by 53.75%Comparing Summary
Benchmarks breakdown
|
4d6d6d6
to
9510233
Compare
let scan_tasks = match &physical_scan_info.scan_state { | ||
ScanState::Operator(scan_op) => scan_op | ||
.0 | ||
.to_scan_tasks(physical_scan_info.pushdowns.clone(), execution_config)?, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should avoid execution_config completely in the logical planner. This would also mean that the logical plan needs to know what execution engine we are targeting which breaks the invariant of the logical plan.
Right now to_scan_tasks
performs generates the scan tasks and then also run the scan task passes. But the implementation of that looks identical for the 3 ScanOperator impls that we have.
My proposal is that we remove the scan task passes from the to_scan_tasks
for GlobScanOperator
, AnonymousScanOperator
and the PythonScanOperator
and then run it during translation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good!
I did this, with a few caveats:
- At translation time, we have already converted the concrete scan tasks into
Arc<Vec<Arc<dyn ScanTaskLike>>>
- I implemented
ScanTaskLike.split()
as a trait method, then modifiedsplit_by_row_groups
to use this trait method on our trait objects - Unfortunately I could not do the same for merging because merging requires some data source and partitioning information that, if pulled into
ScanTaskLike
would create circular dependencies. - Instead, since the only non-test implementer of
ScanTaskLike
isScanTask
, for merging I downcasted theArc<dyn ScanTaskLike>
s intoArc<ScanTask>
, then kept the old implementation of merging.
Please let me know if this works for you or if it's a little sus!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some feedback on this! Happy to pair up tomorrow to crank this out
a997839
to
72d6f34
Compare
72d6f34
to
152c8d4
Compare
This PR lifts statistics into optimized logical plans so that they're available for local execution plans. It then uses these newly available statistics to make better decisions on whether to build the probe table of a hash join on the left or right side.
Benchmark results
For TPC-H, this gives us some notable speedups with Q5, Q8, and Q19.
Crucially, with this change, our native runner is now faster (or within some small deviation) than our previous python runner for all 22 TPC-H queries.
For more detailed results, we have:
Q5
Before
After
Q8
Before
After
Q19
Before
After