Support non-equijoin predicate for EliminateCrossJoin #4866 #4877

Closed
alamb opened this issue Jan 11, 2023 · 3 comments
Labels
enhancement New feature or request


@alamb
alamb commented Jan 11, 2023

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Follow on to #4844

The fix for incorrect answers in #4869 was to skip the optimization if non-equijoin predicates were present.

This ticket tracks actually supporting the removal of cross joins when such filters are present.

Currently, DataFusion loses the join filter of an inner join when it runs the EliminateCrossJoin rule. The following query illustrates the problem:

explain verbose select t1.t1_id,t2.t2_id,t3.t3_id 
                 from t1 
                 inner join t2 on t1.t1_id > t2.t2_id 
                 cross join t3 
                 where t3.t3_int > t1.t1_int and t1.t1_int > t2.t2_int;

This is because EliminateCrossJoin only considers equijoin predicates.
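To make the distinction concrete, here is a minimal Python sketch of what "equijoin predicate" means for the query above. It is a toy model, not DataFusion code: the `Predicate` type and `is_equijoin` helper are hypothetical names introduced only for illustration, and predicates are assumed to be simple binary comparisons between two qualified columns.

```python
from typing import NamedTuple, Set


class Predicate(NamedTuple):
    """A binary comparison between two qualified columns (toy model)."""
    left: str   # e.g. "t1.t1_id"
    op: str     # e.g. "=", ">"
    right: str  # e.g. "t2.t2_id"

    def tables(self) -> Set[str]:
        # The table qualifier is the part before the first dot.
        return {self.left.split(".")[0], self.right.split(".")[0]}


def is_equijoin(pred: Predicate) -> bool:
    """True only for `a.x = b.y` where a and b are different tables."""
    return pred.op == "=" and len(pred.tables()) == 2


# The ON and WHERE predicates from the query in this issue.
query_predicates = [
    Predicate("t1.t1_id", ">", "t2.t2_id"),
    Predicate("t3.t3_int", ">", "t1.t1_int"),
    Predicate("t1.t1_int", ">", "t2.t2_int"),
]

# None of them is an equijoin, so a rule that looks only at equijoin
# predicates has nothing to work with for this query.
equi_flags = [is_equijoin(p) for p in query_predicates]
```

Every predicate in the example query uses `>`, so under this model all three are non-equijoin predicates.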

The idea is to rewrite EliminateCrossJoin so that it chooses the right input of each join based on both equijoin and non-equijoin predicates. After this PR, the logical plan will be:

      Projection: t1.t1_id, t2.t2_id, t3.t3_id
        Inner Join:  Filter: t3.t3_int > t1.t1_int
          Inner Join:  Filter: t1.t1_int > t2.t2_int AND t1.t1_id > t2.t2_id
            TableScan: t1 projection=[t1_id, t1_int]
            TableScan: t2 projection=[t2_id, t2_int]
          TableScan: t3 projection=[t3_id, t3_int]
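The placement of filters in the plan above can be reproduced with a small Python sketch. This is only an illustration under simplifying assumptions, not the actual rewrite: the join order is given as a fixed left-deep list, predicates are plain `(left, op, right)` tuples, and `tables_of` and `assign_filters` are hypothetical helpers. Each predicate is attached to the first join at which all the tables it references are available.

```python
def tables_of(pred):
    """Return the set of table qualifiers referenced by a predicate tuple."""
    left, _op, right = pred
    return {left.split(".")[0], right.split(".")[0]}


def assign_filters(join_order, predicates):
    """Attach each predicate to the earliest join in a left-deep order
    where every table it references has already been joined.

    Returns (steps, leftover): steps is a list of
    (right_input_table, [predicates used as the join filter]);
    leftover holds predicates that could not be placed.
    """
    available = {join_order[0]}
    remaining = list(predicates)
    steps = []
    for table in join_order[1:]:
        available.add(table)
        here = [p for p in remaining if tables_of(p) <= available]
        remaining = [p for p in remaining if p not in here]
        steps.append((table, here))
    return steps, remaining


# Predicates from the example query (WHERE clause and ON clause).
predicates = [
    ("t3.t3_int", ">", "t1.t1_int"),
    ("t1.t1_int", ">", "t2.t2_int"),
    ("t1.t1_id", ">", "t2.t2_id"),
]

steps, leftover = assign_filters(["t1", "t2", "t3"], predicates)
for table, filters in steps:
    text = " AND ".join(" ".join(p) for p in filters)
    print(f"Inner Join with {table}: Filter: {text}")
```

With the join order `t1, t2, t3`, both `t1`/`t2` predicates land on the inner join and the `t3` predicate lands on the outer join, and nothing is left over, which matches the shape of the plan shown above.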

The join filter should not be lost.

@Dandandan

I am looking into this together with #12985

@Dandandan

Dandandan commented Oct 18, 2024

Looks like the example as given already produces the correct plan, based on filter pushdown, after #5770.

@Dandandan

In many situations we already create good plans for filters combined with joins. We should collect some counter-examples where we create suboptimal plans.
