[CALCITE-6451] Improve Nullability Derivation for Intersect and Minus #3845
@ParameterizedTest
@ValueSource(booleans = {false, true})
void testUnionTypeDerivation(boolean all) {
  final RelBuilder builder = RelBuilder.create(config().build());
I'm not sure if this is the best place for these tests. I'm open to suggestions about where they should go.
Hi @vbarua, I found some tests in https://github.com/apache/calcite/blob/main/core/src/test/java/org/apache/calcite/test/JdbcTest.java#L4525-L4526 . It looks like a good place for these tests.
3259727 to c28c090
This interacts interestingly/poorly with the IntersectToDistinctRule, which rewrites Intersects to Unions using something like:
which becomes
What's interesting about this is that while it returns the same results, it's somewhat lossy with regard to the nullability information, because we can't infer nullability bounds for the second form that are as tight as those for the first. That is to say, if

One way I could see to work around this would be to improve the IntersectToDistinctRule by including filtering information when possible. That is, when a column is not nullable in all Intersect branches, we add an IS NOT NULL filter on it.
However, that might also require additional work, because I don't believe that Calcite can use the presence of an IS NOT NULL filter to change the output nullability. The reason this is a problem now is that when the IntersectToDistinctRule is applied, the types of the RelNode going in and the RelNode coming out no longer match.
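The mismatch described above can be illustrated with a small standalone sketch. It does not use Calcite; `NullabilityMismatchSketch` and its methods are hypothetical names, and it assumes the rewritten plan inherits Union-style nullability (a column is nullable as soon as any input is nullable), while the Intersect uses the tighter bound proposed in this PR.

```java
import java.util.List;

/** Hypothetical helper for illustration only; not part of Calcite. */
final class NullabilityMismatchSketch {
  /** Tight bound proposed for Intersect: nullable only if nullable in every input. */
  static boolean intersectNullable(List<Boolean> inputs) {
    return inputs.stream().allMatch(b -> b);
  }

  /** Assumed Union-style bound: nullable as soon as any input is nullable. */
  static boolean unionNullable(List<Boolean> inputs) {
    return inputs.stream().anyMatch(b -> b);
  }

  /** True when the two derivations disagree, i.e. a rewrite would change the column type. */
  static boolean typesDiffer(List<Boolean> inputs) {
    return intersectNullable(inputs) != unionNullable(inputs);
  }

  public static void main(String[] args) {
    // One nullable branch, one NOT NULL branch: the bounds disagree.
    System.out.println(typesDiffer(List.of(true, false)));  // prints "true"
    // All branches nullable, or all NOT NULL: the bounds agree.
    System.out.println(typesDiffer(List.of(true, true)));   // prints "false"
    System.out.println(typesDiffer(List.of(false, false))); // prints "false"
  }
}
```

Under these assumptions, the types only diverge when some, but not all, of the Intersect inputs are nullable in a given column.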
For Minus, the output column nullability is that of the primary input.
For Intersect, an output column is nullable if and only if it is nullable in all of the inputs.
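As a rough illustration of the two rules in this commit message, here is a minimal sketch that models each input as a list of per-column nullability flags. It is plain Java rather than Calcite's type system, and `SetOpNullabilitySketch`, `minus` and `intersect` are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; each input is a list of per-column "nullable" flags. */
final class SetOpNullabilitySketch {
  /** Minus: every output column takes the nullability of the first (primary) input. */
  static List<Boolean> minus(List<List<Boolean>> inputs) {
    return new ArrayList<>(inputs.get(0));
  }

  /** Intersect: an output column is nullable iff it is nullable in all inputs. */
  static List<Boolean> intersect(List<List<Boolean>> inputs) {
    int columns = inputs.get(0).size();
    List<Boolean> out = new ArrayList<>();
    for (int c = 0; c < columns; c++) {
      boolean nullable = true;
      for (List<Boolean> input : inputs) {
        nullable = nullable && input.get(c);
      }
      out.add(nullable);
    }
    return out;
  }

  public static void main(String[] args) {
    List<Boolean> a = List.of(false, true); // column 0 NOT NULL, column 1 nullable
    List<Boolean> b = List.of(true, true);  // both columns nullable
    System.out.println(intersect(List.of(a, b))); // prints "[false, true]"
    System.out.println(minus(List.of(a, b)));     // prints "[false, true]"
    System.out.println(minus(List.of(b, a)));     // prints "[true, true]"
  }
}
```

Note that Minus depends on input order while Intersect does not, which matches the wording of the two rules.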
c28c090 to 6238792
I will edit the PR title to match the JIRA ticket title.
You will have to do the same for the commit message.
I think you will need to fix the IntersectToDistinct rule before we can merge this PR.
Updates the IntersectToDistinctRule to:
* Filter inputs before aggregation to exclude nulls when possible
* Force the correct nullabilities for output columns
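The reasoning behind the first bullet can be sketched outside of Calcite. The example below models single-column rows as `Integer` values and assumes set semantics in which NULL matches NULL; `NullFilterSketch`, `intersect` and `dropNulls` are hypothetical names, not the rule's actual implementation. If one branch can never produce NULL, no NULL row can survive the intersection, so dropping NULLs from the other branch first should not change the result.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical helper for illustration only; rows are single Integer columns. */
final class NullFilterSketch {
  /** Distinct intersection; null is treated as equal to null, as in SQL set operations. */
  static Set<Integer> intersect(List<Integer> left, List<Integer> right) {
    Set<Integer> out = new LinkedHashSet<>(left);
    out.retainAll(new HashSet<>(right));
    return out;
  }

  /** Stand-in for an IS NOT NULL filter applied to one branch. */
  static List<Integer> dropNulls(List<Integer> rows) {
    return rows.stream().filter(Objects::nonNull).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Integer> nullableBranch = Arrays.asList(1, null, 2, 2);
    List<Integer> notNullBranch = Arrays.asList(2, 3, 1);
    // Filtering the nullable branch first leaves the intersection unchanged.
    System.out.println(intersect(nullableBranch, notNullBranch));            // prints "[1, 2]"
    System.out.println(intersect(dropNulls(nullableBranch), notNullBranch)); // prints "[1, 2]"
  }
}
```

The filter is only safe when at least one branch is NOT NULL in that column: if every branch may contain NULL, a NULL row can legitimately appear in the result and must not be filtered out.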
I've pushed some changes that address the issue with the IntersectToDistinctRule. Unfortunately, it looks like there's a different issue as well. The test in calcite/core/src/test/resources/sql/misc.iq (lines 1654 to 1665 in f44ed0a) is failing now because the output type of the relational tree no longer matches the type of the validated SQL tree. I suspect I need to update the SQL validation logic as well to address that. I will continue to look at this when I have a chance.
I will convert this to a draft, since it's not yet ready.
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 90 days if no further activity occurs. If you think that's incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.
Issue: CALCITE-6451