[CALCITE-6451] refine null handling in IntersectToDistinctRule
Updates the IntersectToDistinctRule to:
* Filter inputs before aggregation to exclude nulls when possible
* Force the correct nullabilities for output columns
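The reasoning behind both bullet points can be sketched without any Calcite types. The following is a minimal, hypothetical model (the class and method names are illustrative and are not part of Calcite): an INTERSECT output column can be treated as nullable only if that column is nullable in every input, so any input column that is nullable while the output column is not can be filtered with IS NOT NULL before aggregating.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch; plain Java stand-ins for row types and rows.
final class IntersectNullSketch {

  // Output column i is nullable only if column i is nullable in every input.
  static boolean[] outputNullability(List<boolean[]> inputNullability) {
    int width = inputNullability.get(0).length;
    boolean[] out = new boolean[width];
    Arrays.fill(out, true);
    for (boolean[] in : inputNullability) {
      for (int i = 0; i < width; i++) {
        out[i] &= in[i];
      }
    }
    return out;
  }

  // Columns of one input that need an IS NOT NULL filter:
  // nullable in the input, but non-nullable in the output.
  static List<Integer> columnsToFilter(boolean[] inputNullable, boolean[] outputNullable) {
    List<Integer> columns = new ArrayList<>();
    for (int i = 0; i < outputNullable.length; i++) {
      if (!outputNullable[i] && inputNullable[i]) {
        columns.add(i);
      }
    }
    return columns;
  }

  // Keeps only the rows whose values in the given columns are all non-null.
  static List<List<Object>> filterNulls(List<List<Object>> rows, List<Integer> columns) {
    return rows.stream()
        .filter(row -> columns.stream().allMatch(c -> row.get(c) != null))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    boolean[] left = {true, false};   // left input: column 0 is nullable
    boolean[] right = {false, false}; // right input: no nullable columns
    boolean[] out = outputNullability(Arrays.asList(left, right));

    List<List<Object>> leftRows = Arrays.asList(
        Arrays.<Object>asList(1, "a"),
        Arrays.<Object>asList(null, "b"));

    // A row with a null in column 0 can never match the right input,
    // so it is safe to drop it before grouping and counting.
    System.out.println(filterNulls(leftRows, columnsToFilter(left, out)));
  }
}
```

In this model the right input needs no filter at all, which corresponds to the rule only adding a filter when at least one IS NOT NULL condition was collected.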
vbarua committed Jul 20, 2024
1 parent 6238792 commit d2f41ec
Showing 1 changed file with 26 additions and 0 deletions.
@@ -22,7 +22,10 @@
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.core.Intersect;
import org.apache.calcite.rel.logical.LogicalIntersect;
import org.apache.calcite.rel.type.RelDataTypeField;
import org.apache.calcite.rex.RexBuilder;
import org.apache.calcite.rex.RexNode;
import org.apache.calcite.sql.fun.SqlStdOperatorTable;
import org.apache.calcite.tools.RelBuilder;
import org.apache.calcite.tools.RelBuilderFactory;
import org.apache.calcite.util.ImmutableBitSet;
@@ -31,6 +34,8 @@
import org.immutables.value.Value;

import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

/**
* Planner rule that translates a distinct
@@ -97,9 +102,27 @@ public IntersectToDistinctRule(Class<? extends Intersect> intersectClass,
final RexBuilder rexBuilder = cluster.getRexBuilder();
final RelBuilder relBuilder = call.builder();

List<RelDataTypeField> outputFields = intersect.getRowType().getFieldList();

// 1st level GB: create a GB (col0, col1, count() as c) for each branch
for (RelNode input : intersect.getInputs()) {
relBuilder.push(input);

// if any of the input fields is non-nullable, the corresponding output field is non-nullable
// this is captured in the type derivation in intersect.getRowType()
// if we know that nulls cannot be present in the output, then we can filter them from the inputs before aggregating

Check failure on line 113 in core/src/main/java/org/apache/calcite/rel/rules/IntersectToDistinctRule.java

[Task :core:checkstyleMain] [LineLength] Line is longer than 100 characters (found 122).

GitHub Actions jobs reporting this failure: Linux (JDK 17); Linux (JDK 11), Pacific/Chatham Timezone; Linux (JDK 8), oldest Guava, America/New_York Timezone; Linux (JDK 8), latest Guava, America/New_York Timezone; Linux (JDK 21); Linux (JDK 22); Linux (JDK 11), Avatica main; macOS (JDK 21)

ArrayList<RexNode> nullFilters = new ArrayList<>();
List<RelDataTypeField> inputFields = input.getRowType().getFieldList();
for (int fieldIndex = 0; fieldIndex < outputFields.size(); fieldIndex++) {
RelDataTypeField inputField = inputFields.get(fieldIndex);
if (!outputFields.get(fieldIndex).getType().isNullable() && inputField.getType().isNullable()) {

Check failure on line 118 in core/src/main/java/org/apache/calcite/rel/rules/IntersectToDistinctRule.java

[Task :core:checkstyleMain] [LineLength] Line is longer than 100 characters (found 104).

GitHub Actions jobs reporting this failure: Linux (JDK 17); Linux (JDK 11), Pacific/Chatham Timezone; Linux (JDK 8), oldest Guava, America/New_York Timezone; Linux (JDK 8), latest Guava, America/New_York Timezone; Linux (JDK 21); Linux (JDK 22); Linux (JDK 11), Avatica main; macOS (JDK 21)

nullFilters.add(rexBuilder.makeCall(SqlStdOperatorTable.IS_NOT_NULL, rexBuilder.makeInputRef(input, fieldIndex)));

Check failure on line 119 in core/src/main/java/org/apache/calcite/rel/rules/IntersectToDistinctRule.java

[Task :core:checkstyleMain] [LineLength] Line is longer than 100 characters (found 124).

GitHub Actions jobs reporting this failure: Linux (JDK 17); Linux (JDK 11), Pacific/Chatham Timezone; Linux (JDK 8), oldest Guava, America/New_York Timezone; Linux (JDK 8), latest Guava, America/New_York Timezone; Linux (JDK 21); Linux (JDK 22); Linux (JDK 11), Avatica main; macOS (JDK 21)

}
}

if (!nullFilters.isEmpty()) {
relBuilder.filter(nullFilters);
}
relBuilder.aggregate(relBuilder.groupKey(relBuilder.fields()),
relBuilder.countStar(null));
}
@@ -126,6 +149,9 @@ public IntersectToDistinctRule(Class<? extends Intersect> intersectClass,
// Project all but the last field
relBuilder.project(Util.skipLast(relBuilder.fields()));

// ensure the nullabilities of columns in the new relation match those of the input relation
relBuilder.convert(intersect.getRowType(), false);

// the schema for intersect distinct is like this
// R3 on all attributes + count(c) as cnt
// finally add a project to project out the last column