-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: introduce interval_year
and interval_day
types
#48
feat: introduce interval_year
and interval_day
types
#48
Conversation
83fd62d
to
d6ac63f
Compare
test/Dialect/Substrait/types.mlir
Outdated
%0 = named_table @t1 as ["a", "b"] : tuple<!substrait.interval_year, !substrait.interval_day> | ||
yield %0 : tuple<!substrait.interval_year, !substrait.interval_day> | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: final line break.
dd58ba1
to
296155a
Compare
82ea7b2
to
0543ca8
Compare
Here as well: need to rebase again. |
I think this one is next in line and needs rebasing to proceed. |
e9d2e3d
to
131caf5
Compare
Rebased and adjusted to #59. PR description also adjusted. |
I am getting this error in CI:
which is causing one test to fail:
I guess this is an internet connection error? Any ideas on how to fix? |
6751b11
to
44fd3d8
Compare
I ran into this this morning. This must be due to the update of DuckDB from yesterday. #79 should fix it. Rebase onto |
44fd3d8
to
8a6f0d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at this PR again and found a few things we should think about a bit more.
lib/Target/SubstraitPB/Export.cpp
Outdated
auto intervalYear = | ||
mlir::cast<IntervalYearMonthAttr>(value).getYearsValue(); | ||
auto intervalMonth = | ||
mlir::cast<IntervalYearMonthAttr>(value).getMonthsValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: use auto
only if the type appears in the RHS of a cast or similar, see LLVM Coding Standards. Also, my change factors out the cast, which makes it more readable and shorter. (Be sure to check if int32_t
is the type returned by getXValue()
.)
auto intervalYear = | |
mlir::cast<IntervalYearMonthAttr>(value).getYearsValue(); | |
auto intervalMonth = | |
mlir::cast<IntervalYearMonthAttr>(value).getMonthsValue(); | |
auto intervalYearMonth = mlir::cast<IntervalYearMonthAttr>(value); | |
int32_t intervalYear = intervalYearMonth.getYearsValue(); | |
int32_t intervalMonth = intervalYearMonth.getMonthsValue(); |
lib/Target/SubstraitPB/Export.cpp
Outdated
auto intervalDay = mlir::cast<IntervalDaySecondAttr>(value).getDaysValue(); | ||
auto intervalSecond = | ||
mlir::cast<IntervalDaySecondAttr>(value).getSecondsValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
def Substrait_IntervalYearMonthAttr | ||
: Substrait_StaticallyTypedAttr<"IntervalYearMonth", "interval_year_month", | ||
"IntervalYearMonthType"> { | ||
let summary = "Substrait interval year to month type"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to document a decision here. The spec says:
Usually stored as separate integers for years and months, but only the total number of months is significant, i.e.
1y 0m
is considered equal to0y 12m
or1001y -12000m
.
We could deduce that we only want to store the number of months, then. However, I think that our design rationale suggest we should have two different numbers:
Every valid Substrait plan MUST be representable in the dialect.
I'd argue that two plans that differ by the representation of an interval_year_month
are different plans (that always produce the same result), so we have to be able to express both of them with our dialect. I think that decision should be documented here.
We should, however, eventually implement a canonicalization pattern that converts these attributes to the representation with month in [0,12)
.
let description = [{ | ||
This type represents a substrait interval year to month attribute type. | ||
}]; | ||
let parameters = (ins "int32_t":$years_value, "int32_t":$months_value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spec only allows "a range of [-10,000..10,000] years." If we don't add a constraint now, we should add a TODO
to be sure we add it later.
@@ -67,6 +67,28 @@ def Substrait_DateAttr | |||
let assemblyFormat = [{ `<` $value `>` }]; | |||
} | |||
|
|||
def Substrait_IntervalDaySecondAttr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same two comments here as for the other interval type.
8a6f0d8
to
93b561a
Compare
Comments addressed. |
int32_t month) { | ||
if (year < -100000 || year > 100000) | ||
return emitError() << "year must be in a range of [-10,000..10,000] years"; | ||
return success(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, nice! month
should be checked the same way (and be in [120,000..120,000]
).
int32_t seconds) { | ||
if (days < -3650000 || days > 3650000) | ||
return emitError() | ||
<< "days must be in a range of [-3,650,000..3,650,000] days"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for seconds.
"IntervalYearMonthType"> { | ||
let summary = "Substrait interval year to month type"; | ||
let description = [{ | ||
This type represents a substrait interval year to month attribute type. Note that this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please break at 80 characters.
let genVerifyDecl = 1; | ||
} | ||
|
||
def Substrait_IntervalYearMonthAttr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this and the other trailing white spaces in the file. (Consider setting "files.trimTrailingWhitespace": true
in your VS Code config.)
def Substrait_IntervalYearMonthAttr | |
def Substrait_IntervalYearMonthAttr |
This type represents a substrait interval day to second attribute type. Note that this | ||
attribute does not attempt to canonicalize equivalent day-second intervals. | ||
}]; | ||
let parameters = (ins "int32_t":$days_value, "int32_t":$seconds_value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am realizing this late but: do we need the _value
suffix? It's not used in the Substrait spec and I don't see any value (ha!) in having it here...
f8953cb
to
f608fd4
Compare
Hurray! After many rounds the PR made finally it 🎉 |
@ingomueller-net Did you see my comment above? We might need some more work 🫠 |
Which comment? The CI failure due to the DuckDB extension? That's been fixed by #79, no? |
Aha, I understand the problem. (I don't see the comment, though, but not a problem.) As for the data type: the spec for the As for the example in the spec: the difference between the middle and the right column is that the right column has an additional |
I think what happened is that you "started a review" in the right-most tab and didn't submit the review. That review mode is handy if you have a few comments for the same PR, then they get sent as one big batch. But only once you submit it. Until then, the comments are still "pending." |
This PR introduces two new atomic interval types:
Supports a range of [-10,000..10,000] years, with month-level precision. Typically stored as separate integers for years and months (e.g., 12y 3m), but only the total number of months is considered significant. For example, 1y 0m is equivalent to 0y 12m or 1001y -12000m.
Supports a range of [-3,650,000..3,650,000] days, with second precision. Typically stored as separate integers for days and seconds, but only the total number of seconds is considered significant. For example, 1d 0s is equivalent to 0d 86400s.