-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: implement script that produces stats on implementation coverage [WIP] #56
Open
ingomueller-net
wants to merge
10
commits into
substrait-io:main
Choose a base branch
from
ingomueller-net:coverage-stats
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
feat: implement script that produces stats on implementation coverage [WIP] #56
ingomueller-net
wants to merge
10
commits into
substrait-io:main
from
ingomueller-net:coverage-stats
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
8a364dc
to
1e1336f
Compare
8ee0ad7
to
8a30c4d
Compare
8a30c4d
to
c112063
Compare
This PR introduces support for the `PlanVersion` message type. This is done by introduces the `VersionAttr`, which corresponds to the `Version` message type, which, in turn, is the type of the only field of `PlanVersion`. Since `PlanVersion` is a top-level message, this PR extends the import mechanism: when reading a protobuf message, we need to know which type it is, so when we translate a protobuf message into MLIR, we need to specify which message type the input is. The PR, thus, adds a new flag to the `substrait-translate` tool, together with a corresponding top-level translation function. This also slightly changed some existing error messages and the corresponding tests. This PR does not yet factor out the handling of the version of the `Plan` message type. That should follow up soon but, since the new `VersionAttr` has a slightly different assembly format than the corresponding part of the `PlanOp`, that change will require to modify all test cases, which would bloat this PR. Signed-off-by: Ingo Müller <[email protected]>
This PR adds support for the `expected_type_urls` field of the `Plan` message. This is pretty straight-forward: the repeated field of string becomes an `ArrayAttr` of `StringAttr`s. We don't really need this field currently but it's the last missing field of the `Plan` message, so it allows us to tick that message type off. Signed-off-by: Ingo Müller <[email protected]>
This PR factors out the handling of the `shared_extension` field from the `plan` op and adds that logic to the `project` op. This mainly consisted of moving the import and export logic from the functions related to the `plan` op to dedicated functions. The PR also introduces the new `ExtensibleOpInterface` that enforces an attribute called `advanced_extension` on the op that implement it and allows to deal with all such ops transparently. Since that interface depends on an attribute, the include order of the generated code of interfaces and attributes also had to be adapted. Unfortunately, the field names in the Substrait spec also vary (singular or plural), so the PR also introduces some template magic to be able to deal with protobuf message types with both spellings. With this PR, message types with an `advanced_extension` field should be able to support it by (1) adding the `ExtensibleOpInterface` to their traits, (2) adding an `advanced_extension` parameter, and (3) adding that parameter to their assembly format (although that's technically optional; otherwise, the attribute is set through the `attributes` dictionary).
c112063
to
e19a239
Compare
This PR adds support for the `AggregateRel` from the Substrait spec in the form of the `aggregate` op. This is arguably the most complex op implemented so far. It has an optional enum argument that requires custom parsing, several optional regions that require custom parsing, an attribute that depends on the presence and contents of the regions and requires custom parsing to omit it in the common case, and return types that depend on the two regions and the attribute. What's more, the current version of the spec is such that it is almost impossibly to interpret "grouping sets" because it relies on protobuf message equality, which is something can protobuf does not offer. The current implementation, thus, implements a best effort by using op equality instead (but needs to run CSE during export to ensure op uniqueness). Finally, the PR also extends the `call` op to represent also `AggregateFunction` messages (in addition to `ScalarFunction` messages), which are used by the new `aggregate` op.
This PR adds the `aggregation_phase` to the `call` op, which corresponds to the `phase` field of the `AggregationFunction` message. The PR also revisits the assembly format of the `aggregation_invocation` attribute of the `call` op and now handles the two jointly such that either of the two can be ommitted if it has the default value. Finally, the PR revisits the LIT tests such that both aggregation detauls are tested jointly and explicitly instead of being piggy-backed onto the remaining tests. This is the last non-type feature we need to run TPC-H Q6 as produced by Isthmus.
This PR implements the `extension_table` op, the MLIR equivalent of the `ReadRel.ExtensionTable` message. Since that message uses the `google.protobuf.Any` message type, the PR also slightly extends and improves the handling of that message in the dialect. Finally, because `extension_table` is the second op corresponding to a `ReadRel` case (after `named_table`), the PR makes some effort to factor out common logic between the two, namely, how the named structs representing the schema of the op are handled. However, there is still some opportunity for factoring out further that the PR does not do, such as defining a `ReadRelInterface`.
This PR makes some clean-ups to `SubstraitEnums.td`: * Reformat (mostly) according to `clang-format`. * Remove unnecessary `Kind` suffix in `JoinTypeKind`. * Move `summary` property of enum `def`s to comment. * Rephrase those comments to make them clearer.
This PR implements the `cast` op, which corresponds to the `Cast` expression in Substrait. This is one of the few remaining message types we need to support to get a first TPC-H query round-tripped.
e19a239
to
2174d35
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.