refactor: logical op constructor+builder boundary (#3684)
## The problem

Plan ops are created for various reasons throughout our code: from our dataframe or SQL interfaces, to optimization rules, to even op constructors themselves, which can sometimes create other ops. All of these cases generally go through the same `new`/`try_new` constructor for each op, which tries to accommodate all of these use cases. This creates complexity, adds unnecessary compute to planning time, and also conflates user input errors with Daft internal errors.

For example, I don't expect any optimization rules to create unresolved expressions; expression resolution should only be done by the builder. Another example is the Join op, where inputs such as `join_prefix` and `join_suffix` are only applicable for renaming columns, which should also only happen via the builder.

We recently added another initializer to some ops for that reason, but it bypasses the validation that is typically done and is not standardized across ops.

## My solution

Every op should provide a `try_new` constructor that contains explicit checks for all the requirements on the op's state (one example would be that all expression columns exist in the schema), but otherwise should simply put those values into the struct without any modification and return it.

- Functions such as `LogicalPlan::with_new_children` will just call `try_new`.
- Other constructors/helpers may exist that explicitly provide additional functionality and ultimately call `try_new`. E.g. a `Join::rename_right_columns` to rename the right side columns that conflict with the left side, called to update the right side schema before calling `try_new`.
- User input normalization, such as expression resolution, should be handled by the logical plan builder. After the logical plan op has been constructed, everything should be in a valid state from there on.
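
To make the boundary concrete, here is a minimal sketch of the pattern described above. It is not Daft's actual code: `Schema`, `Expr`, `PlanError`, `Project`, and `Builder` are simplified, hypothetical stand-ins, and the split into user-input versus internal error variants is an assumption made for illustration. The sketch shows a `try_new` that only validates and stores its inputs, and a builder that resolves user expressions before calling it.

```rust
/// Hypothetical stand-in for a plan schema: an ordered list of column names.
#[derive(Debug, Clone, PartialEq)]
pub struct Schema {
    pub columns: Vec<String>,
}

impl Schema {
    pub fn has(&self, name: &str) -> bool {
        self.columns.iter().any(|c| c == name)
    }
}

/// Hypothetical expression type: user input starts out unresolved.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Unresolved(String),
    Column(String),
}

/// Assumed error split: user mistakes versus broken internal invariants.
#[derive(Debug, PartialEq)]
pub enum PlanError {
    UserInput(String),
    Internal(String),
}

/// Hypothetical projection op.
#[derive(Debug, Clone, PartialEq)]
pub struct Project {
    pub input_schema: Schema,
    pub projection: Vec<Expr>,
}

impl Project {
    /// Validates the op's state, then stores the values without modifying them.
    pub fn try_new(input_schema: Schema, projection: Vec<Expr>) -> Result<Self, PlanError> {
        for expr in &projection {
            match expr {
                Expr::Column(name) if input_schema.has(name) => {}
                Expr::Column(name) => {
                    return Err(PlanError::Internal(format!(
                        "column `{name}` does not exist in the input schema"
                    )));
                }
                // Resolution is the builder's job; seeing this here is a bug.
                Expr::Unresolved(name) => {
                    return Err(PlanError::Internal(format!(
                        "unresolved expression `{name}` reached try_new"
                    )));
                }
            }
        }
        Ok(Self { input_schema, projection })
    }
}

/// Hypothetical builder: the only place where user input is normalized.
pub struct Builder {
    schema: Schema,
}

impl Builder {
    pub fn new(schema: Schema) -> Self {
        Self { schema }
    }

    /// Resolves user expressions, then delegates to `Project::try_new`.
    pub fn select(&self, exprs: Vec<Expr>) -> Result<Project, PlanError> {
        let mut resolved = Vec::with_capacity(exprs.len());
        for expr in exprs {
            let name = match expr {
                Expr::Unresolved(name) | Expr::Column(name) => name,
            };
            if !self.schema.has(&name) {
                return Err(PlanError::UserInput(format!("unknown column `{name}`")));
            }
            resolved.push(Expr::Column(name));
        }
        Project::try_new(self.schema.clone(), resolved)
    }
}
```

In this sketch an optimizer rule would call `Project::try_new` directly and get an internal error if it violates an invariant, while a user typo surfaces as a user-input error from the builder, which keeps the two failure modes separate.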
1 parent beae462, commit 3720c2a. Showing 29 changed files with 345 additions and 450 deletions.