Introduction
These notes reflect things I am personally involved in or thinking about and may not cover all activities. Feel free to add comments for anything that I missed.
Previous week's issue: #1305
comet-parquet-exec
The proof-of-concept work for adding new native_datafusion and native_iceberg_compat implementations of the Parquet scan has now been merged into the main branch. These implementations can be enabled via the new COMET_NATIVE_SCAN_IMPL configuration setting and are disabled by default because they are not yet complete.
The focus now is on getting all existing tests to pass with these new scans enabled. Then, we can move on to adding tests for reading complex types (arrays, structs, maps) from Parquet files.
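As a rough illustration of opting in to one of the new scans, the sketch below builds the Spark configuration entry for it. It assumes that COMET_NATIVE_SCAN_IMPL corresponds to the key spark.comet.scan.impl and that the existing default scan is named native_comet; neither is stated in these notes, so check CometConf before relying on them. The helper names scan_conf and as_submit_args are hypothetical.

```python
# Assumed config key behind COMET_NATIVE_SCAN_IMPL; verify against CometConf.
SCAN_IMPL_KEY = "spark.comet.scan.impl"

# Assumed name of the existing default scan, plus the two new
# implementations named in these notes.
DEFAULT_IMPL = "native_comet"
EXPERIMENTAL_IMPLS = ("native_datafusion", "native_iceberg_compat")


def scan_conf(impl: str = DEFAULT_IMPL) -> dict:
    """Return Spark conf entries that select a Parquet scan implementation."""
    if impl != DEFAULT_IMPL and impl not in EXPERIMENTAL_IMPLS:
        raise ValueError(f"unknown scan implementation: {impl!r}")
    return {SCAN_IMPL_KEY: impl}


def as_submit_args(conf: dict) -> list:
    """Render conf entries as spark-submit style --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args
```

For example, as_submit_args(scan_conf("native_datafusion")) produces a --conf argument followed by spark.comet.scan.impl=native_datafusion, which could be passed to spark-submit alongside the usual Comet settings. Rejecting unknown names early is a convenience of this sketch, not something Comet is known to require.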
Improving type-checking and test coverage
An initial PR has been merged that refactors QueryPlanSerde.scala, moving some inner methods to top-level methods so that they are accessible from unit tests. This makes it easier to write unit tests per expression.
There is a follow-up draft PR to improve test coverage for array expressions. Feedback on the approach is welcome before this is extended to improve tests for all other expressions.
Array expressions
There are several PRs ready for review for adding more array expressions.
Upgrade to DataFusion 45
There is a draft PR to prepare to upgrade to DataFusion 45.