Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Kernel] Add icebergCompatV2 metadata/protocol checks #4199

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

vkorukanti
Copy link
Collaborator

Description

WIP: support for validating icebergCompatV2 rmetadata requirements are satisfied

How was this patch tested?

In progress

Copy link
Collaborator

@allisonport-db allisonport-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall structure makes sense to me, but didn't take a super detailed look at most of the implementations

}
};

private static class IcebergCompatMetadataValidatorAndUpdater {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this is our version of IcebergCompat right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might help just organization wise to put all the generic interfaces in one file, and then for each iceberg compat table feature define the implementations in their own file. But maybe we can also just group these elements differently not sure.

But I think IcebergCompatMetadataValidatorAndUpdater seems fine to be it's own top-level class

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separated the validator and updated into a separate class

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to remove this code that is now in the separate file? Sorry left comments here


/** Wrapper class for table property validation */
private static class RequiredDeltaTablePropertyEnforcer {
public final String propertyName;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe take a TableConfig instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started with TableConfig, but ended with Java generics compilation issue. The code has changed a bit since last I tried. Give it a try.

Copy link
Collaborator

@allisonport-db allisonport-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall structure and approach makes sense to me! only open question is about the order of updating the configuration vs the schema (column mapping related)

Can we move the iceberg compat checks earlier such that they only touch the metadata.configuration?

And then after, in one step, we update the schema with the column mapping column metadata (applicable to all column mapping tables not just iceberg compat ones)?

Not sure if there is a specific reason this will not work

new PostMetadataProcessor() {
@Override
Optional<Metadata> postProcess(Metadata newMetadata, boolean isCreatingNewTable) {
// TODO: assign column mapping field ids
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't we do this elsewhere later on if we just enable column mapping mode in the metadata? (We have to do this for non-IcebergCompatV2 tables anyways right)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we assign the field IDs if they are not already assigned. Calling it multiple times is not an issue. Also this stems from the lack of specific order of checking and making metadata updates.

Comment on lines +222 to +223
for (RequiredDeltaTablePropertyEnforcer requiredDeltaTableProperty :
requiredDeltaTableProperties) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these refer to independent properties there's never going to be dependencies between the two right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so far thats the assumption.

}
};

private static class IcebergCompatMetadataValidatorAndUpdater {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to remove this code that is now in the separate file? Sorry left comments here

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe adjust the names of the two files that are the same names? it's confusing. Maybe the other one will just be IcebergCompatUtils in the future and house pointers to all our versions + validators?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or maybe you meant to remove the other one I'm not sure good with either

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

basically the IcebergCompatMetadataValidatorAndUpdater contains the infra and interfaces and IcebergCompatV2Utils just for the icebergCompatV2 checks

} else {
// In all other cases, if the property value is not compatible
// with the IcebergV1 requirements, we fail
throw new UnsupportedOperationException(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make a DeltaErrors method

// check for required dependency table features
for (TableFeature requiredDependencyTableFeature : requiredDependencyTableFeatures) {
if (!inputContext.newProtocol.supportsFeature(requiredDependencyTableFeature)) {
throw new UnsupportedOperationException(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make a DeltaErrors method

});

if (!matches.isEmpty()) {
throw new UnsupportedOperationException(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make a DeltaErrors method

|| dataType instanceof TimestampType
|| dataType instanceof TimestampNTZType;
if (!validType) {
throw new UnsupportedOperationException(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make a DeltaErrors method

if (inputContext.newProtocol.supportsFeature(TableFeatures.TYPE_WIDENING_RW_FEATURE)) {
// TODO: Currently Kernel has no support for writing with type widening. When it is
// supported extend this to allow a whitelist of supported type widening in Iceberg
throw new UnsupportedOperationException(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make a DeltaErrors method

private static final IcebergCompatCheck ICEBERG_COMPAT_V2_CHECK_HAS_NO_DELETION_VECTORS =
(inputContext) -> {
if (inputContext.newProtocol.supportsFeature(TableFeatures.DELETION_VECTORS_RW_FEATURE)) {
throw new UnsupportedOperationException(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make a DeltaErrors method

Comment on lines +222 to +223
for (RequiredDeltaTablePropertyEnforcer requiredDeltaTableProperty :
requiredDeltaTableProperties) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so far thats the assumption.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

basically the IcebergCompatMetadataValidatorAndUpdater contains the infra and interfaces and IcebergCompatV2Utils just for the icebergCompatV2 checks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants