Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of version metadata table for view #12014

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

huan233usc
Copy link

This PR implements the .version metadata mentioned in #9844

Added integration tests and tested locally using rest catalog
test

@huan233usc
Copy link
Author

Hi, @amogh-jahagirdar @nastra Could I get some feedback for this PR? Thanks

@nastra
Copy link
Contributor

nastra commented Jan 31, 2025

@huan233usc sorry for the delay here. I'll try to review this either today or early next week

@nastra
Copy link
Contributor

nastra commented Feb 4, 2025

@huan233usc could you resolve the merge conflicts please?

@huan233usc
Copy link
Author

merge conflicts resolved


@Override
public FileIO io() {
return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pretty much all methods that aren't supported by views should be throwing a UnsupportedOperationException (similar to how it's done in the super class

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, we should remove all of the concrete implementations that are returning null here.


TableScan scan = viewVersionTable.newScan();

assertThat(scan.schema()).isEqualTo(ViewVersionTable.VIEW_VERSION_SCHEMA);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should have some additional checks, such as

List<ViewMetadataReadTask> scanTasks =
        Lists.newArrayList(
            Iterators.transform(
                scan.planFiles().iterator(),
                task -> {
                  assertThat(task).isInstanceOf(ViewMetadataReadTask.class);
                  return (ViewMetadataReadTask) task;
                }));
    assertThat(scanTasks).hasSize(1);
    ViewVersion viewVersion = viewVersionTable.operations().current().currentVersion();
    assertThat(scanTasks.get(0).rows())
        .containsExactly(
            ViewMetadataReadTask.Row.of(
                viewVersion.versionId(),
                viewVersion.schemaId(),
                viewVersion.timestampMillis() * 1000,
                viewVersion.summary(),
                viewVersion.defaultCatalog(),
                viewVersion.representations(),
                viewVersion.defaultNamespace().toString()));

Also it would be good to test scans against the version table where only one/two columns are projected

public void commit(ViewMetadata base, ViewMetadata updated) {}
}

private TestViews() {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be at the top

}

@Override
public void commit(ViewMetadata base, ViewMetadata updated) {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably also throw a UOE if commit isn't supported

.versionId(version)
.schemaId(0)
.timestampMillis(timestampMillis)
.defaultNamespace(Namespace.of("foo.bar"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.defaultNamespace(Namespace.of("foo.bar"))
.defaultNamespace(Namespace.of("foo", "bar"))


public static class TestViewOperations implements ViewOperations {

private ViewMetadata metadata;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private ViewMetadata metadata;
private final ViewMetadata metadata;


public static TestView createSampleTestView(String name) {
return new TestView(
new TestViewOperations(/* version= */ 1, /* timestampMillis= */ 1737436503101L), name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the inlined comment typically goes after the actual value, but it should be ok omitting the comment here


assertThat(catalog().viewExists(identifier)).as("View should exist").isTrue();
Table table = tableCatalog().loadTable(TableIdentifier.of("ns", "view", "version"));
assertThat(table).isInstanceOf(ViewVersionTable.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this here should probably also have some additional checks

viewCatalog
.buildView(TableIdentifier.of(NAMESPACE, viewName))
.withQuery("spark", sql)
// use non-existing column name to make sure only the SQL definition for spark is loaded
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment is probably not really relevant for this test here

// Similar to table's metadata table, view's metadata table requires fully qualified name.
List<Object[]> result = sql("SELECT * FROM %s.%s.%s.version", catalogName, NAMESPACE, viewName);
assertThat(result).hasSize(1);
assertThat(result.get(0).length).isEqualTo(7);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should add some additional checks here to assert all the values are what we expect. For this, you can store the result of the view creation above in a view variable and then use view.currentVersion() to compare values

@@ -2109,6 +2109,34 @@ public void createViewWithCustomMetadataLocation() throws IOException {
""));
}

@Test
public void readFromViewVersionTable() throws NoSuchTableException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should also have a test where only one/two columns are projected

View baseView = loadView(context, baseViewIdentifier);
return MetadataTableUtils.createViewMetadataTableInstance(baseView, tableName, type);
} else {
throw new NoSuchTableException("Table does not exist: %s", identifier);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably throw the original exception that was caught in loadTable

return MetadataTableUtils.createViewMetadataTableInstance(
ops, name(), baseViewIdentifier, identifier, type);
} else {
throw new NoSuchTableException("Table does not exist: %s", identifier);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably also throw the original exception that was caught in loadTable

ViewOperations ops = newViewOps(baseViewIdentifier);
if (ops.current() == null) {
throw new NoSuchTableException("Table or View does not exist: %s", baseViewIdentifier);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please add a new line after the }


@Override
public CloseableIterable<FileScanTask> planFiles() {
// override planFiles to avoid the check for a current snapshot because this metadata table is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this comment applies for views, so it can be removed

this.rows = rows;
}

ViewMetadataReadTask(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't seem to be used

return ImmutableList.of(this);
}

Schema projectedSchema() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this and metadataFile aren't used, so can be removed

@@ -462,6 +463,26 @@ private LoadTableResponse loadInternal(

@Override
public Table loadTable(SessionContext context, TableIdentifier identifier) {
try {
return loadTableInternal(context, identifier);
} catch (NoSuchTableException original) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think is the right way to fallback to loading the metadata table. If you look at the current fallback logic metadata tables, we should attempt loading in that same path.

ViewMetadataTableType type = ViewMetadataTableType.from(tableName);
if (type != null) {
TableIdentifier baseViewIdentifier = TableIdentifier.of(identifier.namespace().levels());
View baseView = loadView(context, baseViewIdentifier);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this will break for catalogs not implementing views, because it's going to throw.

try {
return super.loadTable(identifier);
} catch (NoSuchTableException e) {
return loadViewMetadataTable(identifier);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the other comment. I don't feel this is the right way to handle metadata tables for views. It should be handled as part of the metadata table fallback logic.


@Override
public SortOrder sortOrder() {
return sortOrder;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no sort order for view, this shouldn't be included.


@Override
public Map<String, String> properties() {
return ImmutableMap.of();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Views have properties, we should be exposing them.

Comment on lines +118 to +151
@Override
public Snapshot currentSnapshot() {
return null;
}

@Override
public Iterable<Snapshot> snapshots() {
return ImmutableList.of();
}

@Override
public Snapshot snapshot(long snapshotId) {
return null;
}

@Override
public List<HistoryEntry> history() {
return ImmutableList.of();
}

@Override
public List<StatisticsFile> statisticsFiles() {
return ImmutableList.of();
}

@Override
public List<PartitionStatisticsFile> partitionStatisticsFiles() {
return ImmutableList.of();
}

@Override
public Map<String, SnapshotRef> refs() {
return ImmutableMap.of();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these are not handled properly (similar to above)

@danielcweeks
Copy link
Contributor

I feel like structurally we're not approaching this the right way. We already have a BaseMetadataTable implementation used for metadata tables. View metadata is a table and we should reuse that concept as opposed to creating a new table-like abstraction. Right now that implementation has ties to a table but we might be able to leverage that for views by using ViewHistoryTable extends BaseMetadataTable and wrapping the view information in a table-like structure as opposed to creating a new/parallel structure for view metadata.

@huan233usc
Copy link
Author

Thank you both for the review, will update this pr accordingly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants