-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Core: add variant type support #11831
base: main
Are you sure you want to change the base?
Conversation
5a18496
to
68d690e
Compare
cc @rdblue, @RussellSpitzer, @flyrain and @JonasJ-ap. This is to add the changes in core to support variant type. |
core/src/test/java/org/apache/iceberg/TestMetadataUpdateParser.java
Outdated
Show resolved
Hide resolved
core/src/test/java/org/apache/iceberg/avro/TestAvroSchemaProjection.java
Outdated
Show resolved
Hide resolved
68d690e
to
04958ca
Compare
b276d3f
to
fe6038a
Compare
@aihuaxu very important feature that will allow a lot more options for iceberg, thank you for your contribution |
core/src/main/java/org/apache/iceberg/avro/BuildAvroProjection.java
Outdated
Show resolved
Hide resolved
core/src/test/java/org/apache/iceberg/TestMetadataUpdateParser.java
Outdated
Show resolved
Hide resolved
core/src/test/java/org/apache/iceberg/avro/TestBuildAvroProjection.java
Outdated
Show resolved
Hide resolved
e0a430f
to
1b56bf5
Compare
core/src/test/java/org/apache/iceberg/TestSchemaUnionByFieldName.java
Outdated
Show resolved
Hide resolved
core/src/test/java/org/apache/iceberg/avro/TestBuildAvroProjection.java
Outdated
Show resolved
Hide resolved
core/src/test/java/org/apache/iceberg/avro/TestSchemaConversions.java
Outdated
Show resolved
Hide resolved
data/src/test/java/org/apache/iceberg/data/avro/TestGenericData.java
Outdated
Show resolved
Hide resolved
new Schema(required(1, "v", testType), required(2, "A", Types.IntegerType.get())); | ||
|
||
Schema assignedSchema = | ||
TypeUtil.assignFreshIds(sourceSchema, new AtomicInteger(10)::incrementAndGet); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't use sourceSchema
as intended. ID assignment with a source schema uses assigned IDs from that schema when names match and assigns missing/new IDs using the NextID
supplier. When using the single schema version, the second schema should not be present.
That also raises the question about what we need to test for new types. In this case, because the type just needs to be handled by field
and struct
, I think it's fine to just have a case that validates that UnsupportedOperationException
is not thrown and that the behavior is expected. That is, we only need on test with a variant here. Other tests with more logic for primitives may need more cases though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. I just need one schema. I may not get the following: seems we can test for unknown and other primitive types here as well to make sure the correct type is returned, not null.
we only need on test with a variant here. Other tests with more logic for primitives may need more cases though.
574246d
to
a8261c7
Compare
a8261c7
to
93586e3
Compare
@@ -187,6 +187,18 @@ public Schema map(Types.MapType map, Schema keySchema, Schema valueSchema) { | |||
return mapSchema; | |||
} | |||
|
|||
@Override | |||
public Schema variant(Types.VariantType variant) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is included (and tested) in #12238. Can you remove it from here?
core/src/main/java/org/apache/iceberg/schema/SchemaWithPartnerVisitor.java
Show resolved
Hide resolved
core/src/main/java/org/apache/iceberg/schema/SchemaWithPartnerVisitor.java
Show resolved
Hide resolved
core/src/test/java/org/apache/iceberg/avro/TestSchemaConversions.java
Outdated
Show resolved
Hide resolved
@@ -712,6 +712,10 @@ public T map(Types.MapType map, Supplier<T> keyResult, Supplier<T> valueResult) | |||
return null; | |||
} | |||
|
|||
public T variant(Types.VariantType variant) { | |||
throw new UnsupportedOperationException("Unsupported type: variant"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is not yet implemented by:
CheckCompatibility
(tested byTestReadabilityChecks
)FixupTypes
,SparkFixupTypes
, orFlinkFixupTypes
PruneColumnsWithoutReordering
(Spark, so maybe this can be done separately)ReassignDoc
AssignIds
It is also not implemented by PruneColumnsWithReordering
but that doesn't appear to be referenced anywhere. I opened #12258 to remove these.
public T variant() { | ||
return null; | ||
public T variant(Types.VariantType variant) { | ||
throw new UnsupportedOperationException("Unsupported type: variant"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still need to be implemented by:
CreateMapping
DescribeSchemaVisitor
IdToOrcName
and other conversions
I think it's fine to implement the conversions later as part of Avro, Parquet, and ORC work, but the common ones should probably be implemented.
This is to add some required changes in API and core module for Variant support, including:
Part of: #10392