datalake: assign field ids to dlq record type #24966
Conversation
Force-pushed from 306c6e3 to 44b4be7 (Compare)
src/v/datalake/record_multiplexer.cc (Outdated)

```
@@ -241,6 +241,10 @@ record_multiplexer::operator()(model::record_batch batch) {
ss::future<result<record_multiplexer::write_result, writer_error>>
record_multiplexer::end_of_stream() {
    if (_error) {
        // TODO: Call finish on all writeres to clean up resources. Redpanda
```
The crash actually only occurs in debug builds.
Is this related to the change in this PR?
CI test results:
test results on build#61303
test results on build#61338
The commit message is missing a description. Why are we assigning field ids (or alternatively, what happens if we don't)?
LGTM. Echoing Noah's point: for future readers (maybe our future selves), could you leave some context in the commit message? Namely, that this was just an oversight and we're copying what we do for the main table, and what the behavior was before this change (e.g. did queries break?).
Is this related to the change in this PR?
When testing Iceberg integration with a proper REST catalog, a bug was found which caused translation to fail with the following log messages:

```
Error std::invalid_argument (Expected accessor for field id 5) while partitioning value: struct{struct{int(0), long(4), timestamp(1738071534491000), none, binary(size_bytes=0), }, binary(size_bytes=77), }
Error adding data to DLQ writer for record 4: Data Writer Error:1
```

The problem seems to be caused by a mismatch between the schema we created locally and the partitioning spec returned by the catalog: the schema used for writing (in particular, its field ids) should have been taken from the REST catalog. The bug was an omission due to my misunderstanding of the translation machinery. Added the missing call, similar to how it is done for main-table data writing.
Force-pushed from 44b4be7 to a9f515e (Compare)
Thanks for adding the description, @nvartolomei!
Backports Required
Release Notes