
datalake: assign field ids to dlq record type #24966

Merged
merged 1 commit into redpanda-data:dev on Jan 29, 2025

Conversation

@nvartolomei (Contributor) commented Jan 28, 2025

datalake: assign field ids to dlq record type

When testing iceberg integration with a proper rest catalog a bug was
found which caused translation to fail with the following log messages:

```
Error std::invalid_argument (Expected accessor for field id 5) while partitioning value: struct{struct{int(0), long(4), timestamp(1738071534491000), none, binary(size_bytes=0), }, binary(size_bytes=77), }
Error adding data to DLQ writer for record 4: Data Writer Error:1
```

The problem was a mismatch between the schema we created locally and the
partitioning spec returned by the catalog.

The schema used for writing (in particular, its field ids) should have been
taken from the REST catalog. The bug was an omission caused by my
misunderstanding of the translation machinery.

Added the missing call, mirroring how it is done for main table data
writing.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

  • none

```diff
@@ -241,6 +241,10 @@ record_multiplexer::operator()(model::record_batch batch) {
 ss::future<result<record_multiplexer::write_result, writer_error>>
 record_multiplexer::end_of_stream() {
     if (_error) {
         // TODO: Call finish on all writeres to clean up resources. Redpanda
```
@nvartolomei (Contributor, Author):

The crash is only in debug actually.

Contributor:

Is this related to the change in this PR?

@vbotbuildovich (Collaborator) commented Jan 28, 2025

CI test results

test results on build#61303

| test_id | test_kind | job_url | test_status | passed |
| --- | --- | --- | --- | --- |
| rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery | ducktape | https://buildkite.com/redpanda/redpanda/builds/61303#0194adb3-3719-47fb-ba42-ac178098fd2a | FLAKY | 1/2 |
| rptest.tests.datalake.compaction_test.CompactionGapsTest.test_translation_no_gaps.cloud_storage_type=CloudStorageType.S3 | ducktape | https://buildkite.com/redpanda/redpanda/builds/61303#0194adb2-84b8-47b3-9880-c6debf86772f | FLAKY | 1/3 |
| rptest.tests.e2e_shadow_indexing_test.ShadowIndexingWhileBusyTest.test_create_or_delete_topics_while_busy.short_retention=True.cloud_storage_type=CloudStorageType.ABS | ducktape | https://buildkite.com/redpanda/redpanda/builds/61303#0194adb2-84ba-4fda-ad3e-90d7142facab | FLAKY | 1/2 |

test results on build#61338

| test_id | test_kind | job_url | test_status | passed |
| --- | --- | --- | --- | --- |
| rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery | ducktape | https://buildkite.com/redpanda/redpanda/builds/61338#0194b19a-849d-4a14-83c2-5d9eb1bc1b88 | FLAKY | 1/3 |
| rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery | ducktape | https://buildkite.com/redpanda/redpanda/builds/61338#0194b1b6-04cb-40bd-886d-d90683afbaf9 | FLAKY | 1/2 |
| rptest.tests.compaction_recovery_test.CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade | ducktape | https://buildkite.com/redpanda/redpanda/builds/61338#0194b19a-849e-4100-85cb-cc4b588bdbcb | FLAKY | 1/2 |
| rptest.tests.compaction_recovery_test.CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade | ducktape | https://buildkite.com/redpanda/redpanda/builds/61338#0194b1b6-04c9-4a9a-9988-0c0501f99851 | FLAKY | 1/2 |
| storage_single_thread_rpunit.storage_single_thread_rpunit | unit | https://buildkite.com/redpanda/redpanda/builds/61338#0194b156-6203-4c4b-b383-9dece797ff6a | FLAKY | 1/2 |

@dotnwat (Member) left a comment:

commit message is missing description. why are we assigning field ids (or alternatively, what happens if we don't).

@andrwng (Contributor) left a comment:

LGTM. Echoing Noah's point, for future readers (maybe our future selves) could you leave some context in the commit message: that this was just an oversight and we're just copying what we do for the main table, and what the behavior was before this (like did queries break before or something)

@nvartolomei (Contributor, Author):
@dotnwat @andrwng done! 👍

@nvartolomei nvartolomei merged commit 090f23f into redpanda-data:dev Jan 29, 2025
19 checks passed
@dotnwat (Member) commented Jan 30, 2025

> @dotnwat @andrwng done! 👍

Thanks for adding the description @nvartolomei !
