
datalake: assign field ids to dlq record type #24966

Merged
merged 1 commit into redpanda-data:dev on Jan 29, 2025

Conversation

@nvartolomei (Contributor) commented Jan 28, 2025

datalake: assign field ids to dlq record type

When testing iceberg integration with a proper rest catalog a bug was
found which caused translation to fail with the following log messages:

```
Error std::invalid_argument (Expected accessor for field id 5) while partitioning value: struct{struct{int(0), long(4), timestamp(1738071534491000), none, binary(size_bytes=0), }, binary(size_bytes=77), }
Error adding data to DLQ writer for record 4: Data Writer Error:1
```

The problem was a mismatch between the schema we created locally and the
partitioning spec returned by the catalog.

The schema used for writing (in particular, its field ids) should have been
taken from the REST catalog. The bug was an omission caused by my
misunderstanding of the translation machinery.

Added the missing call, mirroring how it is done for main table data
writing.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

  • none

```diff
@@ -241,6 +241,10 @@ record_multiplexer::operator()(model::record_batch batch) {
 ss::future<result<record_multiplexer::write_result, writer_error>>
 record_multiplexer::end_of_stream() {
     if (_error) {
         // TODO: Call finish on all writeres to clean up resources. Redpanda
```
@nvartolomei (Contributor, Author):

The crash is only in debug actually.

Contributor:

Is this related to the change in this PR?

@vbotbuildovich (Collaborator) commented Jan 28, 2025

CI test results

test results on build#61303

| test_id | test_kind | job_url | test_status | passed |
| --- | --- | --- | --- | --- |
| rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery | ducktape | https://buildkite.com/redpanda/redpanda/builds/61303#0194adb3-3719-47fb-ba42-ac178098fd2a | FLAKY | 1/2 |
| rptest.tests.datalake.compaction_test.CompactionGapsTest.test_translation_no_gaps.cloud_storage_type=CloudStorageType.S3 | ducktape | https://buildkite.com/redpanda/redpanda/builds/61303#0194adb2-84b8-47b3-9880-c6debf86772f | FLAKY | 1/3 |
| rptest.tests.e2e_shadow_indexing_test.ShadowIndexingWhileBusyTest.test_create_or_delete_topics_while_busy.short_retention=True.cloud_storage_type=CloudStorageType.ABS | ducktape | https://buildkite.com/redpanda/redpanda/builds/61303#0194adb2-84ba-4fda-ad3e-90d7142facab | FLAKY | 1/2 |

test results on build#61338

| test_id | test_kind | job_url | test_status | passed |
| --- | --- | --- | --- | --- |
| rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery | ducktape | https://buildkite.com/redpanda/redpanda/builds/61338#0194b19a-849d-4a14-83c2-5d9eb1bc1b88 | FLAKY | 1/3 |
| rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery | ducktape | https://buildkite.com/redpanda/redpanda/builds/61338#0194b1b6-04cb-40bd-886d-d90683afbaf9 | FLAKY | 1/2 |
| rptest.tests.compaction_recovery_test.CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade | ducktape | https://buildkite.com/redpanda/redpanda/builds/61338#0194b19a-849e-4100-85cb-cc4b588bdbcb | FLAKY | 1/2 |
| rptest.tests.compaction_recovery_test.CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade | ducktape | https://buildkite.com/redpanda/redpanda/builds/61338#0194b1b6-04c9-4a9a-9988-0c0501f99851 | FLAKY | 1/2 |
| storage_single_thread_rpunit.storage_single_thread_rpunit | unit | https://buildkite.com/redpanda/redpanda/builds/61338#0194b156-6203-4c4b-b383-9dece797ff6a | FLAKY | 1/2 |

@dotnwat (Member) left a comment:

commit message is missing description. why are we assigning field ids (or alternatively, what happens if we don't).

@andrwng (Contributor) left a comment:

LGTM. Echoing Noah's point, for future readers (maybe our future selves) could you leave some context in the commit message: that this was just an oversight and we're just copying what we do for the main table, and what the behavior was before this (like did queries break before or something)

@nvartolomei (Contributor, Author):
@dotnwat @andrwng done! 👍

@nvartolomei nvartolomei merged commit 090f23f into redpanda-data:dev Jan 29, 2025
19 checks passed
@dotnwat (Member) commented Jan 30, 2025

> @dotnwat @andrwng done! 👍

Thanks for adding the description @nvartolomei !
