Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update sequence number assignment method in WBWIMemTable #13400

Closed
wants to merge 2 commits into from

Conversation

cbi42
Copy link
Member

@cbi42 cbi42 commented Feb 14, 2025

Summary: This is a preparation for supporting merge in WBWIMemTable. This PR updates the sequence number assignment method so that it allows efficient and simple assignment when there are multiple entries with the same user key. This can happen when the WBWI contains Merge operations. This assignment relies on tracking the number of updates issued for each key in each WBWI entry (WriteBatchIndexEntry::update_count). Some refactoring is done in WBWI to remove last_entry_offset as part of the WBWI state which I find it harder to use correctly.

Test plan: updated unit tests to check that update count is tracked correctly and WBWIMemTable is assigning sequence number as expected.

@facebook-github-bot
Copy link
Contributor

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@pdillinger pdillinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@@ -161,6 +163,9 @@ struct WriteBatchIndexEntry {
// SeekForPrev() will see all the keys with the same key.
size_t offset;
uint32_t column_family; // column family of the entry.
// The following three fields are only maintained when the WBWI is created
// with overwrite_key = true.
uint32_t update_count; // number of updates for this key up to this entry.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's mention explicitly somewhere that this is a 1-based count, not 0-based

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, updated comments.

//
// WBWI with overwrite mode keeps track of the most recent update for each key,
// so this memtable contains one update per key usually. However, there is a
// special case where this memtable needs to emit an extra SingleDelete even
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume discussion of special case for merge comes later?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will come later. It should be a small change where for merge we don't track only the most recent update.

@cbi42 cbi42 force-pushed the wbwi-seqno-update-count branch from c54ef30 to 45e9204 Compare February 19, 2025 18:34
@facebook-github-bot
Copy link
Contributor

@cbi42 has updated the pull request. You must reimport the pull request before landing.

Copy link
Member Author

@cbi42 cbi42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review.

//
// WBWI with overwrite mode keeps track of the most recent update for each key,
// so this memtable contains one update per key usually. However, there is a
// special case where this memtable needs to emit an extra SingleDelete even
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will come later. It should be a small change where for merge we don't track only the most recent update.

@@ -161,6 +163,9 @@ struct WriteBatchIndexEntry {
// SeekForPrev() will see all the keys with the same key.
size_t offset;
uint32_t column_family; // column family of the entry.
// The following three fields are only maintained when the WBWI is created
// with overwrite_key = true.
uint32_t update_count; // number of updates for this key up to this entry.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, updated comments.

@facebook-github-bot
Copy link
Contributor

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@cbi42 merged this pull request in 36838bb.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants