Update the active expire algorithm on the EXPIRE command page #230

Merged
merged 3 commits into from
Feb 20, 2025

Collaborator

@nastena1606 nastena1606 commented Feb 11, 2025

Change the description of the active expire algorithm to describe the current scan-based approach instead of the old random sampling approach, which is no longer used.

Fixes #185

This PR addresses issue valkey-io#185

Signed-off-by: Anastasia Alexadrova <[email protected]>
Contributor

@zuiderkwast zuiderkwast left a comment

Great work! Thanks!

These keys should be expired anyway, so periodically Valkey tests a few keys at
random among keys with an expire set.
All the keys that are already expired are deleted from the keyspace.
During the "slow cycle", Valkey scans 20 keys per database loop. It tolerates having not more than 10% of the expired keys in the memory and tries to use a maximum of 25% CPU power. These default values are adjusted if the user changes the active expire key effort configuration.
Contributor

You've seen this code comment in expire.c, right?

https://github.com/valkey-io/valkey/blob/8.0.2/src/expire.c#L74

If not, I have no idea how you would know all this. 😄

Here, we use the term database loop without first explaining what it means. I think it's too technical. I would just skip everything about database loop. If anyone wants to understand this level of detail, they should look up the source code IMO.

Also, the slow and fast cycles are fairly technical details, but these are explained below. I think we should first explain something more high-level about the algorithm, like this from the code comment:

The algorithm used is adaptive and will use few CPU cycles if there are few expiring keys, otherwise it will get more aggressive to avoid that too much memory is used

We should also say that it tries to keep the share of expired keys below 10% while using no more than 25% of CPU time. This is more basic, high-level information. Then, below, go into more detail.
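To make the adaptive behaviour concrete, here is a rough Python sketch of the idea, not the actual code in expire.c. The function and constant names are made up for illustration. The batch size of 20, the 10% threshold and the 25% CPU share are the defaults mentioned in this thread, and deriving the time budget from `hz` is my assumption about how the CPU share turns into a time limit.

```python
import itertools
import time

# Defaults quoted in this thread; the names are hypothetical, not Valkey identifiers.
KEYS_PER_LOOP = 20         # keys examined per batch
ACCEPTABLE_STALE_PCT = 10  # stop once at most 10% of a batch was expired
SLOW_CYCLE_CPU_PCT = 25    # share of each cron period the cycle may use


def slow_cycle_budget_s(hz, cpu_pct=SLOW_CYCLE_CPU_PCT):
    """Time budget in seconds per call, assuming the cycle runs `hz` times per second."""
    return (cpu_pct / 100) / hz


def active_expire_cycle(expires, now, budget_s, clock=time.monotonic):
    """Delete expired keys from `expires` (key -> deadline) and return how many were deleted.

    Keys are scanned in batches. The scan stops early when a batch contains
    few expired keys (cheap when there is little to do) and keeps going when
    many are expired (more aggressive), until the time budget is used up.
    """
    start = clock()
    deleted = 0
    # Snapshot the items so that deleting while scanning is safe.
    cursor = iter(list(expires.items()))
    while True:
        sampled = expired = 0
        for key, deadline in itertools.islice(cursor, KEYS_PER_LOOP):
            sampled += 1
            if deadline <= now:
                del expires[key]
                expired += 1
        deleted += expired
        if sampled == 0:
            break  # nothing left to scan
        if expired * 100 / sampled <= ACCEPTABLE_STALE_PCT:
            break  # few stale keys in this batch: stop early
        if clock() - start > budget_s:
            break  # CPU budget for this call is exhausted
    return deleted
```

With `hz` set to 10, `slow_cycle_budget_s(10)` gives a budget of about 25 ms per call. A keyspace where the first batch contains no expired keys stops after that single batch, while a keyspace full of expired keys is scanned until either everything has been visited or the budget runs out.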

Contributor

I mean we can skip everything related to ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP.

CRON_DBS_PER_CALL is 16 (defined in server.h) and it's the number of databases that are scanned in each cron cycle. But many users use only one database so in many cases it's irrelevant. In cluster mode, it's impossible to use more than one database. That's why I think this is mostly too internal. It's basically a safeguard to make sure the server doesn't get stuck if there are too many databases, which is an edge case.
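For readers who want a picture of that safeguard anyway, here is a small Python sketch of visiting at most 16 databases per call and resuming where the previous call stopped. `CRON_DBS_PER_CALL` and its value come from the comment above. The `DbRotation` class and its method are hypothetical, and the wrap-around behaviour is my assumption rather than a copy of the server code.

```python
CRON_DBS_PER_CALL = 16  # value mentioned above (defined in server.h)


class DbRotation:
    """Hypothetical helper: picks which database indexes one cron call visits."""

    def __init__(self, dbnum):
        self.dbnum = dbnum    # number of configured databases
        self.current_db = 0   # persists between calls so the next call resumes here

    def next_batch(self):
        # Never visit more databases than exist, and never more than the cap.
        per_call = min(CRON_DBS_PER_CALL, self.dbnum)
        batch = []
        for _ in range(per_call):
            batch.append(self.current_db % self.dbnum)  # wrap around at the end
            self.current_db += 1
        return batch
```

With a single database, every call simply visits database 0, which is why the cap rarely matters in practice. With 40 databases, three consecutive calls would visit 0..15, then 16..31, then 32..39 followed by 0..7.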

Collaborator Author

You've seen this code comment in expire.c, right?

https://github.com/valkey-io/valkey/blob/8.0.2/src/expire.c#L74

I did :)

If not, I have no idea how you would know all this. 😄

Here, we use the term database loop without first explaining what it means. I think it's too technical. I would just skip everything about database loop. If anyone wants to understand this level of detail, they should look up the source code IMO.

Agreed, I removed it from the description.

Also, the slow and fast cycles are fairly technical details, but these are explained below. I think we should first explain something more high-level about the algorithm, like this from the code comment:

Updated.

nastena1606 and others added 2 commits February 20, 2025 17:24
Co-authored-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Anastasia Alexandrova <[email protected]>
Signed-off-by: Anastasia Alexadrova <[email protected]>
Contributor

@zuiderkwast zuiderkwast left a comment

Great, I think this is good to merge. Thanks again!

@zuiderkwast zuiderkwast changed the title Update the Expire keys section Update the description of the active expire algorithm on the EXPIRE command page Feb 20, 2025
@zuiderkwast zuiderkwast changed the title Update the description of the active expire algorithm on the EXPIRE command page Update the active expire algorithm on the EXPIRE command page Feb 20, 2025
@zuiderkwast zuiderkwast merged commit 3e334ea into valkey-io:main Feb 20, 2025
2 checks passed
Successfully merging this pull request may close these issues.

EXPIRE command docs incorrect says active expire uses random sampling
2 participants