Update the active expire algorithm on the EXPIRE command page #230
This PR addresses issue valkey-io#185 Signed-off-by: Anastasia Alexadrova <[email protected]>
Great work! Thanks!
commands/expire.md
Outdated
These keys should be expired anyway, so periodically Valkey tests a few keys at
random among keys with an expire set.
All the keys that are already expired are deleted from the keyspace.
During the "slow cycle", Valkey scans 20 keys per database loop. It tolerates having not more than 10% of the expired keys in the memory and tries to use a maximum of 25% CPU power. These default values are adjusted if the user changes the active expire key effort configuration.
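For readers following the thread, the way an effort setting could adjust the defaults quoted above can be sketched in a few lines of Python. This is an illustration, not Valkey code: the function name `scaled_limits` is hypothetical, and the scaling steps are an assumption based on one reading of expire.c in Valkey 8.0.2, so they should be checked against the source before being relied on.

```python
# Baseline values quoted in the proposed text: 20 keys, 10% stale, 25% CPU.
KEYS_PER_LOOP = 20
ACCEPTABLE_STALE_PERCENT = 10
SLOW_TIME_PERCENT = 25


def scaled_limits(effort: int) -> dict:
    """Return illustrative limits for an effort setting from 1 to 10.

    The scaling steps below are an assumption, not a confirmed formula.
    """
    if not 1 <= effort <= 10:
        raise ValueError("effort must be between 1 and 10")
    step = effort - 1  # rescale to 0..9 so that effort 1 yields the defaults
    return {
        # Scan more keys per iteration as effort grows.
        "keys_per_loop": KEYS_PER_LOOP + KEYS_PER_LOOP // 4 * step,
        # Tolerate a smaller share of expired keys as effort grows.
        "acceptable_stale_percent": ACCEPTABLE_STALE_PERCENT - step,
        # Allow a larger CPU share as effort grows.
        "slow_time_percent": SLOW_TIME_PERCENT + 2 * step,
    }
```

With effort 1 the sketch returns the three defaults unchanged, which matches the statement that the quoted numbers are defaults.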
You've seen this code comment in expire.c right?
https://github.com/valkey-io/valkey/blob/8.0.2/src/expire.c#L74
If not, I have no idea how you would know all this. 😄
Here, we use the term database loop without first explaining what it means. I think it's too technical. I would just skip everything about database loop. If anyone wants to understand this level of detail, they should look up the source code IMO.
Also the slow and the fast cycle are quite some technical details but these are explained below. But I think we should first explain something more high level about the algorithm, like this from the code comment:
The algorithm used is adaptive and will use few CPU cycles if there are few expiring keys, otherwise it will get more aggressive to avoid that too much memory is used
We should also say that it tries to keep the share of expired keys below 10% while using no more than 25% of CPU time. This is the basic, high-level information. Then, below, go into more detail.
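The adaptive behaviour described in this comment can be illustrated with a small, simplified Python sketch. It is not the code in expire.c: the function name, the `time_budget_s` parameter (a stand-in for the CPU limit), and the use of a key snapshot in place of a real scan cursor are all assumptions made for the example.

```python
import time


def active_expire_cycle(expires, now, keys_per_loop=20,
                        acceptable_stale=0.10, time_budget_s=0.025):
    """Delete expired keys from `expires` (key -> expiry timestamp).

    Scans keys in batches and keeps going only while the batches contain
    a high share of expired keys and the time budget is not used up.
    Returns the number of keys deleted.
    """
    keys = list(expires)  # snapshot standing in for a scan cursor
    deadline = time.monotonic() + time_budget_s
    cursor = 0
    deleted = 0
    while cursor < len(keys):
        batch = keys[cursor:cursor + keys_per_loop]
        cursor += len(batch)
        expired = [k for k in batch if expires[k] <= now]
        for k in expired:
            del expires[k]
        deleted += len(expired)
        # Few expired keys in this batch: stop early and save CPU.
        if len(expired) / len(batch) <= acceptable_stale:
            break
        # Many expired keys: continue, but never past the time budget.
        if time.monotonic() >= deadline:
            break
    return deleted
```

When only a handful of keys have expired, the loop exits after the first batch. When most keys have expired, it keeps scanning until it runs out of keys or time, which is the "more aggressive" mode the code comment refers to.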
I mean we can skip everything related to ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP. CRON_DBS_PER_CALL is 16 (defined in server.h) and it's the number of databases that are scanned in each cron cycle. But many users use only one database so in many cases it's irrelevant. In cluster mode, it's impossible to use more than one database. That's why I think this is mostly too internal. It's basically a safeguard to make sure the server doesn't get stuck if there are too many databases, which is an edge case.
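To make the safeguard concrete, here is a minimal Python sketch of capping the number of databases visited per call. The function name and the round-robin resume index are assumptions for illustration. Only the value 16 comes from the comment above.

```python
CRON_DBS_PER_CALL = 16  # value quoted in the comment above


def dbs_for_this_call(num_dbs, next_db, dbs_per_call=CRON_DBS_PER_CALL):
    """Return (database indexes to visit now, index to resume from next time).

    Hypothetical helper: visits at most `dbs_per_call` databases, wrapping
    around so that every database is eventually reached.
    """
    count = min(dbs_per_call, num_dbs)
    visited = [(next_db + i) % num_dbs for i in range(count)]
    return visited, (next_db + count) % num_dbs
```

With a single database the cap never applies, since every call visits database 0. It only matters when the server is configured with more databases than the cap.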
> You've seen this code comment in expire.c right?
> https://github.com/valkey-io/valkey/blob/8.0.2/src/expire.c#L74
I did:)
> If not, I have no idea how you would know all this. 😄
> Here, we use the term database loop without first explaining what it means. I think it's too technical. I would just skip everything about database loop. If anyone wants to understand this level of detail, they should look up the source code IMO.
Agree, I removed it from the description.
> Also the slow and the fast cycle are quite some technical details but these are explained below. But I think we should first explain something more high level about the algorithm, like this from the code comment:
Updated.
Co-authored-by: Viktor Söderqvist <[email protected]> Signed-off-by: Anastasia Alexandrova <[email protected]>
Signed-off-by: Anastasia Alexadrova <[email protected]>
Great, I think this is good to merge. Thanks again!
Change the description of the active expire algorithm to describe the current scan-based approach instead of the old random sampling approach, which is no longer used.
Fixes #185