Update the active expire algorithm on the EXPIRE command page #230

Merged
merged 3 commits into from
Feb 20, 2025
Changes from 1 commit
40 changes: 21 additions & 19 deletions commands/expire.md
@@ -138,32 +138,34 @@ you set a key with a time to live of 1000 seconds, and then set your computer
time 2000 seconds in the future, the key will be expired immediately, instead of
lasting for 1000 seconds.

## How Valkey expires keys
## How Valkey reclaims expired keys

Valkey keys are expired in two ways: a passive way, and an active way.
Valkey reclaims expired keys in two ways: on access, and in the background in what are called "active expire key" cycles. On-access expiration happens when a client tries to access a key whose expiration time has already passed. Such a key is deleted during that access attempt.
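
On-access expiration as described above can be illustrated with a minimal sketch. The `ToyStore` class, its method names, and the injectable `clock` are hypothetical and exist only for illustration; this is not how Valkey implements it internally.

```python
import time


class ToyStore:
    """Hypothetical toy key-value store; not Valkey's implementation."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> value
        self._expires = {}   # key -> absolute deadline, in clock units
        self._clock = clock  # injectable so the behaviour can be tested

    def set(self, key, value, ttl=None):
        self._data[key] = value
        if ttl is None:
            self._expires.pop(key, None)  # no expiration on this key
        else:
            self._expires[key] = self._clock() + ttl

    def get(self, key):
        deadline = self._expires.get(key)
        if deadline is not None and self._clock() >= deadline:
            # The key is found to be timed out on access: reclaim it now.
            del self._data[key]
            del self._expires[key]
            return None
        return self._data.get(key)
```

A key that is never read again would stay in `_data` forever in this sketch, which is the gap the background cycles are meant to close.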

A key is passively expired simply when some client tries to access it, and the
key is found to be timed out.
Relying solely on on-access expiration is not enough, because some expired keys will never be accessed again. To address this, Valkey uses background expiration, known as the "active expire key" effort. Valkey slowly and incrementally scans the keyspace to identify and reclaim expired keys. This slow cycle is the main way to collect expired keys, and it runs at the server's hertz frequency (the `hz` setting, usually 10 hertz).

Of course this is not enough as there are expired keys that will never be
accessed again.
These keys should be expired anyway, so periodically Valkey tests a few keys at
random among keys with an expire set.
All the keys that are already expired are deleted from the keyspace.
During the "slow cycle", Valkey scans 20 keys per database loop. It tolerates at most 10% of expired keys still in memory and tries to use at most 25% of CPU time. These default values are adjusted if the user changes the `active-expire-effort` configuration.

Contributor

You've seen this code comment in expire.c right?

https://github.com/valkey-io/valkey/blob/8.0.2/src/expire.c#L74

If not, I have no idea how you would know all this. 😄

Here, we use the term database loop without first explaining what it means. I think it's too technical. I would just skip everything about database loop. If anyone wants to understand this level of detail, they should look up the source code IMO.

Also the slow and the fast cycle are quite some technical details but these are explained below. But I think we should first explain something more high level about the algorithm, like this from the code comment:

> The algorithm used is adaptive and will use few CPU cycles if there are few expiring keys, otherwise it will get more aggressive to avoid that too much memory is used

And that it tries to keep the expired keys below 10% and using not more than 25% CPU time. This is more basic high level info about it. Then, below, go into more details.

Contributor

I mean we can skip everything related to ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP.

CRON_DBS_PER_CALL is 16 (defined in server.h) and it's the number of databases that are scanned in each cron cycle. But many users use only one database so in many cases it's irrelevant. In cluster mode, it's impossible to use more than one database. That's why I think this is mostly too internal. It's basically a safeguard to make sure the server doesn't get stuck if there are too many databases, which is an edge case.

Collaborator Author

> You've seen this code comment in expire.c right?
>
> https://github.com/valkey-io/valkey/blob/8.0.2/src/expire.c#L74

I did:)

> If not, I have no idea how you would know all this. 😄
>
> Here, we use the term database loop without first explaining what it means. I think it's too technical. I would just skip everything about database loop. If anyone wants to understand this level of detail, they should look up the source code IMO.

Agree, I removed it from the description.

> Also the slow and the fast cycle are quite some technical details but these are explained below. But I think we should first explain something more high level about the algorithm, like this from the code comment:

Updated.


Specifically this is what Valkey does 10 times per second:
If the number of expired keys remains high after the slow cycle, the active expire key effort transitions into the "fast cycle", trying to do less work but more often. The fast cycle runs no longer than 1000 microseconds and repeats at the same interval. During the fast cycle, the check of every database is interrupted once the number of already expired keys in the database is estimated to be lower than 10%. This is done to avoid doing too much work to gain too little memory.

1. Test 20 random keys from the set of keys with an associated expire.
2. Delete all the keys found expired.
3. If more than 25% of keys were expired, start again from step 1.
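
The three numbered steps above (the description this change replaces) can be sketched as a simple sampling loop. This is an illustrative approximation only: the function name, the `expires` dictionary mapping keys to expiry timestamps, and the `rng` parameter are assumptions made for the example, not Valkey internals.

```python
import random


def sampled_expire_pass(expires, now, sample_size=20, threshold=0.25, rng=random):
    """Delete expired keys from `expires` (key -> expiry time) in place.

    Hypothetical sketch of the sampling loop: test up to `sample_size`
    random keys, delete the expired ones, and repeat while more than
    `threshold` of the sample was expired. Returns the number removed.
    """
    removed = 0
    while expires:
        # Step 1: test random keys among those with an expire set.
        sample = rng.sample(list(expires), min(sample_size, len(expires)))
        # Step 2: delete all the keys found expired.
        expired = [key for key in sample if expires[key] <= now]
        for key in expired:
            del expires[key]
        removed += len(expired)
        # Step 3: start again only if more than 25% of the sample was expired.
        if len(expired) <= threshold * len(sample):
            break
    return removed
```

Each repetition removes at least one key, so the loop always terminates; a real server would additionally bound the time spent per call.
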
You can modify the active expire key effort with the `active-expire-effort` parameter in the configuration file up to the maximum value of `10`. The default `active-expire-effort` value is `1`, and it is described by the following base values:

This is a trivial probabilistic algorithm. The assumption is that our
sample is representative of the whole key space, and we continue to expire until
the percentage of keys that are likely to be expired is under 25%.
* `ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP` = 20 – The number of keys for each DB loop.
* `ACTIVE_EXPIRE_CYCLE_FAST_DURATION` = 1000 – The maximum duration of the fast cycle in microseconds.
* `ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC` = 25 – The maximum % of CPU to use during the slow cycle.
* `ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE` = 10 – The maximum % of expired keys to tolerate in memory.

Increasing the `active-expire-effort` value lowers the percentage of expired keys tolerated in memory. However, it also leads to longer cycles and increased CPU usage, which may introduce latency.

To calculate the new values, use the following formulas:

* `ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP + (ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP / 4 * effort)`
* `ACTIVE_EXPIRE_CYCLE_FAST_DURATION + (ACTIVE_EXPIRE_CYCLE_FAST_DURATION / 4 * effort)`
* `ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC + (2 * effort)`
* `ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE - effort`

where `ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP`, `ACTIVE_EXPIRE_CYCLE_FAST_DURATION`, `ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC`, and `ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE` are the base values, and `effort` is calculated as the specified `active-expire-effort` value minus 1.
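
As a worked example, the formulas above can be evaluated for any `active-expire-effort` value. The sketch below assumes integer division for the `/ 4` terms (as C integer arithmetic would give) and assumes the accepted range is `1` to `10`; the function name and the dictionary keys are hypothetical.

```python
# Base values as listed in the text above.
ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP = 20
ACTIVE_EXPIRE_CYCLE_FAST_DURATION = 1000   # microseconds
ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC = 25    # percent of CPU
ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE = 10  # percent of expired keys


def effective_expire_params(active_expire_effort):
    """Apply the documented formulas; `effort` is the configured value minus 1."""
    if not 1 <= active_expire_effort <= 10:  # assumed valid range
        raise ValueError("active-expire-effort must be between 1 and 10")
    effort = active_expire_effort - 1
    return {
        "keys_per_loop": ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP
        + ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP // 4 * effort,
        "fast_duration_us": ACTIVE_EXPIRE_CYCLE_FAST_DURATION
        + ACTIVE_EXPIRE_CYCLE_FAST_DURATION // 4 * effort,
        "slow_time_perc": ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC + 2 * effort,
        "acceptable_stale_perc": ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE - effort,
    }
```

Under these assumptions, the default value `1` returns the base values unchanged, while the maximum value `10` gives 65 keys per loop, a 3250-microsecond fast cycle, 43% CPU for the slow cycle, and 1% of expired keys tolerated in memory.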

This means that at any given moment, the number of already expired keys
still using memory is at most equal to the maximum number of write operations
per second divided by 4.

## How expires are handled in the replication link and AOF file
