Increase replication factor for some blocks on store-gateways #9944

Closed
56quarters opened this issue Nov 18, 2024 · 0 comments · Fixed by #10382
Each block owned by store-gateways is replicated to three store-gateways by default. When many queries touch a particular block, this can result in unbalanced CPU usage between store-gateways, leading to higher costs. Recent blocks are queried far more often than older blocks.

From an internal cluster, we see that most queries only touch the most recent data:

  • ~92% of Select() calls that hit store-gateways touch data from the last 25h
  • ~50% of Select() calls that hit store-gateways touch data from the last 73h
  • Less than 1% of Select() calls that hit store-gateways touch data older than 28d ago
  • Less than 0.1% of Select() calls that hit store-gateways touch data older than 30d ago

To spread load for more recent blocks across more store-gateways, we should introduce the ability to override the configured replication factor (three by default) with something higher. The mechanism for picking the overridden replication factor may be configurable or may be based on a variety of factors.

This issue proposes adding the ability to override the replication factor, with a default behavior of doing this based on the age or duration of blocks, and iterating on the exact behavior in further PRs.

copied from an internal issue and discussion
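
To make the proposal concrete, here is a minimal sketch of an age-based override. It is only an illustration: the function name `replicationFactorForAge`, the 25h cutoff (borrowed from the statistic above), and the value 6 for recent blocks are all assumptions, not Mimir's actual behavior or configuration.

```go
package main

import (
	"fmt"
	"time"
)

// recentWindow is an assumed cutoff, taken from the "last 25h" statistic
// in the issue. It is not a real Mimir setting.
const recentWindow = 25 * time.Hour

// replicationFactorForAge returns recentRF for blocks whose age falls
// inside the window and defaultRF otherwise. Hypothetical helper.
func replicationFactorForAge(age, window time.Duration, defaultRF, recentRF int) int {
	if recentRF > defaultRF && age >= 0 && age <= window {
		return recentRF
	}
	return defaultRF
}

func main() {
	// A 2h old block gets the higher factor, a 30d old block the default.
	fmt.Println(replicationFactorForAge(2*time.Hour, recentWindow, 3, 6))
	fmt.Println(replicationFactorForAge(30*24*time.Hour, recentWindow, 3, 6))
}
```

The override never goes below the default here, so older blocks keep the replication they have today.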

56quarters added a commit to grafana/dskit that referenced this issue Nov 18, 2024
This change adds a new method that accepts 0 or more `Option` instances
that modify the behavior of the call. These options can (currently) be
used to adjust the replication factor for a particular key or use buffers
to avoid excessive allocation.

Part of grafana/mimir#9944
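
For readers unfamiliar with the pattern this commit message describes, the following is a rough sketch of a method that takes zero or more options. The type `toyRing`, the option names, and their signatures are assumptions made for illustration and should not be read as the dskit API; the "ring" here is a plain slice walked in order, not a token-based hash ring.

```go
package main

import "fmt"

// callOptions holds per-call settings. Hypothetical type.
type callOptions struct {
	replicationFactor int
	hostsBuf          []string
}

// Option modifies the behavior of a single call.
type Option func(*callOptions)

// WithReplicationFactor overrides the replication factor for one call.
func WithReplicationFactor(rf int) Option {
	return func(o *callOptions) { o.replicationFactor = rf }
}

// WithBuffer supplies a reusable slice to avoid allocating the result.
func WithBuffer(buf []string) Option {
	return func(o *callOptions) { o.hostsBuf = buf }
}

// toyRing is a stand-in for a real ring: instances are visited in order.
type toyRing struct {
	defaultRF int
	instances []string
}

// GetWithOptions returns the instances for key, honoring any options.
func (r *toyRing) GetWithOptions(key uint32, opts ...Option) []string {
	o := callOptions{replicationFactor: r.defaultRF}
	for _, opt := range opts {
		opt(&o)
	}
	n := len(r.instances)
	if n == 0 || o.replicationFactor <= 0 {
		return nil
	}
	rf := o.replicationFactor
	if rf > n {
		rf = n // cannot replicate to more instances than exist
	}
	out := o.hostsBuf[:0] // reuse the caller's buffer when one was given
	start := int(key % uint32(n))
	for i := 0; i < rf; i++ {
		out = append(out, r.instances[(start+i)%n])
	}
	return out
}

func main() {
	r := &toyRing{defaultRF: 3, instances: []string{"sg-0", "sg-1", "sg-2", "sg-3", "sg-4"}}
	fmt.Println(r.GetWithOptions(4))                           // default replication factor
	fmt.Println(r.GetWithOptions(4, WithReplicationFactor(5))) // per-call override
}
```

Calling with no options keeps the existing behavior, which is what lets a new method like this sit alongside the original one.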
56quarters added a commit to grafana/dskit that referenced this issue Jan 3, 2025
This change adds a new method that accepts 0 or more `Option` instances
that modify the behavior of the call. These options can (currently) be
used to adjust the replication factor for a particular key or use buffers
to avoid excessive allocation.

The most notable changes are in the `Ring.findInstancesForKey` method
which is the core of the `Ring.Get` method. Instead of keeping track
of distinct zones and assuming that only a single instance per zone
would ever be returned, we keep a map of the number of instances
found in each zone.

Part of grafana/mimir#9944
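
The per-zone counting described in this commit message can be sketched roughly as follows. This is not the dskit implementation: `pickInstances`, the `instance` type, and the rule that each zone may hold up to `ceil(rf / numZones)` instances are assumptions chosen to illustrate why a count per zone is needed once the replication factor can exceed the number of zones.

```go
package main

import "fmt"

// instance is a hypothetical ring member with an ID and a zone.
type instance struct {
	ID   string
	Zone string
}

// pickInstances walks candidates in ring order and selects up to rf of
// them, allowing at most ceil(rf/numZones) per zone (an assumed rule).
func pickInstances(candidates []instance, rf, numZones int) []instance {
	if rf <= 0 || numZones <= 0 {
		return nil
	}
	maxPerZone := (rf + numZones - 1) / numZones
	foundPerZone := make(map[string]int, numZones) // zone -> instances picked
	picked := make([]instance, 0, rf)
	for _, c := range candidates {
		if len(picked) == rf {
			break
		}
		if foundPerZone[c.Zone] >= maxPerZone {
			continue // this zone already has its share
		}
		foundPerZone[c.Zone]++
		picked = append(picked, c)
	}
	return picked
}

func main() {
	ringOrder := []instance{
		{"a1", "zone-a"}, {"a2", "zone-a"}, {"b1", "zone-b"}, {"a3", "zone-a"},
		{"c1", "zone-c"}, {"b2", "zone-b"}, {"c2", "zone-c"},
	}
	fmt.Println(pickInstances(ringOrder, 3, 3)) // one instance per zone
	fmt.Println(pickInstances(ringOrder, 6, 3)) // two instances per zone
}
```

With a replication factor of 3 and three zones this reduces to the old "one instance per zone" behavior, since the per-zone limit is 1.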
56quarters added a commit to grafana/dskit that referenced this issue Jan 6, 2025
This change adds a new method that accepts 0 or more `Option` instances
that modify the behavior of the call. These options can (currently) be
used to adjust the replication factor for a particular key or use buffers
to avoid excessive allocation.

The most notable changes are in the `Ring.findInstancesForKey` method
which is the core of the `Ring.Get` method. Instead of keeping track
of distinct zones and assuming that only a single instance per zone
would ever be returned, we keep a map of the number of instances
found in each zone.

Part of grafana/mimir#9944
56quarters added a commit to grafana/dskit that referenced this issue Jan 13, 2025
This change adds a new method that accepts 0 or more `Option` instances
that modify the behavior of the call. These options can (currently) be
used to adjust the replication factor for a particular key or use buffers
to avoid excessive allocation.

The most notable changes are in the `Ring.findInstancesForKey` method
which is the core of the `Ring.Get` method. Instead of keeping track
of distinct zones and assuming that only a single instance per zone
would ever be returned, we keep a map of the number of instances
found in each zone.

Part of grafana/mimir#9944
56quarters added a commit to grafana/dskit that referenced this issue Jan 28, 2025
* ring: add GetWithOptions method to adjust per call behavior

This change adds a new method that accepts 0 or more `Option` instances
that modify the behavior of the call. These options can (currently) be
used to adjust the replication factor for a particular key or use buffers
to avoid excessive allocation.

The most notable changes are in the `Ring.findInstancesForKey` method
which is the core of the `Ring.Get` method. Instead of keeping track
of distinct zones and assuming that only a single instance per zone
would ever be returned, we keep a map of the number of instances
found in each zone.

Part of grafana/mimir#9944

* Speed up ring key ownership tests

Signed-off-by: Nick Pillitteri <[email protected]>

* Code review changes

Signed-off-by: Nick Pillitteri <[email protected]>

* Expanded test cases for GetWithOptions

Signed-off-by: Nick Pillitteri <[email protected]>

* Explicitly use configured replication factor for number of zones

This change uses the configured replication factor for the ring to determine
the number of zones that should exist when zone-aware replication is enabled

Signed-off-by: Nick Pillitteri <[email protected]>

* Changelog

Signed-off-by: Nick Pillitteri <[email protected]>

---------

Signed-off-by: Nick Pillitteri <[email protected]>
56quarters added a commit that referenced this issue Feb 12, 2025
Instead of replicating blocks to all store-gateways when they are eligible
for dynamic replication, adjust the replication factor by a multiple of the
default. This reduces disk and memory requirements for large tenants.

Related #10382
Related #9944

Signed-off-by: Nick Pillitteri <[email protected]>
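
As a rough illustration of the change this commit message describes, the sketch below compares replicating to every store-gateway with using a multiple of the default. The function `dynamicRF`, its parameters, and the example multiple of 2 are assumptions for illustration only, not the Mimir code or its configuration.

```go
package main

import "fmt"

// dynamicRF returns the replication factor for a block. Eligible blocks
// get defaultRF*multiple, capped at the number of store-gateways.
// Hypothetical helper; the real logic may differ.
func dynamicRF(defaultRF, multiple, numGateways int, eligible bool) int {
	if !eligible || multiple <= 1 {
		return defaultRF
	}
	rf := defaultRF * multiple
	if rf > numGateways {
		rf = numGateways // cannot exceed the available store-gateways
	}
	if rf < defaultRF {
		rf = defaultRF // never go below the configured default
	}
	return rf
}

func main() {
	const gateways = 30
	fmt.Println("replicate to all:", gateways)
	fmt.Println("multiple of default:", dynamicRF(3, 2, gateways, true))
}
```

In this example an eligible block lands on 6 store-gateways instead of all 30, which is where the disk and memory savings for large tenants would come from.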
56quarters added a commit that referenced this issue Feb 13, 2025
…10637)

Instead of replicating blocks to all store-gateways when they are eligible
for dynamic replication, adjust the replication factor by a multiple of the
default. This reduces disk and memory requirements for large tenants.

Related #10382
Related #9944

Signed-off-by: Nick Pillitteri <[email protected]>