Increased replication factor for recent blocks in store-gateway #4703

Closed
colega opened this issue Apr 11, 2023 · 4 comments

Comments

@colega
Contributor

colega commented Apr 11, 2023

Is your feature request related to a problem? Please describe.

Store-gateway uses a single replication factor for all blocks, usually 3, which means that every block is replicated exactly 3 times (once per zone, with zone-aware deployment) and at query time only one of the three replicas is queried.

For deployments with long retention periods we can have quite a lot of store-gateways, say 150 in a namespace. However, most queries still land in a recent time range, whose blocks are held by a limited number of store-gateways, so we need to scale all of them just to handle the request rate of the few that are queried most often.

Let's do some math, assuming 150 store-gateways, 50 in each zone.

  • If the last day is queried, only one of 50 store-gateways is queried, while the rest sit there with the same requests, doing nothing.
  • If 8 split-and-merge compactor shards are configured, then 8 out of each 50 are queried, assuming a perfect distribution.
  • If the last 7 days are queried, then it's 7/50=14% of our store-gateways, or 56 blocks (enough to cover all 50) if 8 split-and-merge compactor shards are configured, assuming a perfect distribution.
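
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the numbers in this comment, not store-gateway code: the function name is made up, and it assumes one block per day per compactor shard and a perfect distribution (every block lands on a different store-gateway until all of them are covered).

```python
def gateways_queried_per_zone(days: int, shards: int, gateways_per_zone: int) -> int:
    """Store-gateways in one zone touched by a query over the last `days` days.

    Assumption: one block per day per compactor shard, perfectly distributed,
    so each block sits on a different store-gateway until all are in use.
    """
    blocks = days * shards
    return min(blocks, gateways_per_zone)


GATEWAYS_PER_ZONE = 50  # 150 store-gateways spread over 3 zones

for days, shards in [(1, 1), (1, 8), (7, 1), (7, 8)]:
    hit = gateways_queried_per_zone(days, shards, GATEWAYS_PER_ZONE)
    share = hit / GATEWAYS_PER_ZONE
    print(f"{days}d query, {shards} shard(s): {hit}/{GATEWAYS_PER_ZONE} = {share:.0%}")
```

For the four cases this gives 2%, 16%, 14% and 100% of the store-gateways in a zone.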

Some quick queries show that around 95% of all queries we receive are within the last 1d range.

Describe the solution you'd like

Increase the replication factor of the recent blocks, with a configurable definition of "recent".

As a simple implementation, I'd say we can just multiply the usual replication factor for blocks within the recent period, defaulting to 1 day. But I'd go further.

The split-and-merge compactor also helps distribute load across store-gateways, and maybe it would make sense to ensure that every store-gateway can respond to the most common type of query: the last day?

I propose adjusting the replication factor for that last day whenever the number of split-and-merge compactor shards is adjusted for the tenant (and making that replication factor depend on the tenant ID). In that case, the replication factor should be count(all store-gateways) / split-and-merge compactor shards, i.e. in our example of 150 store-gateways and 8 block shards we would have a replication factor of 18, with each block shard replicated 6 times in each zone.
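
A minimal sketch of that formula, to make the example numbers reproducible. The function name is hypothetical, and both the rounding down and the even split across zones are assumptions chosen to match the 18 and 6 quoted above; nothing here is taken from the store-gateway code.

```python
def recent_replication_factor(total_gateways: int, shards: int, zones: int = 3) -> tuple[int, int]:
    """Return (total replication factor, replicas per zone) for recent blocks.

    Assumption: the division is rounded down, and replicas are spread
    evenly across zones.
    """
    total = total_gateways // shards  # count(all store-gateways) / compactor shards
    per_zone = total // zones
    return total, per_zone


total, per_zone = recent_replication_factor(total_gateways=150, shards=8)
print(f"replication factor: {total}, replicas per zone: {per_zone}")
```

With 150 store-gateways and 8 shards this prints a replication factor of 18 with 6 replicas per zone.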

Note that we're talking about just 1 day, and multiplying one day by 6 means just 1.5% more among the 395 days of our default cloud retention. I think this increase is negligible compared to the benefits of distributing 90% of our queries across all store-gateways.
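
As a rough check of that overhead estimate, here is a small sketch. It assumes that the amount of block data held by store-gateways scales linearly with the number of days retained, which is a simplification; the function names are made up for illustration.

```python
def replicated_share(retention_days: int, recent_days: int, multiplier: int) -> float:
    """Recent-period copies (all of them) relative to total retention."""
    return recent_days * multiplier / retention_days


def extra_share(retention_days: int, recent_days: int, multiplier: int) -> float:
    """Only the additional copies relative to total retention.

    One copy of each recent day is already held today, so only
    (multiplier - 1) copies are new.
    """
    return recent_days * (multiplier - 1) / retention_days


print(f"all six copies:   {replicated_share(395, 1, 6):.1%}")
print(f"five extra copies: {extra_share(395, 1, 6):.1%}")
```

Counting all six copies of the last day gives 6/395, about 1.5%, which is the figure quoted above; counting only the five additional copies gives about 1.3%. Either way the increase stays small.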

@bboreham
Contributor

I could also believe that queries further back than 1 day tend to span multiple days.
The main case where this proposal doesn't help is someone doing intensive queries on one particular day in the past, right?
(For which, dynamically splitting shards according to load might be an idea)

@colega
Contributor Author

colega commented Apr 11, 2023

> The main case where this proposal doesn't help is someone doing intensive queries on one particular day in the past, right?

Right, but as I pointed out in the math above, a 7-day query with a decent number of block shards already covers all store-gateways, or at least many of them. And given those are <5% of queries, I wouldn't focus on optimizing them right now.

@jesusvazquez
Member

> The split-and-merge compactor also helps distribute load across store-gateways, and maybe it would make sense to ensure that every store-gateway can respond to the most common type of query: the last day?

This makes sense to me.

It also takes on special importance during rollouts: with only a few store-gateways replicating the 24-hour blocks, we lose 1/3 of query capacity for 95% of the queries, as opposed to losing only 1/n, where n is the number of store-gateways replicating those blocks.

@colega
Contributor Author

colega commented Feb 12, 2025

This has been implemented in #9944 (thanks @56quarters 👏)

colega closed this as completed Feb 12, 2025