From ff37a37a13a7c845e48a96a17cab68956aad8bc3 Mon Sep 17 00:00:00 2001
From: poyzannur
Date: Fri, 17 Jan 2025 15:02:15 +0000
Subject: [PATCH 01/11] add new tables

---
 docs/sources/setup/size/_index.md | 192 ++++++++++++++++++++++++++----
 1 file changed, 169 insertions(+), 23 deletions(-)

diff --git a/docs/sources/setup/size/_index.md b/docs/sources/setup/size/_index.md
index 162748eb9e3b8..a4cd036e87616 100644
--- a/docs/sources/setup/size/_index.md
+++ b/docs/sources/setup/size/_index.md
@@ -22,29 +22,175 @@ This tool helps to generate a Helm Charts `values.yaml` file based on specified
 [scalable]({{< relref "../../get-started/deployment-modes#simple-scalable" >}}) deployment. The storage needs to be configured after generation.
- - - -
- - GB/day -
- -
- - days -
- -
- - -
+ + +
+ + + +
+
+| Component        | CPU Request | Memory Request | Base Replicas  |
+|------------------|-------------|----------------|----------------|
+| Ingester         | 4           | 8              | 150            |
+| Distributor      | 2           | 1              | 100            |
+| Index gateway    | 1           | 4              | 20             |
+| Querier          | 1.5         | 3              | 250            |
+| Query-frontend   | 1           | 4              | 16             |
+| Query-scheduler  | 2           | 0.5            | 2              |
+| Compactor        | 6           | 20             | 1 (Singleton)  |
+
+| Component        | CPU Request | Memory Request | Base Replicas  |
+|------------------|-------------|----------------|----------------|
+| Ingester         | 2           | 6              | 90             |
+| Distributor      | 2           | 1              | 40             |
+| Index gateway    | 0.5         | 4              | 10             |
+| Querier          | 1.5         | 2              | 100            |
+| Query-frontend   | 1           | 2              | 8              |
+| Query-scheduler  | 1           | 0.5            | 2              |
+| Compactor        | 6           | 20             | 1 (Singleton)  |
+
+| Component        | CPU Request | Memory Request | Base Replicas  |
+|------------------|-------------|----------------|----------------|
+| Ingester         | 2           | 4              | 6              |
+| Distributor      | 2           | 0.5            | 4              |
+| Index gateway    | 0.5         | 2              | 4              |
+| Querier          | 1           | 1              | 10             |
+| Query-frontend   | 1           | 2              | 2              |
+| Query-scheduler  | 1           | 0.5            | 2              |
+| Compactor        | 2           | 10             | 1 (Singleton)  |
+
From c4558f8374678030b62adea5b40d633323dc02d4 Mon Sep 17 00:00:00 2001
From: poyzannur
Date: Fri, 17 Jan 2025 16:41:56 +0000
Subject: [PATCH 02/11] use tabs instead

---
 docs/sources/setup/size/_index.md | 257 +++++-------------------------
 1 file changed, 39 insertions(+), 218 deletions(-)

diff --git a/docs/sources/setup/size/_index.md b/docs/sources/setup/size/_index.md
index a4cd036e87616..ee175f8803694 100644
--- a/docs/sources/setup/size/_index.md
+++ b/docs/sources/setup/size/_index.md
@@ -25,210 +25,53 @@ This tool helps to generate a Helm Charts `values.yaml` file based on specified
-
- - - -
-
-
-| Component        | CPU Request | Memory Request | Base Replicas  |
-|------------------|-------------|----------------|----------------|
-| Ingester         | 4           | 8              | 150            |
-| Distributor      | 2           | 1              | 100            |
-| Index gateway    | 1           | 4              | 20             |
-| Querier          | 1.5         | 3              | 250            |
-| Query-frontend   | 1           | 4              | 16             |
-| Query-scheduler  | 2           | 0.5            | 2              |
-| Compactor        | 6           | 20             | 1 (Singleton)  |
-
-| Component        | CPU Request | Memory Request | Base Replicas  |
-|------------------|-------------|----------------|----------------|
-| Ingester         | 2           | 6              | 90             |
-| Distributor      | 2           | 1              | 40             |
-| Index gateway    | 0.5         | 4              | 10             |
-| Querier          | 1.5         | 2              | 100            |
-| Query-frontend   | 1           | 2              | 8              |
-| Query-scheduler  | 1           | 0.5            | 2              |
-| Compactor        | 6           | 20             | 1 (Singleton)  |
-
-| Component        | CPU Request | Memory Request | Base Replicas  |
-|------------------|-------------|----------------|----------------|
-| Ingester         | 2           | 4              | 6              |
-| Distributor      | 2           | 0.5            | 4              |
-| Index gateway    | 0.5         | 2              | 4              |
-| Querier          | 1           | 1              | 10             |
-| Query-frontend   | 1           | 2              | 2              |
-| Query-scheduler  | 1           | 0.5            | 2              |
-| Compactor        | 2           | 10             | 1 (Singleton)  |
-
+
+
-
- - - - - - - - - - - - - - - -
Read ReplicasWrite ReplicasNodesCoresMemory
{{ clusterSize.TotalReadReplicas }}{{ clusterSize.TotalWriteReplicas }}{{ clusterSize.TotalNodes}}{{ clusterSize.TotalCoresRequest}}{{ clusterSize.TotalMemoryRequest}} GB
-
+{{< tabs >}}
+{{< tab-content name="~1PB/month (30TB/day)" >}}
+| Component        | CPU Request | Memory Request | Base Replicas  |
+|------------------|-------------|----------------|----------------|
+| Ingester         | 4           | 8              | 150            |
+| Distributor      | 2           | 1              | 100            |
+| Index gateway    | 1           | 4              | 20             |
+| Querier          | 1.5         | 3              | 250            |
+| Query-frontend   | 1           | 4              | 16             |
+| Query-scheduler  | 2           | 0.5            | 2              |
+| Compactor        | 6           | 20             | 1 (Singleton)  |
+{{< /tab-content >}}
+{{< tab-content name="100TB to 1PB /month (3-30TB/day)" >}}
+| Component        | CPU Request | Memory Request | Base Replicas  |
+|------------------|-------------|----------------|----------------|
+| Ingester         | 2           | 6              | 90             |
+| Distributor      | 2           | 1              | 40             |
+| Index gateway    | 0.5         | 4              | 10             |
+| Querier          | 1.5         | 2              | 100            |
+| Query-frontend   | 1           | 2              | 8              |
+| Query-scheduler  | 1           | 0.5            | 2              |
+| Compactor        | 6           | 20             | 1 (Singleton)  |
+{{< /tab-content >}}
+{{< tab-content name="Less than 100TB/month (3TB/day)" >}}
+| Component        | CPU Request | Memory Request | Base Replicas  |
+|------------------|-------------|----------------|----------------|
+| Ingester         | 2           | 4              | 6              |
+| Distributor      | 2           | 0.5            | 4              |
+| Index gateway    | 0.5         | 2              | 4              |
+| Querier          | 1           | 1              | 10             |
+| Query-frontend   | 1           | 2              | 2              |
+| Query-scheduler  | 1           | 0.5            | 2              |
+| Compactor        | 2           | 10             | 1 (Singleton)  |
+{{< /tab-content >}}
+{{< /tabs >}}
- Generate and download values file
- - Defines the log volume in gigabytes, ie 1e+9 bytes, expected to be ingested each day. - Defines the node type of the Kubernetes cluster. Is a vendor or type missing? If so, add it to pkg/sizing/node.go. - - Defines how long the ingested logs should be kept. - - - Defines the expected query performance. Basic is sized for a max query throughput of around 3GB/s. Super aims for 25% more throughput. -
-
+
@@ -264,11 +99,7 @@ createApp({
     return {
       nodes: ["Loading..."],
       node: "Loading...",
-      bytesDayIngest: null,
-      retention: null,
-      queryperf: 'Basic',
       help: null,
-      clusterSize: null
     }
   },
@@ -305,20 +136,10 @@ createApp({
       const url = `${API_URL}/nodes`
       this.nodes = await (await fetch(url,{mode: 'cors'})).json()
     },
-    async calculateClusterSize() {
-      if (this.node == 'Loading...' || this.bytesDayIngest== null || this.retention == null) {
-        return
-      }
-      const url = `${API_URL}/cluster?${this.queryString}`
-      this.clusterSize = await (await fetch(url,{mode: 'cors'})).json()
-    }
   },
   watch: {
     node: 'calculateClusterSize',
-    bytesDayIngest: 'calculateClusterSize',
-    retention: 'calculateClusterSize',
-    queryperf: 'calculateClusterSize'
   }
 }).mount('#app')

From c24071e17c892bda90f2324bb9dbaa531f759519 Mon Sep 17 00:00:00 2001
From: poyzannur
Date: Mon, 20 Jan 2025 16:10:48 +0000
Subject: [PATCH 03/11] Add summary and general notes

---
 docs/sources/setup/size/_index.md | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/docs/sources/setup/size/_index.md b/docs/sources/setup/size/_index.md
index ee175f8803694..34ecf64d5a7b8 100644
--- a/docs/sources/setup/size/_index.md
+++ b/docs/sources/setup/size/_index.md
@@ -17,10 +17,18 @@ weight: 100
-This tool helps to generate a Helm Charts `values.yaml` file based on specified
- expected ingestion, retention rate and node type. It will always configure a
- [scalable]({{< relref "../../get-started/deployment-modes#simple-scalable" >}}) deployment. The storage needs to be configured after generation.
+This section is a guide to size base resource needs of a Loki cluster.
+Based on the expected ingestion volume, Loki clusters can be categorised into three tiers. Recommendations below are based on p90 resource utilisations of the relevant components. Each tab represents a different tier.
+Please use this document as a rough guide for base resource needs between these tiers. This is only documented for microservices/distributed mode.
+
+General notes on Query Performance:
+- Query resource needs can greatly vary with usage patterns. The rule of thumb is to run as small and many queriers as possible. Unoptimised queries can easily require 10x of the suggested querier resources below in all tiers. Running horizontal autoscaling will be most cost effective solution to meet the demand.
+- Parallel-querier and related components can be sized the same along with queriers for starters, depending on how much Loki rules are used.
+- Large Loki clusters benefits from disk based caching solution, memcached-extstore. Please see a detailed [blog post](https://grafana.com/blog/2023/08/23/how-we-scaled-grafana-cloud-logs-memcached-cluster-to-50tb-and-improved-reliability/) and read more on [memcached/nvm-caching here](https://memcached.org/blog/nvm-caching/).
+- If you’re running a cluster that handles less than 30TB/day (~1PB/month) ingestion, we do not recommend configuring memcached-extstore. The additional operational complexity does not justify the savings.
+
+These are the node types we suggest from various cloud providers. Please see the relevant specs on the provider documents.
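
The three ingestion tiers named in the tab titles, together with the memcached-extstore note in the patch above, can be sketched as a small Python helper. This is only an illustration: `Tier` and `pick_tier` are hypothetical names, and treating exactly 3TB/day and 30TB/day as the start of the next tier is an assumption, since the document gives approximate ranges only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """A sizing tier, labelled with the tab title used in the document."""
    name: str
    consider_extstore: bool  # False where the notes advise against memcached-extstore


def pick_tier(tb_per_day: float) -> Tier:
    """Map expected ingestion (TB/day) to one of the three documented tiers.

    Assumption: boundaries are half-open, so 3 TB/day lands in the middle
    tier and 30 TB/day in the largest one.
    """
    if tb_per_day < 0:
        raise ValueError("ingestion volume cannot be negative")
    if tb_per_day < 3:
        return Tier("Less than 100TB/month (3TB/day)", False)
    if tb_per_day < 30:
        return Tier("100TB to 1PB /month (3-30TB/day)", False)
    return Tier("~1PB/month (30TB/day)", True)


if __name__ == "__main__":
    for volume in (0.5, 12, 30):
        tier = pick_tier(volume)
        print(f"{volume} TB/day -> {tier.name} (extstore: {tier.consider_extstore})")
```

For example, `pick_tier(12)` falls into the middle tier, where the notes above advise against configuring memcached-extstore.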
From 38124568e1445263643f5e2218d076ffe6270b38 Mon Sep 17 00:00:00 2001
From: poyzannur
Date: Mon, 20 Jan 2025 16:11:43 +0000
Subject: [PATCH 05/11] add total numbers and adjust order

---
 docs/sources/setup/size/_index.md | 58 +++++++++++++++----------------
 1 file changed, 29 insertions(+), 29 deletions(-)

diff --git a/docs/sources/setup/size/_index.md b/docs/sources/setup/size/_index.md
index ca9035d3a26d7..68f2cbfac6b8b 100644
--- a/docs/sources/setup/size/_index.md
+++ b/docs/sources/setup/size/_index.md
@@ -37,38 +37,38 @@ These are the node types we suggest from various cloud providers. Please see the
 {{< tabs >}}
-{{< tab-content name="~1PB/month (30TB/day)" >}}
-| Component        | CPU Request | Memory Request | Base Replicas  |
-|------------------|-------------|----------------|----------------|
-| Ingester         | 4           | 8              | 150            |
-| Distributor      | 2           | 1              | 100            |
-| Index gateway    | 1           | 4              | 20             |
-| Querier          | 1.5         | 3              | 250            |
-| Query-frontend   | 1           | 4              | 16             |
-| Query-scheduler  | 2           | 0.5            | 2              |
-| Compactor        | 6           | 20             | 1 (Singleton)  |
+{{< tab-content name="Less than 100TB/month (3TB/day)" >}}
+| Component        | CPU Request | Memory Request (Gi) | Base Replicas  | Total CPU Req | Total Mem Req (Gi) |
+|------------------|-------------|---------------------|----------------|---------------|--------------------|
+| Ingester         | 2           | 4                   | 6              | 12            | 36                 |
+| Distributor      | 2           | 0.5                 | 4              | 8             | 2                  |
+| Index gateway    | 0.5         | 2                   | 4              | 2             | 8                  |
+| Querier          | 1           | 1                   | 10             | 10            | 10                 |
+| Query-frontend   | 1           | 2                   | 2              | 2             | 4                  |
+| Query-scheduler  | 1           | 0.5                 | 2              | 2             | 1                  |
+| Compactor        | 2           | 10                  | 1 (Singleton)  | 2             | 10                 |
 {{< /tab-content >}}
 {{< tab-content name="100TB to 1PB /month (3-30TB/day)" >}}
-| Component        | CPU Request | Memory Request | Base Replicas  |
-|------------------|-------------|----------------|----------------|
-| Ingester         | 2           | 6              | 90             |
-| Distributor      | 2           | 1              | 40             |
-| Index gateway    | 0.5         | 4              | 10             |
-| Querier          | 1.5         | 2              | 100            |
-| Query-frontend   | 1           | 2              | 8              |
-| Query-scheduler  | 1           | 0.5            | 2              |
-| Compactor        | 6           | 20             | 1 (Singleton)  |
+| Component        | CPU Request | Memory Request (Gi) | Base Replicas  | Total CPU Req | Total Mem Req (Gi) |
+|------------------|-------------|---------------------|----------------|---------------|--------------------|
+| Ingester         | 2           | 6                   | 90             | 180           | 540                |
+| Distributor      | 2           | 1                   | 40             | 80            | 40                 |
+| Index gateway    | 0.5         | 4                   | 10             | 5             | 40                 |
+| Querier          | 1.5         | 2                   | 100            | 150           | 200                |
+| Query-frontend   | 1           | 2                   | 8              | 8             | 16                 |
+| Query-scheduler  | 1           | 0.5                 | 2              | 2             | 1                  |
+| Compactor        | 6           | 20                  | 1 (Singleton)  | 6             | 20                 |
 {{< /tab-content >}}
-{{< tab-content name="Less than 100TB/month (3TB/day)" >}}
-| Component        | CPU Request | Memory Request | Base Replicas  |
-|------------------|-------------|----------------|----------------|
-| Ingester         | 2           | 4              | 6              |
-| Distributor      | 2           | 0.5            | 4              |
-| Index gateway    | 0.5         | 2              | 4              |
-| Querier          | 1           | 1              | 10             |
-| Query-frontend   | 1           | 2              | 2              |
-| Query-scheduler  | 1           | 0.5            | 2              |
-| Compactor        | 2           | 10             | 1 (Singleton)  |
+{{< tab-content name="~1PB/month (30TB/day)" >}}
+| Component        | CPU Request | Memory Request (Gi) | Base Replicas  | Total CPU Req | Total Mem Req (Gi) |
+|------------------|-------------|---------------------|----------------|---------------|--------------------|
+| Ingester         | 4           | 8                   | 150            | 600           | 1200               |
+| Distributor      | 2           | 1                   | 100            | 200           | 100                |
+| Index gateway    | 1           | 4                   | 20             | 20            | 80                 |
+| Querier          | 1.5         | 3                   | 250            | 375           | 750                |
+| Query-frontend   | 1           | 4                   | 16             | 16            | 64                 |
+| Query-scheduler  | 2           | 0.5                 | 2              | 4             | 1                  |
+| Compactor        | 6           | 40                  | 1 (Singleton)  | 6             | 40                 |
 {{< /tab-content >}}
 {{< /tabs >}}

From c3d679e473e89e70f79bd7ff7a1cff24896055f6 Mon Sep 17 00:00:00 2001
From: poyzannur
Date: Mon, 20 Jan 2025 16:28:18 +0000
Subject: [PATCH 06/11] improve text

---
 docs/sources/setup/size/_index.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/sources/setup/size/_index.md b/docs/sources/setup/size/_index.md
index 68f2cbfac6b8b..7b4b5d57c5557 100644
--- a/docs/sources/setup/size/_index.md
+++ b/docs/sources/setup/size/_index.md
@@ -20,10 +20,11 @@ weight: 100
 This section is a guide to size base resource needs of a Loki cluster.
 Based on the expected ingestion volume, Loki clusters can be categorised into three tiers. Recommendations below are based on p90 resource utilisations of the relevant components. Each tab represents a different tier.
-Please use this document as a rough guide for base resource needs between these tiers. This is only documented for microservices/distributed mode.
+Please use this document as a rough guide to specify CPU and Memory requests in your deployment. This is only documented for microservices/distributed mode at this time.

-General notes on Query Performance:
-- Query resource needs can greatly vary with usage patterns. The rule of thumb is to run as small and many queriers as possible. Unoptimised queries can easily require 10x of the suggested querier resources below in all tiers. Running horizontal autoscaling will be most cost effective solution to meet the demand.
+Query resource needs can greatly vary with usage patterns and correct configurations. General notes on Query Performance:
+- The rule of thumb is to run as small and many queriers as possible. Unoptimised queries can easily require 10x of the suggested querier resources below in all tiers. Running horizontal autoscaling will be most cost effective solution to meet the demand.
+- Use this [blog post](https://grafana.com/blog/2023/12/28/the-concise-guide-to-loki-how-to-get-the-most-out-of-your-query-performance/) to adopt best practices for optimised query performance.
 - Parallel-querier and related components can be sized the same along with queriers for starters, depending on how much Loki rules are used.
 - Large Loki clusters benefits from disk based caching solution, memcached-extstore. Please see a detailed [blog post](https://grafana.com/blog/2023/08/23/how-we-scaled-grafana-cloud-logs-memcached-cluster-to-50tb-and-improved-reliability/) and read more on [memcached/nvm-caching here](https://memcached.org/blog/nvm-caching/).
 - If you’re running a cluster that handles less than 30TB/day (~1PB/month) ingestion, we do not recommend configuring memcached-extstore. The additional operational complexity does not justify the savings.
From 995f7a4b3209fff6420305395a61d6c3cfe0a95b Mon Sep 17 00:00:00 2001
From: poyzannur
Date: Wed, 22 Jan 2025 11:58:03 +0000
Subject: [PATCH 07/11] fix grammer

---
 docs/sources/setup/size/_index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sources/setup/size/_index.md b/docs/sources/setup/size/_index.md
index 7b4b5d57c5557..0d1d174f8da13 100644
--- a/docs/sources/setup/size/_index.md
+++ b/docs/sources/setup/size/_index.md
@@ -23,7 +23,7 @@ Based on the expected ingestion volume, Loki clusters can be categorised into th
 Please use this document as a rough guide to specify CPU and Memory requests in your deployment. This is only documented for microservices/distributed mode at this time.

 Query resource needs can greatly vary with usage patterns and correct configurations. General notes on Query Performance:
-- The rule of thumb is to run as small and many queriers as possible. Unoptimised queries can easily require 10x of the suggested querier resources below in all tiers. Running horizontal autoscaling will be most cost effective solution to meet the demand.
+- The rule of thumb is to run as small and as many queriers as possible. Unoptimised queries can easily require 10x of the suggested querier resources below in all tiers. Running horizontal autoscaling will be most cost effective solution to meet the demand.
 - Use this [blog post](https://grafana.com/blog/2023/12/28/the-concise-guide-to-loki-how-to-get-the-most-out-of-your-query-performance/) to adopt best practices for optimised query performance.
 - Parallel-querier and related components can be sized the same along with queriers for starters, depending on how much Loki rules are used.
 - Large Loki clusters benefits from disk based caching solution, memcached-extstore. Please see a detailed [blog post](https://grafana.com/blog/2023/08/23/how-we-scaled-grafana-cloud-logs-memcached-cluster-to-50tb-and-improved-reliability/) and read more on [memcached/nvm-caching here](https://memcached.org/blog/nvm-caching/).

From 04192aafcebc9dfd55038df27beb80411d0fff71 Mon Sep 17 00:00:00 2001
From: poyzannur
Date: Wed, 22 Jan 2025 13:44:39 +0000
Subject: [PATCH 08/11] add link

---
 docs/sources/setup/size/_index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sources/setup/size/_index.md b/docs/sources/setup/size/_index.md
index 0d1d174f8da13..808969efdbd59 100644
--- a/docs/sources/setup/size/_index.md
+++ b/docs/sources/setup/size/_index.md
@@ -20,7 +20,7 @@ weight: 100
 This section is a guide to size base resource needs of a Loki cluster.
 Based on the expected ingestion volume, Loki clusters can be categorised into three tiers. Recommendations below are based on p90 resource utilisations of the relevant components. Each tab represents a different tier.
-Please use this document as a rough guide to specify CPU and Memory requests in your deployment. This is only documented for microservices/distributed mode at this time.
+Please use this document as a rough guide to specify CPU and Memory requests in your deployment. This is only documented for [microservices/distributed](https://grafana.com/docs/loki/latest/get-started/deployment-modes/#microservices-mode) mode at this time.

 Query resource needs can greatly vary with usage patterns and correct configurations. General notes on Query Performance:
 - The rule of thumb is to run as small and as many queriers as possible. Unoptimised queries can easily require 10x of the suggested querier resources below in all tiers. Running horizontal autoscaling will be most cost effective solution to meet the demand.

From df9cf45abfeea01ed5f22480c210fb60e0df3aec Mon Sep 17 00:00:00 2001
From: poyzannur
Date: Wed, 22 Jan 2025 13:45:33 +0000
Subject: [PATCH 09/11] remove no list

---
 docs/sources/setup/size/_index.md | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/docs/sources/setup/size/_index.md b/docs/sources/setup/size/_index.md
index 808969efdbd59..0b105d54f06df 100644
--- a/docs/sources/setup/size/_index.md
+++ b/docs/sources/setup/size/_index.md
@@ -1,7 +1,4 @@
 ---
-_build:
-  list: false
-noindex: true
 title: Size the cluster
 menuTitle: Size the cluster
 description: Provides a tool that generates a Helm Chart values.yaml file based on expected ingestion, retention rate, and node type, to help size your Grafana deployment.

From 744a3da7716c26ccaec78a4e3dd847ae5e0cc989 Mon Sep 17 00:00:00 2001
From: J Stickler
Date: Wed, 22 Jan 2025 14:45:00 -0500
Subject: [PATCH 10/11] Apply suggestions from code review

Signed-off-by: J Stickler
---
 docs/sources/setup/size/_index.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/sources/setup/size/_index.md b/docs/sources/setup/size/_index.md
index 0b105d54f06df..8b343b21413ec 100644
--- a/docs/sources/setup/size/_index.md
+++ b/docs/sources/setup/size/_index.md
@@ -22,11 +22,11 @@ Please use this document as a rough guide to specify CPU and Memory requests in
 Query resource needs can greatly vary with usage patterns and correct configurations. General notes on Query Performance:
 - The rule of thumb is to run as small and as many queriers as possible. Unoptimised queries can easily require 10x of the suggested querier resources below in all tiers. Running horizontal autoscaling will be most cost effective solution to meet the demand.
 - Use this [blog post](https://grafana.com/blog/2023/12/28/the-concise-guide-to-loki-how-to-get-the-most-out-of-your-query-performance/) to adopt best practices for optimised query performance.
-- Parallel-querier and related components can be sized the same along with queriers for starters, depending on how much Loki rules are used.
-- Large Loki clusters benefits from disk based caching solution, memcached-extstore. Please see a detailed [blog post](https://grafana.com/blog/2023/08/23/how-we-scaled-grafana-cloud-logs-memcached-cluster-to-50tb-and-improved-reliability/) and read more on [memcached/nvm-caching here](https://memcached.org/blog/nvm-caching/).
+- Parallel-querier and related components can be sized the same along with queriers to start, depending on how much Loki rules are used.
+- Large Loki clusters benefit from a disk based caching solution, memcached-extstore. Please see the detailed [blog post](https://grafana.com/blog/2023/08/23/how-we-scaled-grafana-cloud-logs-memcached-cluster-to-50tb-and-improved-reliability/) and read more about [memcached/nvm-caching here](https://memcached.org/blog/nvm-caching/).
 - If you’re running a cluster that handles less than 30TB/day (~1PB/month) ingestion, we do not recommend configuring memcached-extstore. The additional operational complexity does not justify the savings.

-These are the node types we suggest from various cloud providers. Please see the relevant specs on the provider documents.
+These are the node types we suggest from various cloud providers. Please see the relevant specifications in your provider documentation.
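
The Total CPU Req and Total Mem Req (Gi) columns added in PATCH 05 appear to be the per-replica request multiplied by Base Replicas. A minimal Python sketch, assuming that relationship and using the rows of the "100TB to 1PB /month (3-30TB/day)" tier; the names `MID_TIER` and `cluster_totals` are hypothetical and not part of the patch series.

```python
# Rows of the "100TB to 1PB /month (3-30TB/day)" tier:
# component -> (CPU request in cores, memory request in Gi, base replicas)
MID_TIER = {
    "ingester": (2, 6, 90),
    "distributor": (2, 1, 40),
    "index-gateway": (0.5, 4, 10),
    "querier": (1.5, 2, 100),
    "query-frontend": (1, 2, 8),
    "query-scheduler": (1, 0.5, 2),
    "compactor": (6, 20, 1),  # singleton
}


def cluster_totals(tier):
    """Return per-component (total CPU, total Gi) plus cluster-wide sums.

    Assumes each total is simply request * replicas.
    """
    per_component = {
        name: (cpu * replicas, mem * replicas)
        for name, (cpu, mem, replicas) in tier.items()
    }
    total_cpu = sum(cpu for cpu, _ in per_component.values())
    total_mem = sum(mem for _, mem in per_component.values())
    return per_component, total_cpu, total_mem


if __name__ == "__main__":
    per_component, total_cpu, total_mem = cluster_totals(MID_TIER)
    for name, (cpu, mem) in per_component.items():
        print(f"{name:16} {cpu:>6g} cores {mem:>6g} Gi")
    print(f"{'total':16} {total_cpu:>6g} cores {total_mem:>6g} Gi")
```

These sums cover base requests only. Per the query performance notes, unoptimised queries can require 10x the suggested querier resources, so the querier row is the one most likely to need headroom or horizontal autoscaling.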