metric: add database and table name to handle_query_duration #38261

noucas · 2022-09-29T14:14:31Z

What problem does this PR solve?

Issue Number: close #37892

Problem Summary:

What is changed and how it works?

Add the database name and table name as additional labels on the existing metric tidb_server_handle_query_duration_seconds.

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No code

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

ti-chi-bot · 2022-09-29T14:14:33Z

[REVIEW NOTIFICATION]

This pull request has been approved by:

dveeden

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

sre-bot · 2022-09-29T14:14:40Z

All committers have signed the CLA.

ti-chi-bot · 2022-09-29T14:14:41Z

Welcome @noucas!

It looks like this is your first PR to pingcap/tidb 🎉.

I'm the bot to help you request reviewers, add labels and more, See available commands.

We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to pingcap/tidb. 😃

dveeden · 2022-09-30T08:45:27Z

/cc @Defined2014 @dveeden

dveeden · 2022-09-30T08:45:49Z

/check-issue-triage-complete

dveeden · 2022-09-30T08:47:18Z

@noucas Could you sign the CLA? Could you also give some details (output, screenshots, etc.) about the testing that you did?

dveeden · 2022-09-30T09:09:06Z

dveeden · 2022-09-30T09:13:59Z

Would this impact any of the Grafana dashboards? Would it make sense to update the Grafana dashboards after this has been merged?

dveeden · 2022-09-30T09:14:50Z

Would the increase in load/data be an issue for Prometheus?

dveeden · 2022-09-30T10:09:40Z

server/conn.go

@@ -133,20 +133,6 @@ var (
 		mysql.ComSetOption:        metrics.QueryTotalCounter.WithLabelValues("SetOption", "Error"),
 	}

-	queryDurationHistogramUse      = metrics.QueryDurationHistogram.WithLabelValues("Use")


Is the removal of these lines intentional?

lysu · 2022-09-30

Hi, I haven't been active in this repo for a while, so I may be missing some newer information about TiDB, but I can give some older background on this.

Prometheus's WithLabelValues is a slow operation: its cost shows up in Go pprof when running a lightweight benchmark like "sysbench oltp read only". That is why some label values are pre-resolved into global variables here.

This PR wants to make the table name a label, so the values can no longer be pre-defined as before. Having the table name as a label is useful for maintenance, but we should expect some performance degradation in benchmarks like sysbench.

Finally, take care with clusters that have a huge number of tables: storing a huge number of label values in Prometheus is not good practice. (This can be ignored if TiDB switches to some other, better time-series storage.)

CAUTION: Remember that every unique combination of key-value label pairs represents a new time series, which can dramatically increase the amount of data stored. Do not use labels to store dimensions with high cardinalities (many different label values), such as user IDs, email addresses, or other unbounded sets of values.

from https://prometheus.io/docs/practices/naming/#labels
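To illustrate the pre-resolution point above, here is a minimal sketch that uses only the Go standard library. `labelledVec` and `series` are hypothetical stand-ins that mimic the lookup step. They are not the Prometheus client API, and this is not code from the PR. The sketch only shows why a handle resolved once at start-up avoids per-query work, and why that stops being possible once the label values (db, table) are only known at query time.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// series is a stand-in for one labelled time series (count and sum only).
type series struct {
	mu    sync.Mutex
	count uint64
	sum   float64
}

// Observe records one duration in seconds.
func (s *series) Observe(seconds float64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.count++
	s.sum += seconds
}

// labelledVec is a simplified stand-in for a metric vector. It is NOT the
// Prometheus client API; it only mimics the per-call lookup step.
type labelledVec struct {
	mu sync.Mutex
	m  map[string]*series
}

func newLabelledVec() *labelledVec {
	return &labelledVec{m: make(map[string]*series)}
}

// WithLabelValues joins the label values into a key and resolves the series
// under a lock, creating it on first use. This is the work paid per call.
func (v *labelledVec) WithLabelValues(values ...string) *series {
	key := strings.Join(values, "\x00")
	v.mu.Lock()
	defer v.mu.Unlock()
	s, ok := v.m[key]
	if !ok {
		s = &series{}
		v.m[key] = s
	}
	return s
}

var (
	byType  = newLabelledVec() // labels: sqlType
	byTable = newLabelledVec() // labels: sqlType, db, table

	// Resolved once at start-up: possible because sqlType comes from a
	// small, fixed set of values.
	durationUse = byType.WithLabelValues("Use")
)

// observeUse records through the cached handle: no lookup per query.
func observeUse(seconds float64) { durationUse.Observe(seconds) }

// observeTable must resolve the series on every query, because the db and
// table names are only known at query time.
func observeTable(sqlType, db, table string, seconds float64) {
	byTable.WithLabelValues(sqlType, db, table).Observe(seconds)
}

func main() {
	observeUse(0.01)
	observeTable("Select", "shop", "orders", 0.02)
	observeTable("Select", "shop", "users", 0.03)
	fmt.Println("fixed series:", len(byType.m), "dynamic series:", len(byTable.m))
}
```

In this sketch every distinct (sqlType, db, table) combination also creates a new map entry, which is the same growth the cardinality caution describes.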

noucas · 2022-10-04

@dveeden Yes, it is intentional. They were removed because they are hard-coded with a single label, sqlType. That works because sqlType has only a limited set of values. Once we add the db and table labels, we can no longer hard-code the label values; they have to be specified dynamically.

noucas · 2022-10-04

@lysu Thanks for your comment; now I understand why there is so much hard-coded setup like this.

Would it be a good idea to define a config flag to enable or disable the metric? Users could then choose, as they can with the record-db-qps config.

If we cannot add the table label because of the high-cardinality problem, is it okay to still add the db label?

dveeden · 2022-10-05

Having many tables can happen with something like "<application...>" tables, where there is one database but each customer has its own set of tables with a certain prefix. This could be a WordPress hosting solution or something similar. The same can happen if a database is created for each customer instead of using a table prefix. With, say, 5 tables per customer and 2000 customers, you may end up with 5×2000=10000 tables or 2000 databases. So doing this per database would improve the situation.
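The arithmetic above can be turned into a rough upper bound on time series. This is a sketch under stated assumptions: a Prometheus histogram exposes one series per configured bucket, plus the implicit +Inf bucket, plus _sum and _count for every label combination. The numbers of sqlType values and buckets used below are placeholders chosen for illustration, not TiDB's actual values. Series only come into existence once a combination is observed, so this is a worst case.

```go
package main

import "fmt"

// seriesPerCombination returns how many time series one label combination
// of a histogram exposes: the configured buckets, the implicit +Inf bucket,
// and the _sum and _count series.
func seriesPerCombination(buckets int) int {
	return buckets + 3
}

// worstCaseSeries is an upper bound on the series count when every sqlType
// has been observed with every value of the added label.
func worstCaseSeries(sqlTypes, labelValues, buckets int) int {
	return sqlTypes * labelValues * seriesPerCombination(buckets)
}

func main() {
	const (
		customers         = 2000 // from the example in the comment above
		tablesPerCustomer = 5    // from the example in the comment above
		sqlTypes          = 10   // assumption, for illustration only
		buckets           = 20   // assumption, for illustration only
	)

	tables := customers * tablesPerCustomer

	// One table label value per table.
	fmt.Println("tables:", tables)
	fmt.Println("worst case with a table label:", worstCaseSeries(sqlTypes, tables, buckets))

	// One db label value per customer in the database-per-customer layout.
	fmt.Println("worst case with a db label:", worstCaseSeries(sqlTypes, customers, buckets))
}
```

In the table-prefix layout (one database, many tables) a db label would add only a single value, which is why labelling per database is the gentler option in both layouts.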

jackysp · 2022-11-03

Adding a configuration for it, like record-db-qps, looks good to me.

noucas · 2022-11-30

@lysu Is it okay if we define a new config flag such that, when it is turned off, the handle_query_duration_seconds metric does not appear, and when it is turned on, the metric is registered with 3 labels {sqlType, dbName, tableName}?

jackysp · 2023-01-05

It is better to keep the original metrics as they were, because many Grafana scripts depend on them. It is better to add new metrics for the new feature and have the new config flag control only the new metrics.
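The suggestion above could be sketched roughly as follows, using only the Go standard library. The flag name `RecordTableQueryDuration` and the `recorder` type are hypothetical, and plain maps stand in for the real metrics (counting observations rather than recording durations). The point is only the shape: the original per-sqlType metric is always recorded, and the new per-table metric is recorded only when the flag is on.

```go
package main

import "fmt"

// Config holds a switch in the spirit of record-db-qps.
// RecordTableQueryDuration is a hypothetical name, not a real TiDB option.
type Config struct {
	RecordTableQueryDuration bool
}

// recorder uses plain maps as stand-ins for the two metrics.
type recorder struct {
	cfg       Config
	bySQLType map[string]int    // original metric: label sqlType only
	byTable   map[[3]string]int // new metric: labels sqlType, db, table
}

func newRecorder(cfg Config) *recorder {
	return &recorder{
		cfg:       cfg,
		bySQLType: make(map[string]int),
		byTable:   make(map[[3]string]int),
	}
}

// observe always updates the original metric, so existing dashboards keep
// working, and updates the new metric only when the flag is enabled.
func (r *recorder) observe(sqlType, db, table string) {
	r.bySQLType[sqlType]++
	if r.cfg.RecordTableQueryDuration {
		r.byTable[[3]string{sqlType, db, table}]++
	}
}

func main() {
	off := newRecorder(Config{})
	off.observe("Select", "shop", "orders")
	fmt.Println("flag off:", len(off.bySQLType), "original series,", len(off.byTable), "new series")

	on := newRecorder(Config{RecordTableQueryDuration: true})
	on.observe("Select", "shop", "orders")
	on.observe("Select", "shop", "users")
	fmt.Println("flag on:", len(on.bySQLType), "original series,", len(on.byTable), "new series")
}
```

With the flag off, nothing is added beyond what the original metric already costs, so the high-cardinality risk is opt-in.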

dveeden · 2022-09-30T10:11:18Z

cc @lysu

noucas · 2022-10-04T15:00:55Z

@bb7133 PTAL

hawkingrei · 2022-10-17T10:57:44Z

/run-mysql-test

hawkingrei · 2022-10-17T11:10:29Z

/run-build

hawkingrei · 2022-10-17T11:36:59Z

@dveeden It is all green.

bb7133 · 2022-11-01T15:34:25Z

PTAL @jackysp


ti-chi-bot · 2023-11-11T11:24:46Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dveeden

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

dveeden · 2023-11-12T10:19:05Z

/ok-to-test

dveeden · 2023-11-12T10:21:01Z

@noucas Looks like you tried to update/rebase this PR, but it seems like that didn't go as planned?

dveeden · 2023-11-12T10:23:09Z

/hold

noucas · 2023-11-14T14:44:44Z

A PR that adds a new label to the metric tidb_server_handle_query_duration_seconds has already been merged, so we can close this PR.
@dveeden Thanks for your kind support, from my reported issue through to this PR. I am happy that my work could contribute a little to the TiDB community, and I will keep contributing in the future.

ti-chi-bot added do-not-merge/invalid-title release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/needs-triage-completed labels Sep 29, 2022

ti-chi-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Sep 29, 2022

noucas changed the title ~~Add database name and table name to handle_query_duration metric~~ metric: ddd database name and table name to handle_query_duration Sep 29, 2022

ti-chi-bot removed the do-not-merge/invalid-title label Sep 29, 2022

noucas changed the title ~~metric: ddd database name and table name to handle_query_duration~~ metric: add database name and table name to handle_query_duration Sep 29, 2022

noucas changed the title ~~metric: add database name and table name to handle_query_duration~~ metric: add database and table name to handle_query_duration Sep 29, 2022

ti-chi-bot requested review from Defined2014 and dveeden September 30, 2022 08:45

ti-chi-bot removed the do-not-merge/needs-triage-completed label Sep 30, 2022

dveeden reviewed Sep 30, 2022


lysu requested a review from bb7133 September 30, 2022 13:16

noucas requested a review from a team as a code owner December 12, 2022 10:07

ti-chi-bot removed the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Dec 12, 2022

lance6716 and others added 11 commits November 11, 2023 17:48

util: add recover wrapper with error group (#48479)

a9ba56d

close #48480

metrics: add server status count (#48292)

d71aa41

close #48291

tests: update mysql-tester commit (#48462)

8755dce

*: upgrade bazel_gazelle (#48503)

3168960

build(deps): bump golang.org/x/time from 0.3.0 to 0.4.0 (#48512)

74f985b

build(deps): bump golang.org/x/mod from 0.13.0 to 0.14.0 (#48509)

2559df3

build(deps): bump github.com/prometheus/common from 0.44.0 to 0.45.0 (#48510)

2734795

build(deps): bump golang.org/x/term from 0.13.0 to 0.14.0 (#48508)

2b44d45

build(deps): bump github.com/prometheus/procfs from 0.11.1 to 0.12.0 (#48511)

65e4ede

handle: use logutil to unify the log category (#48520)

c2ed5f4

ebs br: control the snapshots batch size for fsr enable/disable (#48506)

1dfd2fa

close #48505

ti-chi-bot bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. component/dumpling This is related to Dumpling of TiDB. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 11, 2023

Merge branch 'master' into add-database-and-table-name-handle_query_duration_seconds-metric

3cfc3a3

ti-chi-bot bot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. approved and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Nov 11, 2023

noucas added 2 commits November 11, 2023 18:23

Rollback

a4416aa

Remove LblTable

42524f4

ti-chi-bot bot removed the approved label Nov 11, 2023

ti-chi-bot bot added the ok-to-test Indicates a PR is ready to be tested. label Nov 12, 2023

ti-chi-bot bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 12, 2023

dveeden closed this Nov 15, 2023

noucas deleted the add-database-and-table-name-handle_query_duration_seconds-metric branch November 16, 2023 08:12

dveeden Sep 30, 2022

lysu Sep 30, 2022

noucas Oct 4, 2022

noucas Oct 4, 2022

dveeden Oct 5, 2022

jackysp Nov 3, 2022

noucas Nov 30, 2022

jackysp Jan 5, 2023
