Currently, the observability infrastructure is not started during the TiDB boot process, which makes it hard to investigate any issue that occurs before the server starts. I think the following are especially important and helpful for debugging:
/debug/pprof, especially /debug/pprof/profile and /debug/pprof/goroutine. If TiDB is blocked in the boot stage, getting the backtrace of all goroutines would help us understand where it is blocked.
/metrics. Without it, Prometheus cannot scrape the metrics of a booting TiDB server.
SIGUSR1 signal handler. After the server has started, TiDB handles SIGUSR1 by printing the backtrace of all goroutines. Before this handler is installed, the only signals that print the backtrace, such as SIGQUIT, also kill the process.
Current behavior
All of them are started only after storage, dom, server, etc. have been created.
The signal handler is set up in signal.SetupSignalHandler. The status server is set up in svr.Run(dom).
Change
To overcome this inconvenience, I propose to make the following changes:
Create a temporary status server that exposes /metrics and /debug/pprof before svr.Run(dom), and stop it before the new, fully functional status server is created.
Set up the SIGUSR1 signal handler earlier.