
Need readiness check endpoint for PD #5658

Open
hanlins opened this issue Nov 1, 2022 · 6 comments · May be fixed by #5685
Labels
type/feature-request Categorizes issue or PR as related to a new feature.

Comments


hanlins commented Nov 1, 2022

Feature Request

Describe your feature request related problem

When doing a rolling update of a PD cluster, we need to bring down a PD instance, bring up a PD instance with the updated config, and repeat this for all PD instances one at a time. Typically we use tidb-operator to automate this, and a PD instance is considered "Running" as long as the pod has been up for some duration. However, the PD instance could still be catching up with the leader and receiving etcd raft logs, and therefore unable to serve requests at that moment. Say we have a PD deployment with 3 replicas, and the first two pods have been updated but are not yet ready to serve new requests: at that point PD is unavailable, since the remaining PD alone cannot form a write quorum (the other two are still syncing etcd data).

Describe the feature you'd like

It would be helpful if a PD instance could expose (maybe via a RESTful endpoint) whether its internal data is synced with the leader and it is ready to serve new requests.

Describe alternatives you've considered

An alternative approach in the tidb-operator scenario is to add an init container to PD's pod spec that sleeps for a certain amount of time. Until the sleep finishes, the Pod stays stuck in the Init state, so the rolling update process waits until it completes. Hopefully, by that time, the etcd data for that PD instance has been synced.

Teachability, Documentation, Adoption, Migration Strategy

The endpoint would be something new, and since no existing system depends on it, we don't need to worry about migration.

hanlins added the type/feature-request label Nov 1, 2022

hanlins commented Nov 7, 2022

Proposing to add a RESTful endpoint, similar to the existing health endpoint, that exposes the instance's etcd readiness information:

healthPrefix = "pd/api/v1/health"

The logic for the readiness check would be to check whether the current PD's corresponding etcd member is still a learner. If an etcd member has caught up with the leader, it should have been promoted to a voting member, i.e. a follower [1]. The IS LEARNER field is exposed in the etcd cluster status response [2], so I suppose this is feasible.
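
A rough sketch of what such a readiness handler could look like, assuming the handler can reach the embedded etcd through a clientv3.Client and knows its own member ID (the function and parameter names here are illustrative assumptions, not existing PD code):

import (
    "context"
    "net/http"
    "time"

    clientv3 "go.etcd.io/etcd/client/v3"
)

// readyHandler returns 200 only when the local etcd member is a voting
// member (i.e. no longer a learner); otherwise it returns 503 so a probe
// would treat this instance as not ready.
func readyHandler(cli *clientv3.Client, localMemberID uint64) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        ctx, cancel := context.WithTimeout(r.Context(), 3*time.Second)
        defer cancel()
        resp, err := cli.MemberList(ctx)
        if err != nil {
            http.Error(w, err.Error(), http.StatusServiceUnavailable)
            return
        }
        for _, m := range resp.Members {
            if m.ID != localMemberID {
                continue
            }
            if m.IsLearner {
                http.Error(w, "etcd member is still a learner", http.StatusServiceUnavailable)
                return
            }
            w.WriteHeader(http.StatusOK)
            return
        }
        http.Error(w, "local member not found in member list", http.StatusServiceUnavailable)
    }
}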

cc @nolouch

Footnotes

  1. https://etcd.io/docs/v3.3/learning/learner/

  2. https://etcd.io/docs/v3.5/tutorials/how-to-check-cluster-status/

hanlins linked a pull request (#5685) Nov 8, 2022 that will close this issue

rleungx commented Nov 8, 2022

IMO, when we do a rolling update, there could also be a log lag between the followers and the leader, which cannot be solved by only checking whether the member is a learner.


hanlins commented Nov 8, 2022

> IMO, when we do a rolling update, there could also be a log lag between the followers and the leader, which cannot be solved by only checking whether the member is a learner.

Sure, thanks for pointing that out! Any suggestions on what we should check?


rleungx commented Nov 8, 2022

Can we just use the health API?


hanlins commented Nov 8, 2022

> Can we just use the health API?

I was checking CheckHealth; it seems to check all members rather than whether the current PD instance is ready:

pd/server/cluster/cluster.go, lines 2407 to 2432 at 91f1664:

// CheckHealth checks if members are healthy.
func CheckHealth(client *http.Client, members []*pdpb.Member) map[uint64]*pdpb.Member {
    healthMembers := make(map[uint64]*pdpb.Member)
    for _, member := range members {
        for _, cURL := range member.ClientUrls {
            ctx, cancel := context.WithTimeout(context.Background(), clientTimeout)
            req, err := http.NewRequestWithContext(ctx, http.MethodGet, fmt.Sprintf("%s%s", cURL, healthURL), nil)
            if err != nil {
                log.Error("failed to new request", errs.ZapError(errs.ErrNewHTTPRequest, err))
                cancel()
                continue
            }
            resp, err := client.Do(req)
            if resp != nil {
                resp.Body.Close()
            }
            cancel()
            if err == nil && resp.StatusCode == http.StatusOK {
                healthMembers[member.GetMemberId()] = member
                break
            }
        }
    }
    return healthMembers
}

I think in a k8s rolling upgrade, we would be interested in whether the current PD instance has been keeping up with the leader. What about also checking the applied index of the current PD against the leader's committed index?
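
For illustration, a check along those lines could compare the two indexes through etcd's Status API (the endpoint arguments and the maxLag threshold are assumptions for the sketch, not something PD exposes today):

import (
    "context"

    clientv3 "go.etcd.io/etcd/client/v3"
)

// isCaughtUp compares the local member's applied raft index with the
// committed index reported by the leader, tolerating a small lag.
func isCaughtUp(ctx context.Context, cli *clientv3.Client, localEndpoint, leaderEndpoint string, maxLag uint64) (bool, error) {
    local, err := cli.Status(ctx, localEndpoint)
    if err != nil {
        return false, err
    }
    leader, err := cli.Status(ctx, leaderEndpoint)
    if err != nil {
        return false, err
    }
    // A learner is by definition not yet a fully caught-up voting member.
    if local.IsLearner {
        return false, nil
    }
    if local.RaftAppliedIndex >= leader.RaftIndex {
        return true, nil
    }
    return leader.RaftIndex-local.RaftAppliedIndex <= maxLag, nil
}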


hanlins commented Nov 9, 2022

Also, the health endpoint seems to return a 200 status all the time:

h.rd.JSON(w, http.StatusOK, healths)

In k8s we typically need the endpoint to return a status code of 400 or greater when a status check probe fails [1]; quoting the docs:

> Any code greater than or equal to 200 and less than 400 indicates success. Any other code indicates failure.
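
So if the health endpoint (or a new readiness endpoint) were to back a Kubernetes probe, it would need to fail closed. A minimal sketch, using assumed names rather than PD's actual handler wiring:

import (
    "encoding/json"
    "net/http"

    "github.com/pingcap/kvproto/pkg/pdpb"
)

// writeHealthStatus responds with 503 (>= 400, so kubelet marks the probe
// as failed) when any member failed the health check, and 200 otherwise.
func writeHealthStatus(w http.ResponseWriter, members []*pdpb.Member, healthy map[uint64]*pdpb.Member) {
    code := http.StatusOK
    if len(healthy) != len(members) {
        code = http.StatusServiceUnavailable
    }
    w.Header().Set("Content-Type", "application/json")
    w.WriteHeader(code)
    _ = json.NewEncoder(w).Encode(healthy)
}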

Footnotes

  1. https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-liveness-http-request
