transparency in node health/status information and state changes #112
Comments
Thanks for taking the time to look into this!
I like your idea of storing health history on the node itself. Exposing more health information in the API is something we thought of as well (#66) but have not yet gotten around to implementing.
@rodecker I'm looking at adding an API method and route for accessing the latest health report from a node. A few questions on direction:
@FliesLikeABrick I agree `/1.0/nodes/[node id]/health_report` (or `/health`) would be more intuitive. But no harm in adding both.
While looking into how node liveness is determined (API's report of alive_ipv4 and alive_ipv6), I found myself wanting to be able to estimate the age of the health data for a system and understand if the current API response is reflective of the system health, or if a change is likely pending in the next 24 hours (next ring-admin run). One or more of the following would be helpful:
`health` table? Speaking of, can someone provide the schema for the `health` table or add it to the SCHEMA in ring-admin?
With a bit of support, I can begin work on a PR for one or both of the items listed above, which could then enable research into some other contributions.
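To illustrate the kind of check this would enable, here is a minimal sketch. It assumes the API exposed a timestamp for the last health check; the `last_checked` value and both function names are hypothetical and not part of the current API. The 24-hour interval is taken from the ring-admin run cadence mentioned above.

```python
from datetime import datetime, timedelta, timezone

# Assumption: ring-admin re-evaluates node state roughly every 24 hours.
RUN_INTERVAL = timedelta(hours=24)


def health_data_age(last_checked, now=None):
    """Return how old a health record is.

    `last_checked` is a hypothetical timezone-aware timestamp that the
    API would need to expose; it does not exist today.
    """
    now = now or datetime.now(timezone.utc)
    return now - last_checked


def is_stale(last_checked, now=None, interval=RUN_INTERVAL):
    """True if the record is older than one run interval, meaning the
    reported state may not yet reflect the latest health data."""
    return health_data_age(last_checked, now) > interval
```

A client could call `is_stale()` on each node's timestamp to decide whether the reported `alive_ipv4` / `alive_ipv6` values should be trusted or treated as pending an update.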
Other questions:
`alive_v4` and `alive_v6` in the `machines` table is contained in `ansible_process()`. What cron job or other trigger results in `ansible_process()` being called? The closest cronjob I see is for `purge machines`; however, that appears to be a cleanup rather than a call to `ansible_process()`.
`ansible_process()` is run), that ring-admin will never be able to catch up on subsequent changes in state unless enough machines recover