forked from letsencrypt/boulder
Add configuration driven Prometheus black box metric exporter
1 parent 1e5d89e, commit 97e393d
Showing 19 changed files with 1,495 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,216 @@ | ||
# boulder-observer | ||
|
||
A modular, configuration-driven approach to black box monitoring with | ||
Prometheus. | ||
|
||
* [boulder-observer](#boulder-observer) | ||
* [Usage](#usage) | ||
* [Options](#options) | ||
* [Starting the boulder-observer | ||
daemon](#starting-the-boulder-observer-daemon) | ||
* [Configuration](#configuration) | ||
* [Root](#root) | ||
* [Schema](#schema) | ||
* [Example](#example) | ||
* [Monitors](#monitors) | ||
* [Schema](#schema-1) | ||
* [Example](#example-1) | ||
* [Probers](#probers) | ||
* [DNS](#dns) | ||
* [Schema](#schema-2) | ||
* [Example](#example-2) | ||
* [HTTP](#http) | ||
* [Schema](#schema-3) | ||
* [Example](#example-3) | ||
* [Metrics](#metrics) | ||
* [obs_monitors](#obs_monitors) | ||
* [obs_observations](#obs_observations) | ||
* [Development](#development) | ||
* [Starting Prometheus locally](#starting-prometheus-locally) | ||
* [Viewing metrics locally](#viewing-metrics-locally) | ||
|
||
## Usage | ||
|
||
### Options | ||
|
||
```shell | ||
$ ./boulder-observer -help | ||
-config string | ||
Path to boulder-observer configuration file (default "config.yml") | ||
``` | ||
|
||
### Starting the boulder-observer daemon | ||
|
||
```shell | ||
$ ./boulder-observer -config test/config-next/observer.yml | ||
I152525 boulder-observer _KzylQI Versions: main=(Unspecified Unspecified) Golang=(go1.16.2) BuildHost=(Unspecified) | ||
I152525 boulder-observer q_D84gk Initializing boulder-observer daemon from config: test/config-next/observer.yml | ||
I152525 boulder-observer 7aq68AQ all monitors passed validation | ||
I152527 boulder-observer yaefiAw kind=[HTTP] success=[true] duration=[0.130097] name=[https://letsencrypt.org-[200]] | ||
I152527 boulder-observer 65CuDAA kind=[HTTP] success=[true] duration=[0.148633] name=[http://letsencrypt.org/foo-[200 404]] | ||
I152530 boulder-observer idi4rwE kind=[DNS] success=[false] duration=[0.000093] name=[[2606:4700:4700::1111]:53-udp-A-google.com-recurse] | ||
I152530 boulder-observer prOnrw8 kind=[DNS] success=[false] duration=[0.000242] name=[[2606:4700:4700::1111]:53-tcp-A-google.com-recurse] | ||
I152530 boulder-observer 6uXugQw kind=[DNS] success=[true] duration=[0.022962] name=[1.1.1.1:53-udp-A-google.com-recurse] | ||
I152530 boulder-observer to7h-wo kind=[DNS] success=[true] duration=[0.029860] name=[owen.ns.cloudflare.com:53-udp-A-letsencrypt.org-no-recurse] | ||
I152530 boulder-observer ovDorAY kind=[DNS] success=[true] duration=[0.033820] name=[owen.ns.cloudflare.com:53-tcp-A-letsencrypt.org-no-recurse] | ||
... | ||
``` | ||
|
||
## Configuration | ||
|
||
Configuration is provided via a YAML file. | ||
|
||
### Root | ||
|
||
#### Schema | ||
|
||
`debugaddr`: The Prometheus scrape port prefixed with a single colon | ||
(e.g. `:8040`). | ||
|
||
`buckets`: List of floats representing Prometheus histogram buckets (e.g. | ||
`[.001, .002, .005, .01, .02, .05, .1, .2, .5, 1, 2, 5, 10]`). | ||
|
||
`syslog`: Map of log levels, see schema below. | ||
|
||
- `stdoutlevel`: Log level for stdout, see legend below. | ||
- `sysloglevel`: Log level for syslog, see legend below. | ||
|
||
`0`: *EMERG* `1`: *ALERT* `2`: *CRIT* `3`: *ERR* `4`: *WARN* `5`: | ||
*NOTICE* `6`: *INFO* `7`: *DEBUG* | ||
|
||
`monitors`: List of monitors, see [monitors](#monitors) for schema. | ||
|
||
#### Example | ||
|
||
```yaml | ||
debugaddr: :8040 | ||
buckets: [.001, .002, .005, .01, .02, .05, .1, .2, .5, 1, 2, 5, 10] | ||
syslog: | ||
  stdoutlevel: 6 | ||
  sysloglevel: 6 | ||
monitors: | ||
  - | ||
    ... | ||
``` | ||
|
||
### Monitors | ||
|
||
#### Schema | ||
|
||
`period`: Interval between probing attempts (e.g. `1s` `1m` `1h`). | ||
|
||
`kind`: Kind of prober to use, see [probers](#probers) for schema. | ||
|
||
`settings`: Map of prober settings, see [probers](#probers) for schema. | ||
|
||
#### Example | ||
|
||
```yaml | ||
monitors: | ||
  - | ||
    period: 5s | ||
    kind: DNS | ||
    settings: | ||
      ... | ||
``` | ||
|
||
### Probers | ||
|
||
#### DNS | ||
|
||
##### Schema | ||
|
||
`protocol`: Protocol to use, options are: `udp` or `tcp`. | ||
|
||
`server`: Hostname, IPv4 address, or IPv6 address surrounded with | ||
brackets + port of the DNS server to send the query to (e.g. | ||
`example.com:53`, `1.1.1.1:53`, or `[2606:4700:4700::1111]:53`). | ||
|
||
`recurse`: Bool indicating if recursive resolution is desired. | ||
|
||
`query_name`: Name to query (e.g. `example.com`). | ||
|
||
`query_type`: Record type to query, options are: `A`, `AAAA`, `TXT`, or | ||
`CAA`. | ||
|
||
##### Example | ||
|
||
```yaml | ||
monitors: | ||
  - | ||
    period: 5s | ||
    kind: DNS | ||
    settings: | ||
      protocol: tcp | ||
      server: "[2606:4700:4700::1111]:53" | ||
      recurse: false | ||
      query_name: letsencrypt.org | ||
      query_type: A | ||
``` | ||
|
||
#### HTTP | ||
|
||
##### Schema | ||
|
||
`url`: Scheme + Hostname to send a request to (e.g. | ||
`https://example.com`). | ||
|
||
`rcodes`: List of expected HTTP response codes. | ||
|
||
##### Example | ||
|
||
```yaml | ||
monitors: | ||
  - | ||
    period: 2s | ||
    kind: HTTP | ||
    settings: | ||
      url: http://letsencrypt.org/FOO | ||
      rcodes: [200, 404] | ||
``` | ||
|
||
## Metrics | ||
|
||
Observer provides the following metrics. | ||
|
||
### obs_monitors | ||
|
||
Count of configured monitors. | ||
|
||
**Labels:** | ||
|
||
`kind`: Kind of Prober the monitor is configured to use. | ||
|
||
`valid`: Bool indicating whether settings provided could be validated | ||
for the `kind` of Prober specified. | ||
|
||
### obs_observations | ||
|
||
**Labels:** | ||
|
||
`name`: Name of the monitor. | ||
|
||
`kind`: Kind of prober the monitor is configured to use. | ||
|
||
`duration`: Duration of the probe in seconds. This is the value observed by the histogram, not a label. | ||
|
||
`success`: Bool indicating whether the result of the probe attempt was | ||
successful. | ||
|
||
**Bucketed response times:** | ||
|
||
This is configurable, see `buckets` under [root/schema](#schema). | ||
|
||
## Development | ||
|
||
### Starting Prometheus locally | ||
|
||
Please note, this assumes you've installed a local Prometheus binary. | ||
|
||
```shell | ||
prometheus --config.file=boulder/test/prometheus/prometheus.yml | ||
``` | ||
|
||
### Viewing metrics locally | ||
|
||
When developing with a local Prometheus instance you can use this link | ||
to view metrics: [link](http://0.0.0.0:9090) |
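The `server` format described in the DNS prober schema (hostname, IPv4 address, or bracketed IPv6 address, each followed by a port) can be checked with the standard library. The following is a minimal sketch and not part of this commit: `validateServer` is a hypothetical helper, and the port range check is an assumption about what a validator might reasonably enforce.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// validateServer is a hypothetical helper (not part of boulder). It checks
// that s has the host:port shape the DNS prober schema describes.
func validateServer(s string) error {
	// SplitHostPort rejects a missing port and unbracketed IPv6 addresses.
	host, port, err := net.SplitHostPort(s)
	if err != nil {
		return fmt.Errorf("invalid server %q: %w", s, err)
	}
	if host == "" {
		return fmt.Errorf("invalid server %q: empty host", s)
	}
	// Assumption: only ports in the range 1-65535 are acceptable.
	n, err := strconv.Atoi(port)
	if err != nil || n < 1 || n > 65535 {
		return fmt.Errorf("invalid server %q: port must be 1-65535", s)
	}
	return nil
}

func main() {
	for _, s := range []string{
		"example.com:53",
		"1.1.1.1:53",
		"[2606:4700:4700::1111]:53",
		"2606:4700:4700::1111:53", // IPv6 without brackets
		"1.1.1.1",                 // missing port
	} {
		fmt.Printf("%-28s valid=%t\n", s, validateServer(s) == nil)
	}
}
```

The brackets are what let the final `:53` be read unambiguously as the port, which is why the schema asks for them around IPv6 addresses.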
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
package main | ||
|
||
import ( | ||
"flag" | ||
"io/ioutil" | ||
|
||
"github.com/letsencrypt/boulder/cmd" | ||
"github.com/letsencrypt/boulder/observer" | ||
"gopkg.in/yaml.v2" | ||
) | ||
|
||
func main() { | ||
configPath := flag.String( | ||
"config", "config.yml", "Path to boulder-observer configuration file") | ||
flag.Parse() | ||
|
||
configYAML, err := ioutil.ReadFile(*configPath) | ||
cmd.FailOnError(err, "failed to read config file") | ||
|
||
// Parse the YAML config file. | ||
var config observer.ObsConf | ||
err = yaml.Unmarshal(configYAML, &config) | ||
if err != nil { | ||
cmd.FailOnError(err, "failed to parse YAML config") | ||
} | ||
|
||
// Make an `Observer` object. | ||
observer, err := config.MakeObserver() | ||
if err != nil { | ||
cmd.FailOnError(err, "config failed validation") | ||
} | ||
|
||
// Start the `Observer` daemon. | ||
observer.Start() | ||
} |
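The `-config` handling above reads from the process-wide flag set, which is awkward to exercise in a test. A possible variation, sketched here under the assumption that a testable helper is wanted, parses an explicit argument list with `flag.NewFlagSet`. `configPathFrom` is a hypothetical name and does not appear in this commit.

```go
package main

import (
	"flag"
	"fmt"
)

// configPathFrom is a hypothetical helper (not part of boulder). It resolves
// the -config flag from an explicit argument list instead of os.Args.
func configPathFrom(args []string) (string, error) {
	fs := flag.NewFlagSet("boulder-observer", flag.ContinueOnError)
	path := fs.String(
		"config", "config.yml", "Path to boulder-observer configuration file")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *path, nil
}

func main() {
	// With no arguments the default from the README applies.
	def, _ := configPathFrom(nil)
	fmt.Println(def)

	// An explicit path overrides the default.
	custom, _ := configPathFrom([]string{"-config", "test/config-next/observer.yml"})
	fmt.Println(custom)
}
```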
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
package observer | ||
|
||
import ( | ||
"errors" | ||
"strings" | ||
"time" | ||
|
||
"github.com/letsencrypt/boulder/cmd" | ||
"github.com/letsencrypt/boulder/observer/probers" | ||
"gopkg.in/yaml.v2" | ||
) | ||
|
||
// MonConf is exported to receive YAML configuration in `ObsConf`. | ||
type MonConf struct { | ||
Period cmd.ConfigDuration `yaml:"period"` | ||
Kind string `yaml:"kind"` | ||
Settings probers.Settings `yaml:"settings"` | ||
} | ||
|
||
// validatePeriod ensures the received `Period` field is at least 1µs. | ||
func (c *MonConf) validatePeriod() error { | ||
if c.Period.Duration < 1*time.Microsecond { | ||
return errors.New("period must be at least 1µs") | ||
} | ||
return nil | ||
} | ||
|
||
// unmarshalConfigurer constructs a `Configurer` by marshaling the | ||
// value of the `Settings` field back to bytes, then passing it to the | ||
// `UnmarshalSettings` method of the `Configurer` type specified by the | ||
// `Kind` field. | ||
func (c MonConf) unmarshalConfigurer() (probers.Configurer, error) { | ||
	kind := strings.Trim(strings.ToLower(c.Kind), " ") | ||
	configurer, err := probers.GetConfigurer(kind) | ||
	if err != nil { | ||
		return nil, err | ||
	} | ||
	// Surface marshaling failures instead of silently discarding them. | ||
	settings, err := yaml.Marshal(c.Settings) | ||
	if err != nil { | ||
		return nil, err | ||
	} | ||
	configurer, err = configurer.UnmarshalSettings(settings) | ||
	if err != nil { | ||
		return nil, err | ||
	} | ||
	return configurer, nil | ||
} | ||
|
||
// makeMonitor constructs a `monitor` object from the contents of the | ||
// bound `MonConf`. If the `MonConf` cannot be validated, an error | ||
// appropriate for end-user consumption is returned instead. | ||
func (c MonConf) makeMonitor() (*monitor, error) { | ||
err := c.validatePeriod() | ||
if err != nil { | ||
return nil, err | ||
} | ||
probeConf, err := c.unmarshalConfigurer() | ||
if err != nil { | ||
return nil, err | ||
} | ||
prober, err := probeConf.MakeProber() | ||
if err != nil { | ||
return nil, err | ||
} | ||
return &monitor{c.Period.Duration, prober}, nil | ||
} |
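The marshal-then-unmarshal step in `unmarshalConfigurer` can be hard to picture from the diff alone. The sketch below shows the same pattern with the standard library only. It swaps in `encoding/json` for YAML so it has no external dependencies, and every type and function in it (`configurer`, `httpConf`, `registry`, `decode`) is a hypothetical stand-in, not the API of the `probers` package.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// configurer plays the role of `probers.Configurer` in this sketch only.
type configurer interface {
	unmarshalSettings(raw []byte) (configurer, error)
}

// httpConf is a hypothetical typed settings struct for an HTTP prober.
type httpConf struct {
	URL    string `json:"url"`
	RCodes []int  `json:"rcodes"`
}

// unmarshalSettings decodes raw bytes into a fresh, typed httpConf.
func (httpConf) unmarshalSettings(raw []byte) (configurer, error) {
	var c httpConf
	if err := json.Unmarshal(raw, &c); err != nil {
		return nil, err
	}
	return c, nil
}

// registry maps a normalized kind to an empty configurer of that kind.
var registry = map[string]configurer{"http": httpConf{}}

// decode looks up the configurer for kind, re-encodes the untyped settings
// to bytes, and lets the kind-specific type decode them.
func decode(kind string, settings map[string]interface{}) (configurer, error) {
	c, ok := registry[strings.ToLower(strings.TrimSpace(kind))]
	if !ok {
		return nil, fmt.Errorf("unknown kind %q", kind)
	}
	raw, err := json.Marshal(settings)
	if err != nil {
		return nil, err
	}
	return c.unmarshalSettings(raw)
}

func main() {
	c, err := decode(" HTTP ", map[string]interface{}{
		"url":    "https://example.com",
		"rcodes": []int{200},
	})
	fmt.Printf("%+v %v\n", c, err)
}
```

The round trip costs an extra encode, but it keeps the top-level config struct ignorant of each prober's settings, so a new kind only needs a registry entry.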
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
package observer | ||
|
||
import ( | ||
"testing" | ||
"time" | ||
|
||
"github.com/letsencrypt/boulder/cmd" | ||
) | ||
|
||
func TestMonConf_validatePeriod(t *testing.T) { | ||
type fields struct { | ||
Period cmd.ConfigDuration | ||
} | ||
tests := []struct { | ||
name string | ||
fields fields | ||
wantErr bool | ||
}{ | ||
{"valid", fields{cmd.ConfigDuration{Duration: 1 * time.Microsecond}}, false}, | ||
{"1 nanosecond", fields{cmd.ConfigDuration{Duration: 1 * time.Nanosecond}}, true}, | ||
{"none supplied", fields{cmd.ConfigDuration{}}, true}, | ||
} | ||
for _, tt := range tests { | ||
t.Run(tt.name, func(t *testing.T) { | ||
c := &MonConf{ | ||
Period: tt.fields.Period, | ||
} | ||
if err := c.validatePeriod(); (err != nil) != tt.wantErr { | ||
t.Errorf("MonConf.validatePeriod() error = %v, wantErr %v", err, tt.wantErr) | ||
} | ||
}) | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
package observer | ||
|
||
import ( | ||
"strconv" | ||
"time" | ||
|
||
blog "github.com/letsencrypt/boulder/log" | ||
"github.com/letsencrypt/boulder/observer/probers" | ||
) | ||
|
||
type monitor struct { | ||
period time.Duration | ||
prober probers.Prober | ||
} | ||
|
||
// start spins off a goroutine that calls the `Prober` once per `m.period`, | ||
// giving each probe a timeout of half `m.period`. | ||
func (m monitor) start(logger blog.Logger) { | ||
	ticker := time.NewTicker(m.period) | ||
	timeout := m.period / 2 | ||
	go func() { | ||
		for range ticker.C { | ||
			// Attempt to probe the configured target. | ||
			success, dur := m.prober.Probe(timeout) | ||
|
||
			// Produce metrics to be scraped by Prometheus. | ||
			histObservations.WithLabelValues( | ||
				m.prober.Name(), m.prober.Kind(), strconv.FormatBool(success), | ||
			).Observe(dur.Seconds()) | ||
|
||
			// Log the outcome of the probe attempt. | ||
			logger.Infof( | ||
				"kind=[%s] success=[%v] duration=[%f] name=[%s]", | ||
				m.prober.Kind(), success, dur.Seconds(), m.prober.Name()) | ||
		} | ||
	}() | ||
} |