Correctly remove prom metrics #86

Closed
wants to merge 27 commits into from
09b7382
Add config to gitignore
SoulKyu Jul 3, 2024
6e53828
feat(weight-annotation): Annotation to add load-balancing weight
Lowaiz Sep 12, 2022
21a96c7
Set tls1.2 as minimum downstream proto
Aluxima Jun 20, 2022
fcc0193
feat(split-time): splits timeout annotation into 3 different ones
Lowaiz Sep 12, 2022
bc7f82a
feat(split-timeout): Docs
Lowaiz Sep 13, 2022
71a5eb3
feat(split-timeout): Default Timeouts flags and config
Lowaiz Sep 13, 2022
386e28a
feat(split-timeout): fix comment and default TO for cluster
Lowaiz Sep 15, 2022
c087705
Update README.md
Lowaiz Sep 19, 2022
1eff785
feat(split-timeout): Fix tests
Lowaiz Sep 20, 2022
98ea1b7
Use http2 for upstream proxy, add paramters
Aluxima Nov 17, 2022
40155dc
Add support for envoy 1.24
Aluxima Nov 23, 2022
3612b11
Raise default circuit breaker limits
Aluxima Dec 6, 2022
ec3c605
Add configurable listener ALPN protocols to enable downstream http2
Aluxima Dec 16, 2022
e1381f3
Allow matching <host>:*
Aluxima Jan 30, 2023
19a4967
Adds multiple envoy listener ipv4 adresses
MathildeLeroi Apr 5, 2023
13f2f78
replace deprecated ioutils with os.
SoulKyu May 13, 2024
94798ec
Correct deduplication and healthcheck for wildcard
SoulKyu May 13, 2024
e2efcb0
Add annotation on README
SoulKyu May 21, 2024
0f050b0
Ignoring launch.json
SoulKyu May 21, 2024
c63dc04
Add a validation on the subdomain to be sure healthcheck is no't on a…
SoulKyu May 21, 2024
c4efbea
Add yggdrasil.uswitch.com/upstream-http-version annotation
Aluxima Jun 12, 2023
821ce3a
remove healthcheck for bad configure ingress with wildcard (#8)
SoulKyu Jun 19, 2024
e7d67d6
Feat/custom log file (#9)
SoulKyu Jun 19, 2024
e74820f
test: correct address as a list
SoulKyu Jul 4, 2024
e26ab46
Fix tests
Aluxima Jul 4, 2024
65aa824
feat(mode-maintenance): handle mode maintenance for clusters
SoulKyu Jul 29, 2024
8e05749
fix(metrics): be aware when an ingress is deleted and remove it corre…
SoulKyu Sep 2, 2024
4 changes: 4 additions & 0 deletions .gitignore
@@ -1,3 +1,7 @@
bin/
command
testing
ca
config
envoy
launch.json
62 changes: 47 additions & 15 deletions README.md
@@ -75,20 +75,36 @@ Yggdrasil allows for some customisation of the route and cluster config per Ingr
| Name | type |
|--------------------------------------------------------------|----------|
| [yggdrasil.uswitch.com/healthcheck-path](#health-check-path) | string |
| [yggdrasil.uswitch.com/timeout](#timeout) | duration |
| [yggdrasil.uswitch.com/healthcheck-host](#health-check-host) | string |
| [yggdrasil.uswitch.com/timeout](#timeouts) | duration |
| [yggdrasil.uswitch.com/cluster-timeout](#timeouts) | duration |
| [yggdrasil.uswitch.com/route-timeout](#timeouts) | duration |
| [yggdrasil.uswitch.com/per-try-timeout](#timeouts) | duration |
| [yggdrasil.uswitch.com/weight](#weight) | uint32 |
| [yggdrasil.uswitch.com/retry-on](#retries) | string |

### Health Check Path
Specifies the path used for the [HTTP health check](https://www.envoyproxy.io/docs/envoy/v1.19.0/api-v3/config/core/v3/health_check.proto#config-core-v3-healthcheck-httphealthcheck). Envoy will not route to clusters that fail health checks.

### Health Check Host
Allows changing the host used for the health check when the Ingress host is a wildcard. For example, a health check against `*.my-app.example.com` cannot work as-is; you can set a specific host with `yggdrasil.uswitch.com/healthcheck-host: health.my-app.example.com`.

* [config.core.v3.HealthCheck.HttpHealthCheck.Path](https://www.envoyproxy.io/docs/envoy/v1.19.0/api-v3/config/core/v3/health_check.proto#envoy-v3-api-field-config-core-v3-healthcheck-httphealthcheck-path)

### Timeout
Allows for adjusting the timeout in envoy. Currently this will set the following timeouts to this value:
### Timeouts
Allows for adjusting the timeout in envoy.

The `yggdrasil.uswitch.com/cluster-timeout` annotation will set the [config.cluster.v3.Cluster.ConnectTimeout](https://www.envoyproxy.io/docs/envoy/v1.19.0/api-v3/config/cluster/v3/cluster.proto#envoy-v3-api-field-config-cluster-v3-cluster-connect-timeout).

The `yggdrasil.uswitch.com/route-timeout` annotation will set the [config.route.v3.RouteAction.Timeout](https://www.envoyproxy.io/docs/envoy/v1.19.0/api-v3/config/route/v3/route_components.proto#envoy-v3-api-field-config-route-v3-routeaction-timeout).

The `yggdrasil.uswitch.com/per-try-timeout` annotation will set the [config.route.v3.RetryPolicy.PerTryTimeout](https://www.envoyproxy.io/docs/envoy/v1.19.0/api-v3/config/route/v3/route_components.proto#envoy-v3-api-field-config-route-v3-retrypolicy-per-try-timeout).

The `yggdrasil.uswitch.com/timeout` annotation will set all of the above to the same value. This annotation has the lowest priority: if it is set together with one of the more specific timeout annotations, the specific one overrides it.

* [config.route.v3.RouteAction.Timeout](https://www.envoyproxy.io/docs/envoy/v1.19.0/api-v3/config/route/v3/route_components.proto#envoy-v3-api-field-config-route-v3-routeaction-timeout)
* [config.route.v3.RetryPolicy.PerTryTimeout](https://www.envoyproxy.io/docs/envoy/v1.19.0/api-v3/config/route/v3/route_components.proto#envoy-v3-api-field-config-route-v3-retrypolicy-per-try-timeout)
* [config.cluster.v3.Cluster.ConnectTimeout](https://www.envoyproxy.io/docs/envoy/v1.19.0/api-v3/config/cluster/v3/cluster.proto#envoy-v3-api-field-config-cluster-v3-cluster-connect-timeout)
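
The precedence between the general and the specific timeout annotations can be sketched in Go. This is a minimal illustration, not Yggdrasil's actual implementation: the `timeouts` struct and the `resolveTimeouts` and `lookupDuration` helpers are hypothetical names, and silently ignoring a missing or unparsable value is an assumption made to keep the sketch short.

```go
package main

import "time"

// Annotation keys as documented in the README table.
const (
	timeoutAnnotation        = "yggdrasil.uswitch.com/timeout"
	clusterTimeoutAnnotation = "yggdrasil.uswitch.com/cluster-timeout"
	routeTimeoutAnnotation   = "yggdrasil.uswitch.com/route-timeout"
	perTryTimeoutAnnotation  = "yggdrasil.uswitch.com/per-try-timeout"
)

// timeouts is a hypothetical container for the three Envoy timeouts.
type timeouts struct {
	Cluster time.Duration
	Route   time.Duration
	PerTry  time.Duration
}

// lookupDuration parses the annotation stored under key.
// The boolean is false when the annotation is missing or not a valid duration
// (assumption: invalid values are ignored rather than reported).
func lookupDuration(annotations map[string]string, key string) (time.Duration, bool) {
	raw, found := annotations[key]
	if !found {
		return 0, false
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, false
	}
	return d, true
}

// resolveTimeouts starts from the defaults, applies the general annotation to
// all three values, then lets each specific annotation override its own field.
func resolveTimeouts(annotations map[string]string, defaults timeouts) timeouts {
	out := defaults
	if d, ok := lookupDuration(annotations, timeoutAnnotation); ok {
		out = timeouts{Cluster: d, Route: d, PerTry: d}
	}
	if d, ok := lookupDuration(annotations, clusterTimeoutAnnotation); ok {
		out.Cluster = d
	}
	if d, ok := lookupDuration(annotations, routeTimeoutAnnotation); ok {
		out.Route = d
	}
	if d, ok := lookupDuration(annotations, perTryTimeoutAnnotation); ok {
		out.PerTry = d
	}
	return out
}
```

With `timeout: 30s` and `per-try-timeout: 5s` set on the same Ingress, this sketch yields a 30s cluster and route timeout and a 5s per-try timeout.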

### Weight
Allows for adjusting the [load balancer weights](https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/endpoint/v3/endpoint_components.proto#config-endpoint-v3-lbendpoint) in envoy.
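
Kubernetes annotation values are always strings, so the weight has to be quoted in the manifest and parsed into a `uint32`. A minimal sketch of that parsing, assuming a hypothetical `parseWeight` helper (the name and the error handling are illustrative, not taken from the Yggdrasil source):

```go
package main

import (
	"fmt"
	"strconv"
)

// weightAnnotation is the annotation key documented in the README table.
const weightAnnotation = "yggdrasil.uswitch.com/weight"

// parseWeight is a hypothetical helper. It returns the weight, whether the
// annotation was present, and an error when the value is not a valid uint32.
func parseWeight(annotations map[string]string) (uint32, bool, error) {
	raw, found := annotations[weightAnnotation]
	if !found {
		return 0, false, nil
	}
	// bitSize 32 makes ParseUint reject values that do not fit in a uint32.
	w, err := strconv.ParseUint(raw, 10, 32)
	if err != nil {
		return 0, true, fmt.Errorf("invalid %s value %q: %w", weightAnnotation, raw, err)
	}
	return uint32(w), true, nil
}
```

Returning a separate "present" flag lets the caller distinguish an unset annotation from an explicit weight of `0`.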

### Retries
Allows overwriting the default retry policy's [config.route.v3.RetryPolicy.RetryOn](https://www.envoyproxy.io/docs/envoy/v1.19.0/api-v3/config/route/v3/route_components.proto#envoy-v3-api-field-config-route-v3-retrypolicy-retry-on) set by the `--retry-on` flag (default 5xx). Accepts a comma-separated list of retry-on policies.
@@ -105,6 +121,7 @@ metadata:
annotations:
yggdrasil.uswitch.com/healthcheck-path: /healthz
yggdrasil.uswitch.com/timeout: 30s
yggdrasil.uswitch.com/weight: "12"
yggdrasil.uswitch.com/retry-on: gateway-error,connect-failure
spec:
rules:
@@ -130,6 +147,7 @@ Yggdrasil can be configured using a config file e.g:
{
"nodeName": "foo",
"ingressClasses": ["multi-cluster", "multi-cluster-staging"],
"accessLog": "/var/log/envoy/",
"syncSecrets": false,
"certificates": [
{
@@ -142,7 +160,9 @@ Yggdrasil can be configured using a config file e.g:
{
"token": "xxxxxxxxxxxxxxxx",
"apiServer": "https://cluster1.api.com",
"ca": "pathto/cluster1/ca"
"ca": "pathto/cluster1/ca",
"maintenance": false,
"kubernetesClusterName": "cluster1"
},
{
"tokenPath": "/path/to/a/token",
@@ -159,18 +179,29 @@ The list of certificates will be loaded by Yggdrasil and served to the Envoy nod
The `ingressClasses` is a list of ingress classes that yggdrasil will watch for.
Each entry in `clusters` represents a different Kubernetes cluster, with `token` being a service account token for that cluster. `ca` is the path to the CA certificate for that cluster.

`maintenance` is a new option that puts a cluster in maintenance mode:
- Upstreams that exist in only one cluster are kept.
- Upstreams that also exist in at least one cluster that is not in maintenance are removed for the cluster in maintenance mode.
- Yggdrasil exits with a fatal error if all clusters are in maintenance mode.

This option is optional and defaults to `false`.
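
The maintenance rules listed above can be sketched as follows. This is one reading of those rules under stated assumptions, not the code from this PR: the `cluster` type and the `selectClusters` and `checkNotAllInMaintenance` helpers are hypothetical, and the sketch returns an error where Yggdrasil itself logs a fatal message.

```go
package main

import "errors"

// cluster is a hypothetical, simplified view of one configured Kubernetes cluster.
type cluster struct {
	Name        string
	Maintenance bool
}

// errAllInMaintenance mirrors the fatal condition described in the README.
var errAllInMaintenance = errors.New("all clusters are in maintenance mode")

// selectClusters takes the clusters that expose a given upstream and returns
// the ones that should keep serving it. Clusters in maintenance are dropped
// as soon as at least one other cluster can serve the upstream; if the
// upstream exists only in clusters under maintenance, it is kept as-is.
func selectClusters(hosting []cluster) []cluster {
	var active []cluster
	for _, c := range hosting {
		if !c.Maintenance {
			active = append(active, c)
		}
	}
	if len(active) == 0 {
		return hosting
	}
	return active
}

// checkNotAllInMaintenance reports an error when no configured cluster is
// available, including the case where the list is empty.
func checkNotAllInMaintenance(all []cluster) error {
	for _, c := range all {
		if !c.Maintenance {
			return nil
		}
	}
	return errAllInMaintenance
}
```

Keeping upstreams that live only in a cluster under maintenance avoids dropping traffic that has nowhere else to go.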

`kubernetesClusterName` is the name of the cluster. It is informational only and is used in metrics. It is optional and defaults to `""`.

## Metrics
Yggdrasil has a number of Go, gRPC, Prometheus, and Yggdrasil-specific metrics built in which can be reached by cURLing the `/metrics` path at the health API address/port (default: 8081). See [Flags](#Flags) for more information on configuring the health API address/port.

The Yggdrasil-specific metrics which are available from the API are:

| Name | Description | Type |
|-----------------------------|------------------------------------------------|----------|
| yggdrasil_cluster_updates | Number of times the clusters have been updated | counter |
| yggdrasil_clusters | Total number of clusters generated | gauge |
| yggdrasil_ingresses | Total number of matching ingress objects | gauge |
| yggdrasil_listener_updates | Number of times the listener has been updated | counter |
| yggdrasil_virtual_hosts | Total number of virtual hosts generated | gauge |
| Name | Description | Type |
|----------------------------------------------|------------------------------------------------|----------|
| yggdrasil_cluster_updates | Number of times the clusters have been updated | counter |
| yggdrasil_clusters | Total number of clusters generated | gauge |
| yggdrasil_ingresses | Total number of matching ingress objects | gauge |
| yggdrasil_listener_updates | Number of times the listener has been updated | counter |
| yggdrasil_virtual_hosts | Total number of virtual hosts generated | gauge |
| yggdrasil_kubernetes_cluster_in_maintenance | 1 if the cluster is in maintenance mode, 0 otherwise | gauge |
| yggdrasil_upstream_info | Information about each upstream | gauge |

## Flags
```
@@ -180,7 +211,8 @@ The Yggdrasil-specific metrics which are available from the API are:
--config string config file
--config-dump Enable config dump endpoint at /configdump on the health-address HTTP server
--debug Log at debug level
--envoy-listener-ipv4-address string IPv4 address by the envoy proxy to accept incoming connections (default "0.0.0.0")
--access-log path for the file logs
--envoy-listener-ipv4-address strings IPv4 addresses by the envoy proxy to accept incoming connections (default "0.0.0.0")
--envoy-port uint32 port by the envoy proxy to accept incoming connections (default 10000)
--health-address string yggdrasil health API listen address (default "0.0.0.0:8081")
-h, --help help for yggdrasil
77 changes: 57 additions & 20 deletions cmd/root.go
@@ -4,7 +4,6 @@ import (
"context"
"flag"
"fmt"
"io/ioutil"
"os"
"os/signal"
"syscall"
@@ -24,28 +23,33 @@ import (
)

type clusterConfig struct {
APIServer string `json:"apiServer"`
Ca string `json:"ca"`
Token string `json:"token"`
TokenPath string `json:"tokenPath"`
APIServer string `json:"apiServer"`
Ca string `json:"ca"`
Token string `json:"token"`
TokenPath string `json:"tokenPath"`
Maintenance bool `json:"maintenance"`
KubernetesClusterName string `json:"kubernetesClusterName"`
}

type config struct {
IngressClass string `json:"ingressClass"`
NodeName string `json:"nodeName"`
Clusters []clusterConfig `json:"clusters"`
SyncSecrets bool `json:"syncSecrets"`
AccessLog string `json:"accessLog"`
Certificates []envoy.Certificate `json:"certificates"`
TrustCA string `json:"trustCA"`
UpstreamPort uint32 `json:"upstreamPort"`
EnvoyListenerIpv4Address string `json:"envoyListenerIpv4Address"`
EnvoyListenerIpv4Address []string `json:"envoyListenerIpv4Address"`
EnvoyPort uint32 `json:"envoyPort"`
MaxEjectionPercentage uint32 `json:"maxEjectionPercentage"`
HostSelectionRetryAttempts int64 `json:"hostSelectionRetryAttempts"`
UpstreamHealthCheck envoy.UpstreamHealthCheck `json:"upstreamHealthCheck"`
UseRemoteAddress bool `json:"useRemoteAddress"`
HttpExtAuthz envoy.HttpExtAuthz `json:"httpExtAuthz"`
HttpGrpcLogger envoy.HttpGrpcLogger `json:"httpGrpcLogger"`
DefaultTimeouts envoy.DefaultTimeouts `json:"defaultTimeouts"`
AlpnProtocols []string `json:"alpnProtocols"`
AccessLogger envoy.AccessLogger `json:"accessLogger"`
}

@@ -78,6 +82,7 @@ func init() {
rootCmd.PersistentFlags().String("address", "0.0.0.0:8080", "yggdrasil envoy control plane listen address")
rootCmd.PersistentFlags().String("health-address", "0.0.0.0:8081", "yggdrasil health API listen address")
rootCmd.PersistentFlags().String("node-name", "", "envoy node name")
rootCmd.PersistentFlags().String("access-log", "/var/log/envoy/", "envoy default access log file")
rootCmd.PersistentFlags().String("cert", "", "certfile")
rootCmd.PersistentFlags().String("key", "", "keyfile")
rootCmd.PersistentFlags().String("ca", "", "trustedCA")
@@ -86,7 +91,7 @@ func init() {
rootCmd.PersistentFlags().Bool("debug", false, "Log at debug level")
rootCmd.PersistentFlags().Bool("config-dump", false, "Enable config dump endpoint at /configdump on the health-address HTTP server")
rootCmd.PersistentFlags().Uint32("upstream-port", 443, "port used to connect to the upstream ingresses")
rootCmd.PersistentFlags().String("envoy-listener-ipv4-address", "0.0.0.0", "IPv4 address by the envoy proxy to accept incoming connections")
rootCmd.PersistentFlags().StringSlice("envoy-listener-ipv4-address", []string{"0.0.0.0"}, "IPv4 address by the envoy proxy to accept incoming connections")
rootCmd.PersistentFlags().Uint32("envoy-port", 10000, "port by the envoy proxy to accept incoming connections")
rootCmd.PersistentFlags().Int32("max-ejection-percentage", -1, "maximal percentage of hosts ejected via outlier detection. Set to >=0 to activate outlier detection in envoy.")
rootCmd.PersistentFlags().Int64("host-selection-retry-attempts", -1, "Number of host selection retry attempts. Set to value >=0 to enable")
@@ -109,11 +114,16 @@ func init() {
rootCmd.PersistentFlags().Bool("http-ext-authz-pack-as-bytes", false, "When this field is true, Envoy will send the body as raw bytes.")
rootCmd.PersistentFlags().Bool("http-ext-authz-failure-mode-allow", true, "Changes filters behaviour on errors")

rootCmd.PersistentFlags().Duration("default-route-timeout", 15*time.Second, "Default timeout of the routes")
rootCmd.PersistentFlags().Duration("default-cluster-timeout", 30*time.Second, "Default timeout of the cluster")
rootCmd.PersistentFlags().Duration("default-per-try-timeout", 5*time.Second, "Default timeout of PerTry")
rootCmd.PersistentFlags().StringSlice("alpn-protocols", []string{}, "exposed listener ALPN protocols")
viper.BindPFlag("debug", rootCmd.PersistentFlags().Lookup("debug"))
viper.BindPFlag("configDump", rootCmd.PersistentFlags().Lookup("config-dump"))
viper.BindPFlag("address", rootCmd.PersistentFlags().Lookup("address"))
viper.BindPFlag("healthAddress", rootCmd.PersistentFlags().Lookup("health-address"))
viper.BindPFlag("nodeName", rootCmd.PersistentFlags().Lookup("node-name"))
viper.BindPFlag("accessLog", rootCmd.PersistentFlags().Lookup("access-log"))
viper.BindPFlag("ingressClasses", rootCmd.PersistentFlags().Lookup("ingress-classes"))
viper.BindPFlag("cert", rootCmd.PersistentFlags().Lookup("cert"))
viper.BindPFlag("key", rootCmd.PersistentFlags().Lookup("key"))
@@ -141,6 +151,10 @@ func init() {
viper.BindPFlag("httpExtAuthz.allowPartialMessage", rootCmd.PersistentFlags().Lookup("http-ext-authz-allow-partial-message"))
viper.BindPFlag("httpExtAuthz.packAsBytes", rootCmd.PersistentFlags().Lookup("http-ext-authz-pack-as-bytes"))
viper.BindPFlag("httpExtAuthz.FailureModeAllow", rootCmd.PersistentFlags().Lookup("http-ext-authz-failure-mode-allow"))
viper.BindPFlag("defaultTimeouts.Route", rootCmd.PersistentFlags().Lookup("default-route-timeout"))
viper.BindPFlag("defaultTimeouts.Cluster", rootCmd.PersistentFlags().Lookup("default-cluster-timeout"))
viper.BindPFlag("defaultTimeouts.PerTry", rootCmd.PersistentFlags().Lookup("default-per-try-timeout"))
viper.BindPFlag("alpnProtocols", rootCmd.PersistentFlags().Lookup("alpn-protocols"))
}

func initConfig() {
@@ -212,12 +226,12 @@ func main(*cobra.Command, []string) error {
certPath := certificate.Cert
keyPath := certificate.Key

certBytes, err := ioutil.ReadFile(certPath)
certBytes, err := os.ReadFile(certPath)
if err != nil {
log.Fatalf("Failed to read %s: %v", certPath, err)
}

keyBytes, err := ioutil.ReadFile(keyPath)
keyBytes, err := os.ReadFile(keyPath)
if err != nil {
log.Fatalf("Failed to read %s: %v", keyPath, err)
}
@@ -231,8 +245,9 @@ func main(*cobra.Command, []string) error {
c.Certificates,
viper.GetString("trustCA"),
viper.GetStringSlice("ingressClasses"),
viper.GetString("accessLog"),
envoy.WithUpstreamPort(uint32(viper.GetInt32("upstreamPort"))),
envoy.WithEnvoyListenerIpv4Address(viper.GetString("envoyListenerIpv4Address")),
envoy.WithEnvoyListenerIpv4Address(viper.GetStringSlice("envoyListenerIpv4Address")),
envoy.WithEnvoyPort(uint32(viper.GetInt32("envoyPort"))),
envoy.WithOutlierPercentage(viper.GetInt32("maxEjectionPercentage")),
envoy.WithHostSelectionRetryAttempts(viper.GetInt64("hostSelectionRetryAttempts")),
@@ -241,10 +256,13 @@ func main(*cobra.Command, []string) error {
envoy.WithHttpExtAuthzCluster(c.HttpExtAuthz),
envoy.WithHttpGrpcLogger(c.HttpGrpcLogger),
envoy.WithSyncSecrets(c.SyncSecrets),
envoy.WithDefaultTimeouts(c.DefaultTimeouts),
envoy.WithDefaultRetryOn(viper.GetString("retryOn")),
envoy.WithAccessLog(c.AccessLogger),
envoy.WithTracingProvider(viper.GetString("tracingProvider")),
envoy.WithAlpnProtocols(viper.GetStringSlice("alpnProtocols")),
)
configurator.ValidateAndFormatPath()
snapshotter := envoy.NewSnapshotter(envoyCache, configurator, aggregator)

go snapshotter.Run(aggregator)
@@ -276,17 +294,17 @@ func createClientConfig(path string) (*rest.Config, error) {
return clientcmd.BuildConfigFromFlags("", path)
}

func createSources(clusters []clusterConfig) ([]*kubernetes.Clientset, error) {
sources := []*kubernetes.Clientset{}
func createSources(clusters []clusterConfig) ([]k8s.KubernetesConfig, error) {
var sources []k8s.KubernetesConfig
allInMaintenance := true

for _, cluster := range clusters {

var token string

if cluster.TokenPath != "" {
bytes, err := ioutil.ReadFile(cluster.TokenPath)
bytes, err := os.ReadFile(cluster.TokenPath)
if err != nil {
return sources, err
return nil, err
}
token = string(bytes)
} else {
@@ -302,16 +320,32 @@ func createSources(clusters []clusterConfig) ([]*kubernetes.Clientset, error) {
}
clientSet, err := kubernetes.NewForConfig(config)
if err != nil {
return sources, err
return nil, err
}

kubernetesConfig := k8s.NewKubernetesConfig(cluster.Maintenance, clientSet, cluster.KubernetesClusterName)

envoy.KubernetesClusterInMaintenance.WithLabelValues(cluster.APIServer).Set(float64(0))

if cluster.Maintenance {
envoy.KubernetesClusterInMaintenance.WithLabelValues(cluster.APIServer).Set(float64(1))
log.Warnf("Kubernetes Cluster with API Endpoint %s is in maintenance mode", cluster.APIServer)
} else {
allInMaintenance = false
}
sources = append(sources, clientSet)

sources = append(sources, *kubernetesConfig)
}

if allInMaintenance {
log.Fatal("All clusters are in maintenance mode")
}

return sources, nil
}

func configFromKubeConfig(paths []string) ([]*kubernetes.Clientset, error) {
sources := []*kubernetes.Clientset{}
func configFromKubeConfig(paths []string) ([]k8s.KubernetesConfig, error) {
var sources []k8s.KubernetesConfig

for _, configPath := range paths {
config, err := createClientConfig(configPath)
@@ -322,7 +356,10 @@ func configFromKubeConfig(paths []string) ([]*kubernetes.Clientset, error) {
if err != nil {
return sources, err
}
sources = append(sources, clientSet)

kubernetesConfig := k8s.NewKubernetesConfig(false, clientSet, "")

sources = append(sources, *kubernetesConfig)
}

return sources, nil
1 change: 1 addition & 0 deletions envoy
Submodule envoy added at db32ac
22 changes: 22 additions & 0 deletions launch.json
@@ -0,0 +1,22 @@
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "Launch Package",
"type": "go",
"request": "launch",
"mode": "auto",
"program": "main.go",
"args": ["--config=./config/config.json",
"--upstream-port=443",
"--ca=/etc/ssl/certs/",
"--envoy-port=443",
"--envoy-listener-ipv4-address=127.0.0.1",
"--max-ejection-percentage=100",
"--retry-on", "connect-failure,5xx"]
}
]
}