Automated cherry pick of #308: Add tagging controller configuration #334: Stop retrying failed workitem after a certain amount of #360

Closed
40 commits
5dffef9
Add tagging controller configuration
nguyenkndinh Feb 23, 2022
1b2dce1
add log
nguyenkndinh Feb 23, 2022
2b79d88
rearrange the controllers
nguyenkndinh Feb 23, 2022
851438b
remove debugging log
nguyenkndinh Feb 24, 2022
a2b68a8
removed route controller
nguyenkndinh Feb 24, 2022
93d63f3
added a blank test file for the tagging controller
nguyenkndinh Feb 24, 2022
64e5f6e
remove predefined tag
nguyenkndinh Feb 24, 2022
9adf41d
Refactoring based on recommendations
nguyenkndinh Mar 1, 2022
96a052e
Sticking to the naming convention
nguyenkndinh Mar 1, 2022
a538f54
address more comments on naming
nguyenkndinh Mar 8, 2022
f40691a
Using ListNode to get nodes entering and leaving the cluster.
nguyenkndinh Mar 15, 2022
137b089
refactor
nguyenkndinh Mar 15, 2022
5022b20
Add testing and controller config skeletons
nguyenkndinh Mar 19, 2022
e091cb4
Added tagging and flags mechanisms
nguyenkndinh Mar 22, 2022
e0aa760
Disabled the tagging controller by default
nguyenkndinh Mar 22, 2022
df66d0c
Updated test structure
nguyenkndinh Mar 23, 2022
a8efac1
Making the tests more robust
nguyenkndinh Mar 23, 2022
bbcf7bc
Renaming the maps in tagging controller
nguyenkndinh Mar 23, 2022
614c94d
Refactoring names and remove debugging logs
nguyenkndinh Mar 23, 2022
8ae20ff
Add failure test cases for when EC2 return error
nguyenkndinh Mar 23, 2022
a8322bb
adding details for --resources
nguyenkndinh Mar 24, 2022
51de80b
add in Copyright message
nguyenkndinh Mar 24, 2022
97cca84
Using NodeInformer and Workqueue for tagging resources
nguyenkndinh Mar 25, 2022
c1eccd0
Used workqueue for both tag and untag actions
nguyenkndinh Mar 25, 2022
627b1cd
Update docs/tagging_controller.md
nguyenkndinh Mar 25, 2022
aeebb9f
Renamed fields in the tagging controller to be more user friendly
nguyenkndinh Mar 26, 2022
121829d
Added in a loop to make sure all messages are processed before shutti…
nguyenkndinh Mar 26, 2022
cbfcf9d
Added more logging
nguyenkndinh Mar 26, 2022
75f68d7
Added more testing
nguyenkndinh Mar 26, 2022
78b6a30
cosmetic change
nguyenkndinh Mar 28, 2022
101e3f1
use array instead of map for supported resources
nguyenkndinh Mar 28, 2022
5859690
Reworked the workqueue with workitem
nguyenkndinh Mar 28, 2022
d4a1ce6
Addressed comments
nguyenkndinh Mar 29, 2022
f094e38
addressed verify-lint errors
nguyenkndinh Mar 29, 2022
c669f82
addressed comments and verify-lint
nguyenkndinh Mar 29, 2022
f54a5b2
address validate-lint error
nguyenkndinh Mar 29, 2022
12ac5ee
missed a couple more lint errors
nguyenkndinh Mar 29, 2022
4073667
Updated doc to be clearer
nguyenkndinh Mar 29, 2022
1bf899d
Add TODOs for e2e testing and non-retryable workitem
nguyenkndinh Mar 30, 2022
32d63cc
Stop retrying failed workitem after a certain amount of retries Added…
nguyenkndinh Apr 5, 2022
7 changes: 7 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,2 +1,9 @@
/aws-cloud-controller-manager
/cloudconfig
_output/
/kops-example
docs/book/_book/
site/
.vscode/
e2e.test
.idea/
16 changes: 16 additions & 0 deletions cmd/aws-cloud-controller-manager/main.go
@@ -27,6 +27,7 @@ package main

import (
"fmt"
"k8s.io/cloud-provider-aws/pkg/controllers/tagging"
"math/rand"
"net"
"net/http"
@@ -49,6 +50,21 @@ import (
"k8s.io/kubernetes/cmd/cloud-controller-manager/app/options"
"k8s.io/kubernetes/pkg/features" // add the kubernetes feature gates
netutils "k8s.io/utils/net"
"k8s.io/apimachinery/pkg/util/wait"
cloudprovider "k8s.io/cloud-provider"
"k8s.io/cloud-provider-aws/pkg/controllers/tagging"
awsv1 "k8s.io/cloud-provider-aws/pkg/providers/v1"
awsv2 "k8s.io/cloud-provider-aws/pkg/providers/v2"
"k8s.io/cloud-provider/app"
"k8s.io/cloud-provider/options"
cliflag "k8s.io/component-base/cli/flag"
"k8s.io/component-base/logs"
_ "k8s.io/component-base/metrics/prometheus/clientgo" // for client metric registration
_ "k8s.io/component-base/metrics/prometheus/version" // for version metric registration
"k8s.io/klog/v2"
"math/rand"
"os"
"time"

cloudprovider "k8s.io/cloud-provider"
awsv1 "k8s.io/cloud-provider-aws/pkg/providers/v1"
21 changes: 21 additions & 0 deletions docs/TODO.md
@@ -0,0 +1,21 @@
# TODO

### Prereqs

* Document required instance tags (i.e. KubernetesCluster:<cluster-name>)

### Load Balancers

* Document all available labels/annotations to configure ELBs/NLBs for Service Type=LoadBalancer

### Known Limitations

* Document limitation with hostname / private DNS?

### Kops

* Add a full example (ideally with IAM roles)

### Tagging Controller

* Add e2e testing which enables the controller, and monitors if the resources are tagged properly
7 changes: 7 additions & 0 deletions docs/tagging_controller.md
@@ -0,0 +1,7 @@
# The Tagging Controller

The tagging controller tags and untags node resources when nodes join and leave the cluster, respectively. It adds and removes tags based on user input. If a tag is updated, the controller leaves the updated tag in place and reapplies the user-provided tag. Unlike the existing controllers, the tagging controller works exclusively with AWS. It uses the `ec2:CreateTags` and `ec2:DeleteTags` AWS APIs.

| Flag | Valid Values | Default | Description |
|------| --- | --- | --- |
| tags | Comma-separated list of key=value | - | A comma-separated list of key-value pairs which will be recorded as nodes' additional tags. For example: "Key1=Val1,Key2=Val2,KeyNoVal1=,KeyNoVal2" |
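As an illustration of the `tags` value format described in the table, here is a minimal stdlib-only sketch that splits such a string into a map. The function name `parseTags` is hypothetical and is not part of this PR; the controller itself registers the flag through pflag's `StringToStringVar`, and whether that parser accepts a bare key such as `KeyNoVal2` is not verified here. The sketch simply assumes a bare key maps to an empty value.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTags is a hypothetical helper that splits a comma-separated list of
// key=value pairs into a map. Assumption: a key with no "=" (for example
// "KeyNoVal2") is stored with an empty value. An empty key is rejected.
func parseTags(s string) (map[string]string, error) {
	tags := make(map[string]string)
	if strings.TrimSpace(s) == "" {
		return tags, nil
	}
	for _, pair := range strings.Split(s, ",") {
		// Split on the first "=" only, so values may themselves contain "=".
		parts := strings.SplitN(pair, "=", 2)
		key := strings.TrimSpace(parts[0])
		if key == "" {
			return nil, fmt.Errorf("empty tag key in %q", pair)
		}
		value := ""
		if len(parts) == 2 {
			value = parts[1]
		}
		tags[key] = value
	}
	return tags, nil
}

func main() {
	tags, err := parseTags("Key1=Val1,Key2=Val2,KeyNoVal1=,KeyNoVal2")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(tags), tags["Key1"], tags["Key2"])
}
```

Splitting on the first `=` only keeps values that contain `=` intact, which is a design choice of this sketch rather than documented controller behavior.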
24 changes: 24 additions & 0 deletions pkg/controllers/options/resources.go
@@ -0,0 +1,24 @@
/*
Copyright 2016 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package options

const (
// Instance represents the string literal "instance"
Instance string = "instance"
)

// SupportedResources contains the resources that can be tagged by the controller at the moment
var SupportedResources = []string{
Instance,
}
53 changes: 53 additions & 0 deletions pkg/controllers/options/tagging_controller.go
@@ -0,0 +1,53 @@
/*
Copyright 2016 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package options

import (
"fmt"
"github.com/spf13/pflag"
)

// TaggingControllerOptions contains the inputs that can
// be used in the tagging controller
type TaggingControllerOptions struct {
Tags map[string]string
Resources []string
}

// AddFlags adds the additional flags for the controller
func (o *TaggingControllerOptions) AddFlags(fs *pflag.FlagSet) {
fs.StringToStringVar(&o.Tags, "tags", o.Tags, "Tags to apply to AWS resources in the tagging controller, in the form key=value.")
fs.StringArrayVar(&o.Resources, "resources", o.Resources, "Names of the AWS resources to add/remove tags on in the tagging controller.")
}

// Validate checks for errors from user input
func (o *TaggingControllerOptions) Validate() error {
if len(o.Tags) == 0 {
return fmt.Errorf("--tags must not be empty and must be in the form key=value")
}

if len(o.Resources) == 0 {
return fmt.Errorf("--resources must not be empty")
}

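for _, r := range o.Resources {
// Accept r only if it matches one of the supported resources.
supported := false
for _, resource := range SupportedResources {
if r == resource {
supported = true
break
}
}
if !supported {
return fmt.Errorf("%s is not a supported resource. Current supported resources %v", r, SupportedResources)
}
}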

return nil
}
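One way to express the supported-resource check is as a set-membership test, which stays correct as `SupportedResources` grows beyond a single entry. The sketch below is stdlib-only and assumes a hypothetical helper `validateResources`; the `"volume"` entry is invented purely for illustration and is not a resource this PR supports.

```go
package main

import "fmt"

// validateResources is a hypothetical helper: it returns an error for the
// first requested resource that is not present in the supported list.
func validateResources(requested, supported []string) error {
	// Build a set once so each lookup is O(1).
	allowed := make(map[string]struct{}, len(supported))
	for _, s := range supported {
		allowed[s] = struct{}{}
	}
	for _, r := range requested {
		if _, ok := allowed[r]; !ok {
			return fmt.Errorf("%s is not a supported resource. Current supported resources %v", r, supported)
		}
	}
	return nil
}

func main() {
	// "volume" is a made-up second entry used only to exercise the check.
	supported := []string{"instance", "volume"}

	fmt.Println(validateResources([]string{"instance"}, supported))
	fmt.Println(validateResources([]string{"instance", "subnet"}, supported))
}
```

With a single supported resource a plain comparison behaves the same way; the set form only matters once the list has more than one entry.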
56 changes: 56 additions & 0 deletions pkg/controllers/tagging/metrics.go
@@ -0,0 +1,56 @@
/*
Copyright 2020 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package tagging

import (
"k8s.io/component-base/metrics"
"k8s.io/component-base/metrics/legacyregistry"
"sync"
)

var register sync.Once

var (
workItemDuration = metrics.NewHistogramVec(
&metrics.HistogramOpts{
Name: "cloudprovider_aws_tagging_controller_work_item_duration_seconds",
Help: "latency of a workitem: the time it spends in the queue and the time it takes to process",
StabilityLevel: metrics.ALPHA,
},
[]string{"latency_type"})

workItemError = metrics.NewCounterVec(
&metrics.CounterOpts{
Name: "cloudprovider_aws_tagging_controller_work_item_errors_total",
Help: "any error in dequeueing the work queue and processing workItem",
StabilityLevel: metrics.ALPHA,
},
[]string{"error_type", "instance_id"})
)

// registerMetrics registers tagging-controller metrics.
func registerMetrics() {
register.Do(func() {
legacyregistry.MustRegister(workItemDuration)
legacyregistry.MustRegister(workItemError)
})
}

func recordWorkItemLatencyMetrics(latencyType string, timeTaken float64) {
workItemDuration.With(metrics.Labels{"latency_type": latencyType}).Observe(timeTaken)
}

func recordWorkItemErrorMetrics(errorType string, instanceID string) {
workItemError.With(metrics.Labels{"error_type": errorType, "instance_id": instanceID}).Inc()
}
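The `register.Do` guard in `registerMetrics` relies on `sync.Once` so that registration runs a single time even if the function is called from several places. A minimal stdlib-only sketch of that pattern follows; the `registrations` counter is a hypothetical stand-in for the real `legacyregistry.MustRegister` calls, which are not reproduced here.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	register      sync.Once
	registrations int // stand-in for the side effect of registering metrics
)

// registerMetrics runs the registration body at most once per process,
// no matter how many callers or goroutines invoke it.
func registerMetrics() {
	register.Do(func() {
		registrations++
	})
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			registerMetrics()
		}()
	}
	wg.Wait()
	fmt.Println(registrations) // prints 1
}
```

Because `sync.Once.Do` only returns after the first call's function has completed, reading `registrations` after `wg.Wait()` is safe without an extra lock.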