name: terraform-aws-eks-cluster
license: APACHE2
github_repo: cloudposse/terraform-aws-eks-cluster
badges:
  - name: Latest Release
    image: https://img.shields.io/github/release/cloudposse/terraform-aws-eks-cluster.svg?style=for-the-badge
    url: https://github.com/cloudposse/terraform-aws-eks-cluster/releases/latest
  - name: "Last Update"
    image: https://img.shields.io/github/last-commit/cloudposse/terraform-aws-eks-cluster/main?style=for-the-badge
    url: https://github.com/cloudposse/terraform-aws-eks-cluster/commits/main/
  - name: Slack Community
    image: https://slack.cloudposse.com/for-the-badge.svg
    url: https://slack.cloudposse.com
related:
  - name: terraform-aws-eks-workers
    description: Terraform module to provision an AWS AutoScaling Group, IAM Role, and Security Group for EKS Workers
    url: https://github.com/cloudposse/terraform-aws-eks-workers
  - name: terraform-aws-ec2-autoscale-group
    description: Terraform module to provision Auto Scaling Group and Launch Template on AWS
    url: https://github.com/cloudposse/terraform-aws-ec2-autoscale-group
  - name: terraform-aws-ecs-container-definition
    description: Terraform module to generate well-formed JSON documents (container definitions) that are passed to the aws_ecs_task_definition Terraform resource
    url: https://github.com/cloudposse/terraform-aws-ecs-container-definition
  - name: terraform-aws-ecs-alb-service-task
    description: Terraform module which implements an ECS service which exposes a web service via ALB
    url: https://github.com/cloudposse/terraform-aws-ecs-alb-service-task
  - name: terraform-aws-ecs-web-app
    description: Terraform module that implements a web app on ECS and supports autoscaling, CI/CD, monitoring, ALB integration, and much more
    url: https://github.com/cloudposse/terraform-aws-ecs-web-app
  - name: terraform-aws-ecs-codepipeline
    description: Terraform module for CI/CD with AWS Code Pipeline and Code Build for ECS
    url: https://github.com/cloudposse/terraform-aws-ecs-codepipeline
  - name: terraform-aws-ecs-cloudwatch-autoscaling
    description: Terraform module to autoscale ECS Service based on CloudWatch metrics
    url: https://github.com/cloudposse/terraform-aws-ecs-cloudwatch-autoscaling
  - name: terraform-aws-ecs-cloudwatch-sns-alarms
    description: Terraform module to create CloudWatch Alarms on ECS Service level metrics
    url: https://github.com/cloudposse/terraform-aws-ecs-cloudwatch-sns-alarms
  - name: terraform-aws-ec2-instance
    description: Terraform module for providing a general purpose EC2 instance
    url: https://github.com/cloudposse/terraform-aws-ec2-instance
  - name: terraform-aws-ec2-instance-group
    description: Terraform module for provisioning multiple general purpose EC2 hosts for stateful applications
    url: https://github.com/cloudposse/terraform-aws-ec2-instance-group
description: Terraform module to provision an [EKS](https://aws.amazon.com/eks/) cluster on AWS.
introduction: |-
The module provisions the following resources:
- EKS cluster of master nodes that can be used together with the [terraform-aws-eks-workers](https://github.com/cloudposse/terraform-aws-eks-workers),
[terraform-aws-eks-node-group](https://github.com/cloudposse/terraform-aws-eks-node-group) and
[terraform-aws-eks-fargate-profile](https://github.com/cloudposse/terraform-aws-eks-fargate-profile)
modules to create a full-blown cluster
- IAM Role to allow the cluster to access other AWS services
- Optionally, the module creates and automatically applies an authentication ConfigMap (`aws-auth`) to allow the
worker nodes to join the cluster and to add additional users/roles/accounts. (This option is enabled
by default, but has some caveats noted below. Set `apply_config_map_aws_auth` to `false` to avoid these issues.)

> [!WARNING]
> Release `2.0.0` (previously released as version `0.45.0`) contains some changes that
> could result in your existing EKS cluster being replaced (destroyed and recreated).
> To prevent this, follow the instructions in the [v1 to v2 migration path](./docs/migration-v1-v2.md).

> [!NOTE]
> Every Terraform module that provisions an EKS cluster has faced the challenge that access to the cluster
> is partly controlled by a resource inside the cluster, a ConfigMap called `aws-auth`. You need to be able to access
> the cluster through the Kubernetes API to modify the ConfigMap, because
> [there is no AWS API for it](https://github.com/aws/containers-roadmap/issues/185). This presents
> a problem: how do you authenticate to an API endpoint that you have not yet created?

We use the Terraform Kubernetes provider to access the cluster, and it uses the same underlying library
that `kubectl` uses, so configuration is very similar. However, every kind of configuration we have tried
has failed at some point.
- An authentication token can be retrieved using the `aws_eks_cluster_auth` data source. This works only as
long as the token does not expire while Terraform is running, the token is refreshed during the "plan"
phase before Terraform tries to refresh the state, and the token does not expire in the interval between
"plan" and "apply". Unfortunately, failures of all three kinds have been seen. Nevertheless,
this is the only method that is compatible with Terraform Cloud, so it is the default, and it is the only
method we fully support until AWS [provides an API for managing `aws-auth`](https://github.com/aws/containers-roadmap/issues/185).
(A provider configuration sketch follows this list.)
- After creating the EKS cluster, you can generate a `KUBECONFIG` file that configures access to it.
This works most of the time, but if the file was present and used as part of the configuration to create
the cluster, and then the file gets deleted (as would happen in a CI system like Terraform Cloud), Terraform
will not regenerate the file in time to use it to refresh Terraform's state, and the "plan" phase will fail.
So any `KUBECONFIG` file has to be managed separately.
- An authentication token can be retrieved on demand by using the `exec` feature of the Kubernetes provider
to call `aws eks get-token`. This requires that the `aws` CLI be installed and available to Terraform and that it
has access to sufficient credentials to perform the authentication and is configured to use them. When those
conditions are met, this is the most reliable method, and the one Cloud Posse prefers to use. However, since
it has these requirements that are not always easily met, it is not the default method and it is not
fully supported.
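
To make the trade-offs concrete, here is a minimal Kubernetes provider configuration sketch. It is illustrative only and is not taken from this module's examples: the data source names and the `module.eks_cluster` reference are assumptions you would adapt to your own configuration.

```hcl
# Sketch only: resource names and the module reference are illustrative assumptions.
data "aws_eks_cluster" "this" {
  name = module.eks_cluster.eks_cluster_id
}

# Default method: fetch a short-lived token with the `aws_eks_cluster_auth` data source.
data "aws_eks_cluster_auth" "this" {
  name = module.eks_cluster.eks_cluster_id
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token

  # Alternative `exec` method: retrieve a token on demand by calling the AWS CLI.
  # Remove the `token` argument above if you enable this block.
  # exec {
  #   api_version = "client.authentication.k8s.io/v1beta1"
  #   command     = "aws"
  #   args        = ["eks", "get-token", "--cluster-name", module.eks_cluster.eks_cluster_id]
  # }
}
```
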
All of the above methods can face additional challenges when using `terraform import` to import
resources into the Terraform state. The `KUBECONFIG` file method is the only sure way to `import` resources, due to
[Terraform limitations](https://github.com/hashicorp/terraform/issues/27934) on providers. You will need to create
the file, of course, but that is easily done with `aws eks update-kubeconfig`. Depending on the situation,
you may also be able to import resources by setting `-var apply_config_map_aws_auth=false` during import.
At the moment, the `exec` option appears to be the most reliable method, so we recommend using it if possible,
but because of the extra requirements it has, we use the data source as the default authentication method.
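
For example, a `KUBECONFIG` file suitable for running `terraform import` can be generated with the AWS CLI; the cluster name and file path here are placeholders:

```
aws eks update-kubeconfig --name <cluster-name> --kubeconfig /path/to/kubeconfig
export KUBECONFIG=/path/to/kubeconfig
```
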
> [!IMPORTANT]
> All of the above methods require network connectivity between the host running the
> `terraform` command and the EKS endpoint. If your EKS cluster does not have public access enabled, this means
> you need to take extra steps, such as using a VPN to provide access to the private endpoint, or running
> `terraform` on a host in the same VPC as the EKS cluster.

> [!WARNING]
> ### Failure during `destroy`
>
> If the cluster is destroyed (via Terraform or otherwise) before the Terraform resource
> responsible for the `aws-auth` ConfigMap is destroyed, Terraform will get stuck trying to delete the ConfigMap,
> because it cannot contact the now destroyed cluster. This can show up as a `connection refused` error (usually
> to `https://localhost/`). The easiest ways to handle this are either to add `-var apply_config_map_aws_auth=false`
> to the `destroy` command or to remove the ConfigMap (`...kubernetes_config_map.aws_auth[0]`) from the Terraform
> state with `terraform state rm`.
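
For example, either of the following works around a stuck `destroy`. The module address is illustrative and must match your own configuration; with `kubernetes_config_map_ignore_role_changes` left at its default of `true`, the resource is named `aws_auth_ignore_changes[0]` instead, as in the `terraform state mv` example further below.

```
terraform destroy -var apply_config_map_aws_auth=false
# or, remove the ConfigMap from the Terraform state first, then destroy
terraform state rm 'module.eks_cluster.kubernetes_config_map.aws_auth[0]'
```
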
> [!NOTE]
> We give you the `kubernetes_config_map_ignore_role_changes` option and default it to `true` for the following reasons:
> - We provision the EKS cluster
> - Then we wait for the cluster to become available (see `null_resource.wait_for_cluster` in [auth.tf](auth.tf))
> - Then we provision the Kubernetes Auth ConfigMap to map and add additional roles/users/accounts to Kubernetes groups
> - That is all we do in this module, but after that, we expect you to use [terraform-aws-eks-node-group](https://github.com/cloudposse/terraform-aws-eks-node-group)
> to provision a managed Node Group
> - Then EKS updates the Auth ConfigMap and adds worker roles to it (for the worker nodes to join the cluster)
> - Since the ConfigMap is modified outside of Terraform state, Terraform wants to update it to remove the worker roles EKS added
> - If you update the ConfigMap without including the worker node roles that EKS added, you will disconnect the worker nodes from the cluster

However, it is possible to get the worker node roles from the terraform-aws-eks-node-group module via Terraform "remote state"
and include them with any other roles you want to add (example code to be published later), so we make
ignoring the role changes optional. (This is what we do for Cloud Posse clients.)
If you do not ignore role changes, you will have no problem making future intentional changes.
The downside of having `kubernetes_config_map_ignore_role_changes` set to true is that if you later want to make changes,
such as adding other IAM roles to Kubernetes groups, you cannot do so via Terraform, because the role changes are ignored.
Because of Terraform restrictions, you cannot simply change `kubernetes_config_map_ignore_role_changes` from `true`
to `false`, apply changes, and set it back to `true` again. Terraform does not allow the
"ignore" settings to be changed on a resource, so `kubernetes_config_map_ignore_role_changes` is implemented as
2 different resources, one with ignore settings and one without. If you want to switch from ignoring to not ignoring,
or vice versa, you must manually move the `aws_auth` resource in the terraform state. Change the setting of
`kubernetes_config_map_ignore_role_changes`, run `terraform plan`, and you will see that an `aws_auth` resource
is planned to be destroyed and another one is planned to be created. Use `terraform state mv` to move the destroyed
resource to the created resource "address", something like
```
terraform state mv 'module.eks_cluster.kubernetes_config_map.aws_auth_ignore_changes[0]' 'module.eks_cluster.kubernetes_config_map.aws_auth[0]'
```
Then run `terraform plan` again and you should see only your desired changes made "in place". After applying your
changes, if you want to set `kubernetes_config_map_ignore_role_changes` back to `true`, you will again need to use
`terraform state mv` to move the `auth-map` back to its old "address".
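
For example, reversing the move shown above (using the same illustrative module address):

```
terraform state mv 'module.eks_cluster.kubernetes_config_map.aws_auth[0]' 'module.eks_cluster.kubernetes_config_map.aws_auth_ignore_changes[0]'
```
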
usage: |2-
For a complete example, see [examples/complete](examples/complete).
For automated tests of the complete example using [bats](https://github.com/bats-core/bats-core) and [Terratest](https://github.com/gruntwork-io/terratest) (which tests and deploys the example on AWS), see [test](test).
Other examples:
- [terraform-aws-components/eks/cluster](https://github.com/cloudposse/terraform-aws-components/tree/master/modules/eks/cluster) - Cloud Posse's service catalog of "root module" invocations for provisioning reference architectures
```hcl
provider "aws" {
region = var.region
}
module "label" {
source = "cloudposse/label/null"
# Cloud Posse recommends pinning every module to a specific version
# version = "x.x.x"
namespace = var.namespace
name = var.name
stage = var.stage
delimiter = var.delimiter
attributes = ["cluster"]
tags = var.tags
}
locals {
# Prior to Kubernetes 1.19, the usage of the specific kubernetes.io/cluster/* resource tags below are required
# for EKS and Kubernetes to discover and manage networking resources
# https://www.terraform.io/docs/providers/aws/guides/eks-getting-started.html#base-vpc-networking
tags = { "kubernetes.io/cluster/${module.label.id}" = "shared" }
}
module "vpc" {
source = "cloudposse/vpc/aws"
# Cloud Posse recommends pinning every module to a specific version
# version = "x.x.x"
cidr_block = "172.16.0.0/16"
tags = local.tags
context = module.label.context
}
module "subnets" {
source = "cloudposse/dynamic-subnets/aws"
# Cloud Posse recommends pinning every module to a specific version
# version = "x.x.x"
availability_zones = var.availability_zones
vpc_id = module.vpc.vpc_id
igw_id = module.vpc.igw_id
cidr_block = module.vpc.vpc_cidr_block
nat_gateway_enabled = true
nat_instance_enabled = false
tags = local.tags
context = module.label.context
}
module "eks_node_group" {
source = "cloudposse/eks-node-group/aws"
# Cloud Posse recommends pinning every module to a specific version
# version = "x.x.x"
instance_types = [var.instance_type]
subnet_ids = module.subnets.public_subnet_ids
health_check_type = var.health_check_type
min_size = var.min_size
max_size = var.max_size
cluster_name = module.eks_cluster.eks_cluster_id
# Enable the Kubernetes cluster auto-scaler to find the auto-scaling group
cluster_autoscaler_enabled = var.autoscaling_policies_enabled
context = module.label.context
# Ensure the cluster is fully created before trying to add the node group
module_depends_on = module.eks_cluster.kubernetes_config_map_id
}
module "eks_cluster" {
source = "cloudposse/eks-cluster/aws"
# Cloud Posse recommends pinning every module to a specific version
# version = "x.x.x"
vpc_id = module.vpc.vpc_id
subnet_ids = module.subnets.public_subnet_ids
kubernetes_version = var.kubernetes_version
oidc_provider_enabled = true
addons = [
// https://docs.aws.amazon.com/eks/latest/userguide/managing-vpc-cni.html#vpc-cni-latest-available-version
{
addon_name = "vpc-cni"
addon_version = var.vpc_cni_version
resolve_conflicts_on_create = "NONE"
resolve_conflicts_on_update = "NONE"
service_account_role_arn = null
},
// https://docs.aws.amazon.com/eks/latest/userguide/managing-kube-proxy.html
{
addon_name = "kube-proxy"
addon_version = var.kube_proxy_version
resolve_conflicts_on_create = "NONE"
resolve_conflicts_on_update = "NONE"
service_account_role_arn = null
},
// https://docs.aws.amazon.com/eks/latest/userguide/managing-coredns.html
{
addon_name = "coredns"
addon_version = var.coredns_version
resolve_conflicts_on_create = "NONE"
resolve_conflicts_on_update = "NONE"
service_account_role_arn = null
},
]
addons_depends_on = [module.eks_node_group]
context = module.label.context
cluster_depends_on = [module.subnets]
}
```
Module usage with two unmanaged worker groups:
```hcl
locals {
  # Unfortunately, the `aws_ami` data source attribute `most_recent` (https://github.com/cloudposse/terraform-aws-eks-workers/blob/34a43c25624a6efb3ba5d2770a601d7cb3c0d391/main.tf#L141)
  # does not work as you might expect. If you are not going to use a custom AMI, you should
  # use the `eks_worker_ami_name_filter` variable to set the right Kubernetes version for the EKS workers;
  # otherwise the first version of Kubernetes supported by AWS (v1.11) for EKS workers will be selected, but
  # the EKS control plane will ignore it and use one that matches the version specified by the `kubernetes_version` variable.
  eks_worker_ami_name_filter = "amazon-eks-node-${var.kubernetes_version}*"
}

module "eks_workers" {
  source = "cloudposse/eks-workers/aws"
  # Cloud Posse recommends pinning every module to a specific version
  # version = "x.x.x"
  attributes                         = ["small"]
  instance_type                      = "t3.small"
  eks_worker_ami_name_filter         = local.eks_worker_ami_name_filter
  vpc_id                             = module.vpc.vpc_id
  subnet_ids                         = module.subnets.public_subnet_ids
  health_check_type                  = var.health_check_type
  min_size                           = var.min_size
  max_size                           = var.max_size
  wait_for_capacity_timeout          = var.wait_for_capacity_timeout
  cluster_name                       = module.label.id
  cluster_endpoint                   = module.eks_cluster.eks_cluster_endpoint
  cluster_certificate_authority_data = module.eks_cluster.eks_cluster_certificate_authority_data
  cluster_security_group_id          = module.eks_cluster.eks_cluster_managed_security_group_id

  # Auto-scaling policies and CloudWatch metric alarms
  autoscaling_policies_enabled           = var.autoscaling_policies_enabled
  cpu_utilization_high_threshold_percent = var.cpu_utilization_high_threshold_percent
  cpu_utilization_low_threshold_percent  = var.cpu_utilization_low_threshold_percent

  context = module.label.context
}

module "eks_workers_2" {
  source = "cloudposse/eks-workers/aws"
  # Cloud Posse recommends pinning every module to a specific version
  # version = "x.x.x"
  attributes                         = ["medium"]
  instance_type                      = "t3.medium"
  eks_worker_ami_name_filter         = local.eks_worker_ami_name_filter
  vpc_id                             = module.vpc.vpc_id
  subnet_ids                         = module.subnets.public_subnet_ids
  health_check_type                  = var.health_check_type
  min_size                           = var.min_size
  max_size                           = var.max_size
  wait_for_capacity_timeout          = var.wait_for_capacity_timeout
  cluster_name                       = module.label.id
  cluster_endpoint                   = module.eks_cluster.eks_cluster_endpoint
  cluster_certificate_authority_data = module.eks_cluster.eks_cluster_certificate_authority_data
  cluster_security_group_id          = module.eks_cluster.eks_cluster_managed_security_group_id

  # Auto-scaling policies and CloudWatch metric alarms
  autoscaling_policies_enabled           = var.autoscaling_policies_enabled
  cpu_utilization_high_threshold_percent = var.cpu_utilization_high_threshold_percent
  cpu_utilization_low_threshold_percent  = var.cpu_utilization_low_threshold_percent

  context = module.label.context
}

module "eks_cluster" {
  source = "cloudposse/eks-cluster/aws"
  # Cloud Posse recommends pinning every module to a specific version
  # version = "x.x.x"
  vpc_id                     = module.vpc.vpc_id
  subnet_ids                 = module.subnets.public_subnet_ids
  kubernetes_version         = var.kubernetes_version
  oidc_provider_enabled      = false
  workers_role_arns          = [module.eks_workers.workers_role_arn, module.eks_workers_2.workers_role_arn]
  allowed_security_group_ids = [module.eks_workers.security_group_id, module.eks_workers_2.security_group_id]
  context                    = module.label.context
}
```
include:
  - docs/targets.md
  - docs/terraform.md
contributors: []