title | summary | category |
---|---|---|
Deploy TiDB, a distributed MySQL compatible database, on Kubernetes via AWS EKS | Tutorial for deploying TiDB on Kubernetes via AWS EKS. | operations |
This tutorial is designed to be run locally with tools like the AWS Command Line Interface and Terraform. The Terraform code uses a relatively new AWS service called Amazon Elastic Container Service for Kubernetes (Amazon EKS).
This guide is for running a TiDB cluster in a testing Kubernetes environment. The following steps use some insecure configurations. Do not follow this guide alone to build your production database environment.
It takes you through these steps:
- Launching a new 3-node Kubernetes cluster on AWS EKS, provisioned with Terraform (optional)
- Installing the Helm package manager for Kubernetes
- Deploying the TiDB Operator
- Deploying your first TiDB cluster
- Connecting to TiDB
- Shutting down the Kubernetes cluster
Warning: Following this guide will create objects in your AWS account that will cost you money against your AWS bill.
AWS EKS provides managed Kubernetes master nodes:
- There are no master nodes to manage
- The master nodes are multi-AZ to provide redundancy
- The master nodes scale automatically when necessary
Before continuing, make sure you have created a new user (other than the root user of your AWS account) in IAM and given it enough permissions. For simplicity you can just assign AdministratorAccess to the group this user belongs to. With more fine-grained permissions, you will have to make sure this user also has AmazonEKSClusterPolicy and AmazonEKSServicePolicy.
Then generate a pair of access keys and keep them safe locally.
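If you prefer the command line over the IAM console, the user and access keys can also be created roughly like this with the AWS CLI (a sketch only; the user name tidb-demo-user is an arbitrary example, and AdministratorAccess is for the demo, not for production):
# Create a demo IAM user (name is arbitrary) and attach a broad policy -- demo only
aws iam create-user --user-name tidb-demo-user
aws iam attach-user-policy --user-name tidb-demo-user \
    --policy-arn arn:aws:iam::aws:policy/AdministratorAccess
# Generate an access key pair for the user and note down the output
aws iam create-access-key --user-name tidb-demo-user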
Now we can continue with using Terraform to provision a Kubernetes cluster on AWS. Information about using Terraform with EKS can be found here. If this is your first time using Terraform, please take a glance at their tutorial here first, and make sure your AWS access keys and your local Terraform work with the simplest possible infrastructure provision (e.g. this example) before you continue.
We will use Terraform templates to deploy EKS. Please install Terraform using the steps described in the Terraform manual. For example, on macOS or Linux:
# For mac
wget https://releases.hashicorp.com/terraform/0.11.10/terraform_0.11.10_darwin_amd64.zip
unzip terraform*
sudo mv terraform /usr/local/bin/
# For linux
wget https://releases.hashicorp.com/terraform/0.11.10/terraform_0.11.10_linux_amd64.zip
unzip terraform*
sudo mv terraform /usr/local/bin/
- As above, make sure you have created a new IAM user (other than the root user of your AWS account) with enough permissions. For simplicity you can just assign AdministratorAccess to the group this user belongs to; with more fine-grained permissions, make sure the user also has AmazonEKSClusterPolicy and AmazonEKSServicePolicy.
- Use an editor to open the file ~/.aws/credentials and fill in the access keys of the user mentioned above, such as:
[default]
aws_access_key_id = XXX
aws_secret_access_key = XXX
You can confirm that Terraform is installed correctly by deploying the simplest possible infrastructure provision. From the Terraform manual:
Prepare an example file
# example.tf
provider "aws" {
region = "us-east-1"
}
resource "aws_instance" "example" {
ami = "ami-2757f631"
instance_type = "t2.micro"
}
Then run the following commands to make sure Terraform is working:
terraform init
terraform apply
# then verify instance creation on AWS console
terraform destroy
Note that Terraform will automatically search for saved API credentials (for example, in ~/.aws/credentials).
In our next step, we will be deploying infrastructure based on the Terraform EKS tutorial.
Steps to provision AWS EKS:
- Provision an EKS cluster with IAM roles, Security Groups and VPC
- Deploy worker nodes with Launch Configuration, Autoscaling Group, IAM Roles and Security Groups
- Connect to EKS
More detailed steps can be found here and here, and a perhaps more detailed version here (which is based on the first one).
For now we will follow the configs from the last link, as it is easier to follow and has more related info.
See https://www.terraform.io/docs/providers/aws/guides/eks-getting-started.html for full guide.
This step requires cloning a git repository which contains Terraform templates.
First clone this repo and go to the folder eks-demo:
git clone https://github.com/liufuyang/terraform-course.git
cd terraform-course/eks-demo
Note: the following guide assumes you are in the Terraform file folder eks-demo from the repo mentioned above. At the end of this guide you will be asked to come back to the tidb-operator folder to continue the TiDB demo.
Download kubectl, the command line tool for Kubernetes
# For macOS
curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/darwin/amd64/kubectl
chmod +x kubectl
ln -s $(pwd)/kubectl /usr/local/bin/kubectl
# For linux
curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl
chmod +x kubectl
ln -s $(pwd)/kubectl /usr/local/bin/kubectl
Download heptio-authenticator-aws, a tool that uses AWS IAM credentials to authenticate to a Kubernetes cluster:
# For macOS
wget -O heptio-authenticator-aws https://github.com/kubernetes-sigs/aws-iam-authenticator/releases/download/v0.3.0/heptio-authenticator-aws_0.3.0_darwin_amd64
chmod +x heptio-authenticator-aws
ln -s $(pwd)/heptio-authenticator-aws /usr/local/bin/heptio-authenticator-aws
# For linux
wget -O heptio-authenticator-aws https://github.com/kubernetes-sigs/aws-iam-authenticator/releases/download/v0.3.0/heptio-authenticator-aws_0.3.0_linux_amd64
chmod +x heptio-authenticator-aws
ln -s $(pwd)/heptio-authenticator-aws /usr/local/bin/heptio-authenticator-aws
Choose your region. EKS is not available in every region; use the Region Table to check whether your region is supported: https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/
Make changes in providers.tf accordingly (region, and optionally profile).
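For example, a minimal providers.tf could look roughly like this (the region below is only a placeholder; the actual file in the eks-demo folder may contain more settings):
# providers.tf (sketch) -- pick a region where EKS is available
provider "aws" {
  region = "us-west-2"
  # profile = "default"   # optional: uncomment if you use a named AWS profile
}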
Start AWS EKS service and create a fully operational Kubernetes environment on your AWS cloud.
terraform init
terraform apply # This step may take more than 10 minutes to finish. Just wait patiently.
Export the configuration of the EKS cluster to .config. This will allow the kubectl command line client to access our newly deployed cluster.
mkdir .config
terraform output kubeconfig > .config/ekskubeconfig
# The kubeconfig is now saved in .config/ekskubeconfig; make it visible to kubectl via the KUBECONFIG env variable
export KUBECONFIG=$KUBECONFIG:$(pwd)/.config/ekskubeconfig
# Remember to run the above export again (with the right folder) when you open a new shell window later for the TiDB related work.
Set up a ConfigMap on Kubernetes so the node instances can have proper IAM roles.
terraform output config-map-aws-auth > .config/config-map-aws-auth.yaml
# save output in config-map-aws-auth.yaml
kubectl apply -f .config/config-map-aws-auth.yaml
Your kubectl command line client should now be installed and configured. Check the installed services and Kubernetes nodes:
kubectl get svc
kubectl get nodes
More info about getting started with Amazon EKS here.
To have an easy way of monitoring what is happening on your Kubernetes cluster, it is very helpful to have kubernetes-dashboard installed. You can then easily monitor every pod or service of your Kubernetes cluster in the browser.
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/heapster.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/influxdb.yaml
kubectl apply -f eks-admin-service-account.yaml
kubectl apply -f eks-admin-cluster-role-binding.yaml
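For reference, the two eks-admin manifests applied above roughly correspond to the following service account and cluster role binding (a sketch based on the standard AWS EKS dashboard example; the files in the repo may differ slightly):
# Equivalent inline definition of the eks-admin service account and binding (sketch)
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: eks-admin
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: eks-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: eks-admin
  namespace: kube-system
EOF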
Now you can monitor your Kubernetes cluster in the browser:
# output a token for logging in kubernetes-dashboard
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep eks-admin | awk '{print $1}')
kubectl proxy
Then open http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/ and use the token output from the previous command to log in.
For more info, see here.
Amazon EKS clusters are not created with any storage classes. You must define storage classes for your cluster to use, and you should define a default storage class for your persistent volume claims. For demo purposes, we simply use the default gp2 storage class. gp2 is a general purpose SSD volume type that balances price and performance for a wide variety of workloads.
kubectl create -f gp2-storage-class.yaml
kubectl get storageclass
For more info, see here, here and here.
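For reference, the gp2-storage-class.yaml used above is roughly equivalent to the following inline definition (a sketch following the AWS EKS documentation; the file in the repo may differ slightly):
# Equivalent inline definition of the gp2 StorageClass (sketch)
cat <<EOF | kubectl apply -f -
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4
EOF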
Congratulations, now you have a Kubernetes cloud running on AWS.
Then from this stage, you should be able to follow the TiDB Operator guide from the sections Install Helm, Deploy TiDB Operator and so on. Or just continue with our guide here and simply do:
curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get > get_helm.sh
chmod 700 get_helm.sh && ./get_helm.sh
Now we will be working in the tidb-operator folder of the pingcap/tidb-operator repo.
# clone repo, if you haven't done it
git clone https://github.com/pingcap/tidb-operator.git
cd tidb-operator
Please note that if you have opened a new shell or terminal since you previously worked in the eks-demo directory, make sure you have the same KUBECONFIG set up again in the current shell, otherwise the kubectl command will not be able to connect to your newly created cluster.
TiDB Operator manages TiDB clusters on Kubernetes and automates tasks related to operating a TiDB cluster. It makes TiDB a truly cloud-native database.
Uncomment scheduler.kubeSchedulerImage in values.yaml and set it to the same version as your Kubernetes cluster.
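For example, if your EKS cluster runs Kubernetes 1.10, the relevant section of the tidb-operator chart's values.yaml would look roughly like this (the exact key layout, image name and tag below are assumptions; use whatever the chart's own comment suggests for your cluster version):
scheduler:
  # set this to a kube-scheduler image matching your Kubernetes cluster version
  kubeSchedulerImage: k8s.gcr.io/kube-scheduler:v1.10.3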
If you already have helm installed, you can continue here to deploy tidb-operator:
# within directory tidb-operator of repo https://github.com/pingcap/tidb-operator
kubectl create serviceaccount tiller --namespace kube-system
kubectl apply -f ./manifests/tiller-rbac.yaml
helm init --service-account tiller --upgrade
# verify:
kubectl get pods --namespace kube-system | grep tiller
# Deploy TiDB operator:
kubectl apply -f ./manifests/crd.yaml
kubectl apply -f ./manifests/gp2-storage.yaml
helm install ./charts/tidb-operator -n tidb-admin --namespace=tidb-admin
# verify:
kubectl get pods --namespace tidb-admin -o wide
See issue: pingcap#91.
Basically, uncomment all the max-open-files settings in the file charts/tidb-cluster/templates/tikv-configmap.yaml and set each of them to:
max-open-files = 1024
# Deploy your first TiDB cluster
helm install ./charts/tidb-cluster -n tidb --namespace=tidb --set pd.storageClassName=gp2,tikv.storageClassName=gp2
# Or if something goes wrong later and you want to update the deployment, use command:
# helm upgrade tidb ./charts/tidb-cluster --namespace=tidb --set pd.storageClassName=gp2,tikv.storageClassName=gp2
# verify and wait until tidb-initializer pod status becomes completed:
kubectl get pods --namespace tidb -o wide
Then keep watching output of:
watch "kubectl get svc -n tidb"
When you see demo-tidb appear, you can press Control + C to stop watching. Then the service is ready to connect to!
kubectl run -n tidb mysql-client --rm -i --tty --image mysql -- mysql -P 4000 -u root -h $(kubectl get svc demo-tidb -n tidb -o jsonpath='{.spec.clusterIP}') -p
Or you can forward ports and run a mysql client locally:
kubectl -n tidb port-forward demo-tidb-0 4000:4000 &>/tmp/port-forward.log &
mysql -h 127.0.0.1 -u root -P 4000 --default-character-set=utf8 -p
Try out some mysql commands:
select tidb_version();
It works! 🎉
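You can also run a quick smoke test; the database and table names below are arbitrary:
CREATE DATABASE IF NOT EXISTS demo;
USE demo;
CREATE TABLE t (id INT PRIMARY KEY, v VARCHAR(32));
INSERT INTO t VALUES (1, 'hello tidb');
SELECT * FROM t;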
If you did not specify a password in helm, set one now:
SET PASSWORD FOR 'root'@'%' = '<change-to-your-password>';
A TiDB cluster by default ships with a Grafana monitoring page, which is very useful for DevOps. You can view it now as well. Simply do a kubectl port-forward on the demo-monitor pod.
# Note: check your cluster and use the actual demo-monitor pod's name instead of the example below:
kubectl port-forward demo-monitor-5bc85fdb7f-qwl5h 3000 &
Then visit localhost:3000 and log in with the default username and password: admin, admin.
(Optional) Or use a LoadBalancer and visit the site with a public IP. However this method is not recommended for the demo here, as it creates extra resources in our AWS VPC, which the later terraform destroy command cannot clean up. If you know what you are doing, then continue; otherwise use the kubectl port-forward command mentioned above.
# set monitor.server.type = LoadBalancer in charts/tidb-cluster/values.yaml
# then run helm upgrade again to update the tidb release
# then check the LoadBalancer external IP via:
kubectl get svc -n tidb
A good way of testing a database is to use the TPC-H benchmark. You can easily generate some data and load it into your TiDB cluster via PingCAP's tidb-bench repo.
Please refer to the README of tidb-bench to see how to generate some test data and load it into TiDB, then test TiDB with the SQL queries in its queries folder. A rough outline is sketched below.
For example, with the 3-node EKS Kubernetes cluster created above and TPC-H data generated with scale factor 1 (local data files around 1GB), TiDB (v2.0) should be able to finish the first TPC-H query within 15 seconds.
With a single command we can easily scale out the TiDB cluster. To scale out TiKV:
helm upgrade tidb charts/tidb-cluster --set pd.storageClassName=gp2,tikv.storageClassName=gp2,tikv.replicas=5,tidb.replicas=3
Now the number of TiKV pods is increased from the default 3 to 5. You can check it with:
kubectl get po -n tidb
We have noticed that with a 5-node EKS Kubernetes cluster and 1GB of TPC-H data, the first TPC-H query can finish within 10 seconds.
At the end of the demo, please make sure all the resources created by Kubernetes are removed (LoadBalancers, Security Groups), so you don't get a big bill from AWS.
Simply run:
terraform destroy
(Run this command at the end to clean up; you don't have to do it now!)