Eli's Blog

1. Helm

Helm makes application management (Deployments, Services, and so on) configurable and dynamic: it renders Kubernetes resource manifests (deployment.yaml, service.yaml) from templates and then drives the API to deploy them, the way kubectl would.

Helm is a package manager; it encapsulates the whole workflow of deploying an environment.

Two key Helm concepts:

  • chart: the collection of information needed to create an application, including configuration templates for the various Kubernetes objects, parameter definitions, dependencies, and documentation. A chart is a self-contained unit of deployment, analogous to a package in yum (a sketch of a chart's layout follows this list)
  • release: a running instance of a chart. Installing a chart into Kubernetes creates a release; the same chart can be installed into the same cluster many times, and each install is a separate release
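For reference, a chart is just a directory of files. A minimal sketch of the layout that helm create produces (only the parts used later in this post are shown; mychart is a placeholder name):

mychart/
  Chart.yaml        # chart metadata: name, version, ...
  values.yaml       # default configuration values
  charts/           # bundled dependency charts
  templates/        # Go-templated Kubernetes manifests
    deployment.yaml
    service.yaml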

Helm v2 consists of two components: the Helm client and the Tiller server.

(figure: Helm client / Tiller architecture)

Helm client: creates and manages charts and releases, and talks to Tiller.

Tiller server: runs inside the Kubernetes cluster, handles requests from the Helm client, and interacts with the API Server.

1.1 Installing Helm

Release tarballs: https://github.com/helm/helm/releases

wget https://get.helm.sh/helm-v2.16.10-linux-amd64.tar.gz
tar zxvf helm-v2.16.10-linux-amd64.tar.gz
cp linux-amd64/helm /usr/local/bin

Install Tiller: the Kubernetes API Server has RBAC enabled, so Tiller needs a service account (tiller) with an appropriate role bound to it.

# tiller-rbac-config.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: tiller
  namespace: kube-system
# Create the RBAC objects
$ kubectl create -f tiller-rbac-config.yaml

# Deploy the Tiller server
$ helm init --service-account tiller --skip-refresh

# Tiller runs in the kube-system namespace
$ kubectl get pod -n kube-system | grep tiller
NAME READY STATUS RESTARTS AGE
tiller-deploy-6845b7d56c-2wk2x 1/1 Running 0 31s

$ helm version
Client: &version.Version{SemVer:"v2.16.10", GitCommit:"bceca24a91639f045f22ab0f41e47589a932cf5e", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.16.10", GitCommit:"bceca24a91639f045f22ab0f41e47589a932cf5e", GitTreeState:"clean"}

Deploying Helm v3.3:

$ wget https://get.helm.sh/helm-v3.3.1-linux-amd64.tar.gz
$ tar zxvf helm-v3.3.1-linux-amd64.tar.gz
$ cp linux-amd64/helm /usr/local/bin/helm

$ helm repo add stable https://kubernetes-charts.storage.googleapis.com/

$ helm repo update
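Note: the stable chart repository URL above has since been decommissioned. If the helm repo add command fails, the charts were migrated to charts.helm.sh; this is a later change, not part of the original setup:

$ helm repo add stable https://charts.helm.sh/stable
$ helm repo update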

Command summary (a short end-to-end example follows the list):

  • helm search hub xxx: search for charts on the Helm Hub
  • helm search repo repo_name: search for charts in the locally configured repos
  • helm install release_name chart_reference: install a chart (a chart can be referenced in five different ways)
  • helm list: list deployed releases
  • helm status release_name: show information about a release
  • helm upgrade release_name chart_reference: upgrade a release after modifying the chart
  • helm history release_name: show a release's revision history
  • helm rollback release_name revision: roll back a release to a previous revision
  • helm uninstall release_name: uninstall a release
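A quick walk-through of these commands, using a hypothetical release my-nginx installed from a hypothetical stable/nginx chart reference:

$ helm search repo nginx                        # find a chart in the local repos
$ helm install my-nginx stable/nginx            # hypothetical chart reference
$ helm status my-nginx
$ helm upgrade my-nginx stable/nginx -f my-values.yaml
$ helm history my-nginx
$ helm rollback my-nginx 1
$ helm uninstall my-nginx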

1.2 Custom Helm Templates

$ mkdir helm-demo && cd helm-demo

# Create the chart self-description file
$ cat > Chart.yaml <<EOF
name: hello
version: 1.0.0
EOF

# Create the template files used to generate the Kubernetes resource manifests
$ mkdir templates
$ cat > ./templates/deployment.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-world
        image: hub.elihe.io/library/nginx:v1
        ports:
        - containerPort: 80
          protocol: TCP
EOF

# Create the Service template
$ cat > ./templates/service.yaml <<EOF
apiVersion: v1
kind: Service
metadata:
  name: hello-world
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
  selector:
    app: hello-world
EOF

# Install (the commented-out line is the old Helm v2 syntax)
#$ helm install . --name hello
$ helm install hello .
NAME: hello
LAST DEPLOYED: Thu Oct 15 10:35:57 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None

# List releases
$ helm list
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
hello default 1 2020-10-15 10:35:57.015330177 +0800 CST deployed hello-1.0.1

Passing values dynamically:

# Configurable values
$ cat > values.yaml <<EOF
image:
  repository: hub.elihe.io/test/nginx
  tag: v2
EOF

# Parameterized template
$ cat > ./templates/deployment.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-world
        image: {{ .Values.image.repository }}:{{ .Values.image.tag }}
        ports:
        - containerPort: 80
          protocol: TCP
EOF

# Upgrade the release
$ helm upgrade hello -f values.yaml .

# Upgrade with a value overridden on the command line
$ helm upgrade --set image.tag='v3' hello .

# History
$ helm history hello
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
1 Thu Oct 15 10:35:57 2020 superseded hello-1.0.1 Install complete
2 Thu Oct 15 10:40:11 2020 superseded hello-1.0.1 Upgrade complete
3 Thu Oct 15 10:40:33 2020 deployed hello-1.0.1 Upgrade complete

# Roll back
$ helm rollback hello 2
Rollback was a success.

# Uninstall but keep the release history
$ helm uninstall --keep-history hello

# Restore
$ helm rollback hello 1

# Remove completely
$ helm uninstall hello

Debugging:

# Validate the configuration and render the manifests without installing
$ helm install hello . --dry-run --debug --set image.tag=v2
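In Helm v3, helm template renders the chart locally without contacting the cluster, and helm lint checks the chart for problems; both are useful alongside --dry-run (the release name hello is the one used above):

$ helm lint .
$ helm template hello . --set image.tag=v2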

2. Deploying the Dashboard

$ mkdir dashboard && cd dashboard

$ helm repo update

$ helm repo list
NAME URL
stable https://kubernetes-charts.storage.googleapis.com
local http://127.0.0.1:8879/charts

$ helm fetch stable/kubernetes-dashboard
$ tar zxvf kubernetes-dashboard-1.11.1.tgz
$ cd kubernetes-dashboard

# Chart values
$ cat > kubernetes-dashboard.yaml <<EOF
image:
  repository: k8s.gcr.io/kubernetes-dashboard-amd64
  tag: v1.8.3
ingress:
  enabled: true
  hosts:
  - k8s.frognew.com
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
  tls:
  - secretName: frognew-com-tls-secret
    hosts:
    - k8s.frognew.com
rbac:
  clusterAdminRole: true
EOF

# Install
$ helm install kubernetes-dashboard . \
    --namespace kube-system \
    -f kubernetes-dashboard.yaml

# The pod is running
$ kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
kubernetes-dashboard-7cfd66fc8b-8t79v 1/1 Running 0 37s

# Check the Service
$ kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard ClusterIP 10.98.142.181 <none> 443/TCP 66s

# Change the Service type to NodePort
$ kubectl edit svc kubernetes-dashboard -n kube-system
  type: NodePort

$ kubectl get svc -n kube-system
kubernetes-dashboard NodePort 10.101.30.189 <none> 443:31667/TCP 3m11s

# Get the login token
$ kubectl get secret -n kube-system | grep kubernetes-dashboard-token
kubernetes-dashboard-token-bbt69 kubernetes.io/service-account-token 3 3m12s

$ kubectl describe secret kubernetes-dashboard-token-bbt69 -n kube-system
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IlduNEdhTUJxOWtXbFhwdlhRSzhEMGFRemdJR0duQl9FNm9Rc2d0ekREQkEifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZC10b2tlbi1iYnQ2OSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjBlMTZlZjcyLWM5YjgtNDViMC05OTEzLThhNzY2NmY2ZDQzNyIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLXN5c3RlbTprdWJlcm5ldGVzLWRhc2hib2FyZCJ9.4FxXZN-Gc6mpd50sl7Wrm_ZjO5T53LrMa30MYMAHubIxOSgIh5HBvpdq5SxgQg2-XGTWZy8yZvxdmC53XOl5zqq-7RMKKjTv-Qa3O_KcHRPpnAOjj9aXvRbGdSlc5Y4D2nkysRKjWca8NjSrTXOzNHMFK0CHEIqVP-GFrKUMWmZRGYiwIoaBBKgTaS-KM3vF2Be94U2f1-ybFloOsAgEijqhUWrxpBgvXYfAmWjH4tdjCgo_1YEFPYUuUS9hq_VifdvWma9ZQthKbWplik9nuG2g-9o_xS0en5rnbxJQFfoAl5iypEi6zJiKgFoGwJsl5ScLFhpDaYN3QNhOnHhJrA

Deploying Dashboard 2.0:

# Uninstall v1.8.3
$ helm uninstall kubernetes-dashboard --namespace kube-system

# Install with kubectl instead
$ wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.4/aio/deploy/recommended.yaml

# Install
$ kubectl apply -f recommended.yaml

$ kubectl get pod -n kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
dashboard-metrics-scraper-6b4884c9d5-8j778 1/1 Running 0 38s
kubernetes-dashboard-7d8574ffd9-wff2g 1/1 Running 0 38s

$ kubectl get svc -n kubernetes-dashboard
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
dashboard-metrics-scraper ClusterIP 10.99.116.101 <none> 8000/TCP 115s
kubernetes-dashboard ClusterIP 10.111.190.197 <none> 443/TCP 116s

# Change the Service type to NodePort
$ kubectl edit svc kubernetes-dashboard -n kubernetes-dashboard
  type: NodePort

$ kubectl get svc -n kubernetes-dashboard
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
dashboard-metrics-scraper ClusterIP 10.99.116.101 <none> 8000/TCP 2m40s
kubernetes-dashboard NodePort 10.111.190.197 <none> 443:32202/TCP 2m41s

# Create an admin account
cat > dashboard-admin.yaml <<EOF
# Creating a Service Account
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard

---
# Creating a ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
EOF

# Grant cluster-admin
$ kubectl apply -f dashboard-admin.yaml

# Get the login token
$ kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}')
Name: admin-user-token-zjkxs
Namespace: kubernetes-dashboard
Labels: <none>
Annotations: kubernetes.io/service-account.name: admin-user
kubernetes.io/service-account.uid: af11f2f3-613e-4bc5-959b-4591e3ada6df

Type: kubernetes.io/service-account-token

Data
====
ca.crt: 1025 bytes
namespace: 20 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IlduNEdhTUJxOWtXbFhwdlhRSzhEMGFRemdJR0duQl9FNm9Rc2d0ekREQkEifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyLXRva2VuLXpqa3hzIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiJhZjExZjJmMy02MTNlLTRiYzUtOTU5Yi00NTkxZTNhZGE2ZGYiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZXJuZXRlcy1kYXNoYm9hcmQ6YWRtaW4tdXNlciJ9.NRMwYGUtsf0v8rL3aZQDmi1lTAFMp1m2xEvAO6zavtFFo6HJzbpF_ReSssgWeK5LLk6sbOXVUx19O0wnASSPKg7JXiXBBGyb_qHkMdD5p2yc5ggGJu_MjE_0kXS-0OvSMS20Dtv1BiZiWB-eNEy3xxTorivG2Zah8-ART5J1HtqHauxxyQr21pHfQ9XlmOlby3MQVelIbQ1e7-EZemOSggcQI0rlpWlU_OPiakksoJGEcwr0xK7kypLnxG4AjM9x9fgjIBft30c4tfwMDXzYiB5ZwwDP2cHRiYN6fnE9XdJmrGBVAL4SgTabXFz2DOfOFpsbWkcDNdOBBWsZHzvUww

# Uninstall
$ kubectl delete -f dashboard-admin.yaml
$ kubectl delete -f recommended.yaml

3. Prometheus

3.1 Components

  • MetricsServer: aggregates resource usage across the cluster and serves it to in-cluster consumers such as kubectl, the HPA controller, and the scheduler (this is what makes kubectl top node and similar commands work)
  • PrometheusOperator: deploys and manages the Prometheus-based monitoring and alerting stack on Kubernetes
  • NodeExporter: exposes key metrics for each node
  • KubeStateMetrics: collects the state of the Kubernetes objects in the cluster, used as input for alerting rules
  • Prometheus: pulls metrics over HTTP from the apiserver, scheduler, controller-manager, kubelet, and other components
  • Grafana: platform for visualizing and dashboarding the collected metrics

3.2 Installation Steps

$ mkdir prometheus && cd prometheus

$ git clone https://github.com/coreos/kube-prometheus.git
$ cd kube-prometheus/manifests

# The cluster runs k8s v1.18.6, so switch to the release-0.6 branch
$ git checkout release-0.6

Edit grafana-service.yaml and expose it as a NodePort:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  type: NodePort # add
  ports:
  - name: http
    port: 3000
    targetPort: http
    nodePort: 30100 # add
  selector:
    app: grafana

Edit prometheus-service.yaml and expose it as a NodePort:

apiVersion: v1
kind: Service
metadata:
  labels:
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort # add
  ports:
  - name: web
    port: 9090
    targetPort: web
    nodePort: 30200 # add
  selector:
    app: prometheus
    prometheus: k8s
  sessionAffinity: ClientIP

Edit alertmanager-service.yaml and expose it as a NodePort:

apiVersion: v1
kind: Service
metadata:
  labels:
    alertmanager: main
  name: alertmanager-main
  namespace: monitoring
spec:
  type: NodePort # add
  ports:
  - name: web
    port: 9093
    targetPort: web
    nodePort: 30300 # add
  selector:
    alertmanager: main
    app: alertmanager
  sessionAffinity: ClientIP

Pull the required images:

$ find . -type f | xargs grep 'image:' | awk '{print $3}' | sed '/^[ ]*$/d' | sort | uniq
directxman12/k8s-prometheus-adapter:v0.7.0
grafana/grafana:7.1.0
quay.io/coreos/kube-rbac-proxy:v0.4.1
quay.io/coreos/kube-state-metrics:v1.9.5
quay.io/coreos/prometheus-operator:v0.40.0
quay.io/prometheus/alertmanager:v0.21.0
quay.io/prometheus/node-exporter:v0.18.1
quay.io/prometheus/prometheus:v2.20.0

# Pre-pull the images manually
docker pull quay.io/coreos/kube-rbac-proxy:v0.4.1
docker pull quay.io/coreos/kube-state-metrics:v1.9.5
docker pull quay.io/coreos/prometheus-operator:v0.40.0
docker pull quay.io/prometheus/alertmanager:v0.21.0
docker pull quay.io/prometheus/node-exporter:v0.18.1
docker pull quay.io/prometheus/prometheus:v2.20.0

Run the installation:

# Create the namespace and CRDs, and then wait for them to be available before creating the remaining resources
$ kubectl create -f manifests/setup
$ until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
$ kubectl create -f manifests/

# teardown the stack
$ kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup

Post-install checks:

$ kubectl get pod -n monitoring
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 2/2 Running 0 6m17s
alertmanager-main-1 2/2 Running 0 6m16s
alertmanager-main-2 2/2 Running 0 6m16s
grafana-67dfc5f687-vqfbh 1/1 Running 0 6m7s
kube-state-metrics-69d4c7c69d-2lmfl 3/3 Running 0 6m6s
node-exporter-j9nzx 2/2 Running 0 6m4s
node-exporter-lwmkw 2/2 Running 0 6m3s
node-exporter-p5sl8 2/2 Running 0 6m3s
prometheus-adapter-66b855f564-qvs8x 1/1 Running 0 5m53s
prometheus-k8s-0 3/3 Running 1 5m46s
prometheus-k8s-1 3/3 Running 1 5m46s
prometheus-operator-75c98bcfd7-smmwd 2/2 Running 0 8m22s

$ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-master 321m 16% 1329Mi 70%
k8s-node01 190m 9% 1062Mi 56%
k8s-node02 961m 48% 1011Mi 53%

$ kubectl top pod -n monitoring
NAME CPU(cores) MEMORY(bytes)
alertmanager-main-0 7m 22Mi
alertmanager-main-1 11m 23Mi
alertmanager-main-2 9m 24Mi
grafana-67dfc5f687-vqfbh 25m 25Mi
kube-state-metrics-69d4c7c69d-2lmfl 2m 33Mi
node-exporter-j9nzx 58m 19Mi
node-exporter-lwmkw 5m 18Mi
node-exporter-p5sl8 5m 13Mi
prometheus-adapter-66b855f564-qvs8x 4m 18Mi
prometheus-k8s-0 31m 235Mi
prometheus-k8s-1 26m 195Mi
prometheus-operator-75c98bcfd7-smmwd 1m 34Mi

$ kubectl get svc -n monitoring
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-main NodePort 10.105.101.126 <none> 9093:30300/TCP 9m37s
alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 9m37s
grafana NodePort 10.100.132.19 <none> 3000:30100/TCP 9m26s
kube-state-metrics ClusterIP None <none> 8443/TCP,9443/TCP 9m25s
node-exporter ClusterIP None <none> 9100/TCP 9m25s
prometheus-adapter ClusterIP 10.101.16.41 <none> 443/TCP 9m12s
prometheus-k8s NodePort 10.101.33.228 <none> 9090:30200/TCP 9m10s
prometheus-operated ClusterIP None <none> 9090/TCP 9m4s
prometheus-operator ClusterIP None <none> 8443/TCP 11m

Access Prometheus: http://192.168.31.40:30200

sum by (pod_name)(rate(container_cpu_usage_seconds_total{image!=""}[1m] ))
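A similar query for per-pod memory usage, assuming the same cAdvisor label names as the CPU query above:

sum by (pod_name)(container_memory_working_set_bytes{image!=""})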

Access Grafana: http://192.168.31.40:30100 (default credentials below)

admin/admin

(figure: Grafana dashboard)

4. Horizontal Pod Autoscaling

The Horizontal Pod Autoscaler (HPA) automatically scales the number of Pods in a ReplicationController, Deployment, or ReplicaSet based on CPU utilization.

cat > hpa.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  replicas: 1
  selector:
    matchLabels:
      app: apache
  template: # Pod
    metadata:
      labels:
        app: apache
    spec:
      containers:
      - name: php-apache
        image: gcr.io/google_containers/hpa-example
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 0.1
            memory: 32Mi
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
spec:
  type: ClusterIP
  selector:
    app: apache
  ports:
  - name: http
    port: 80
    targetPort: 80
EOF

# Create the resources
$ kubectl apply -f hpa.yaml

$ kubectl get pod
NAME READY STATUS RESTARTS AGE
php-apache-86d4bcdcd9-wlvs5 1/1 Running 0 29m

# Create the HPA
$ kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10

# Check that metrics are being collected
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
php-apache Deployment/php-apache 0%/50% 1 10 1 5m

# Generate load and watch the replica count (in a new terminal)
$ kubectl run -i --tty load-generator --image=busybox /bin/sh
$ while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

# Watch the HPA
kubectl get hpa -w
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
php-apache Deployment/php-apache <unknown>/50% 1 10 0 7s
php-apache Deployment/php-apache <unknown>/50% 1 10 1 15s
php-apache Deployment/php-apache 0%/50% 1 10 1 4m3s
php-apache Deployment/php-apache 0%/50% 1 10 1 5m19s
php-apache Deployment/php-apache 1%/50% 1 10 1 19m
php-apache Deployment/php-apache 0%/50% 1 10 1 20m
php-apache Deployment/php-apache 0%/50% 1 10 1 25m
php-apache Deployment/php-apache 378%/50% 1 10 1 28m
php-apache Deployment/php-apache 378%/50% 1 10 4 28m
php-apache Deployment/php-apache 467%/50% 1 10 8 28m

# Watch new Pods being created
$ kubectl get pod -w
NAME READY STATUS RESTARTS AGE
load-generator 1/1 Running 0 45m
php-apache-86d4bcdcd9-wlvs5 1/1 Running 0 29m
php-apache-86d4bcdcd9-7cjmm 0/1 Pending 0 0s
php-apache-86d4bcdcd9-7cjmm 0/1 Pending 0 0s
php-apache-86d4bcdcd9-dr2rg 0/1 Pending 0 0s
php-apache-86d4bcdcd9-9srl5 0/1 Pending 0 0s
php-apache-86d4bcdcd9-dr2rg 0/1 Pending 0 0s
php-apache-86d4bcdcd9-9srl5 0/1 Pending 0 0s
php-apache-86d4bcdcd9-dr2rg 0/1 ContainerCreating 0 0s
php-apache-86d4bcdcd9-9srl5 0/1 ContainerCreating 0 1s
php-apache-86d4bcdcd9-7cjmm 0/1 ContainerCreating 0 1s
php-apache-86d4bcdcd9-hzf8h 0/1 Pending 0 0s
php-apache-86d4bcdcd9-m4tp6 0/1 Pending 0 0s
php-apache-86d4bcdcd9-hzf8h 0/1 Pending 0 0s
php-apache-86d4bcdcd9-5bfp8 0/1 Pending 0 0s
php-apache-86d4bcdcd9-m4tp6 0/1 Pending 0 0s
php-apache-86d4bcdcd9-5bfp8 0/1 Pending 0 0s
php-apache-86d4bcdcd9-8scwl 0/1 Pending 0 0s
php-apache-86d4bcdcd9-8scwl 0/1 Pending 0 0s
php-apache-86d4bcdcd9-hzf8h 0/1 ContainerCreating 0 0s
php-apache-86d4bcdcd9-m4tp6 0/1 ContainerCreating 0 0s
php-apache-86d4bcdcd9-5bfp8 0/1 ContainerCreating 0 0s
php-apache-86d4bcdcd9-8scwl 0/1 ContainerCreating 0 0s
php-apache-86d4bcdcd9-rsg9f 0/1 Pending 0 0s
php-apache-86d4bcdcd9-z6qkt 0/1 Pending 0 0s
php-apache-86d4bcdcd9-rsg9f 0/1 Pending 0 0s
php-apache-86d4bcdcd9-z6qkt 0/1 Pending 0 0s
php-apache-86d4bcdcd9-rsg9f 0/1 ContainerCreating 0 1s
php-apache-86d4bcdcd9-z6qkt 0/1 ContainerCreating 0 3s
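The autoscaler created by kubectl autoscale above can also be written declaratively; a minimal sketch of the equivalent autoscaling/v1 manifest:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50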

5. Resource Limits

5.1 Pod

spec:
  containers:
  - name: php-apache
    image: gcr.io/google_containers/hpa-example
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: 0.1
        memory: 32Mi
      limits:
        cpu: 200m
        memory: 100Mi

5.2 Namespace

  1. Compute resource quota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resource
  namespace: spark-cluster
spec:
  hard:
    pods: 20
    requests.cpu: 20
    requests.memory: 100Gi
    limits.cpu: 40
    limits.memory: 200Gi
  2. Object count quota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-counts
  namespace: spark-cluster
spec:
  hard:
    configmaps: 10
    persistentvolumeclaims: 4
    replicationcontrollers: 20
    secrets: 10
    services: 10
    services.loadbalancers: 2
  3. LimitRange for CPU and memory (apply/verify commands follow the manifest)
apiVersion: v1
kind: LimitRange
metadata:
  name: mem-limit-range
spec:
  limits:
  - default:
      memory: 50Gi
      cpu: 5
    defaultRequest:
      memory: 1Gi
      cpu: 1
    type: Container
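To apply and verify these constraints (the file names here are hypothetical; the LimitRange is namespaced, so pass -n explicitly):

$ kubectl apply -f compute-quota.yaml -f object-quota.yaml
$ kubectl apply -f limit-range.yaml -n spark-cluster
$ kubectl describe resourcequota -n spark-cluster
$ kubectl describe limitrange mem-limit-range -n spark-cluster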

6. EFK Logging

EFK: Elasticsearch + Fluentd + Kibana

ELFK: Elasticsearch + Logstash + Filebeat + Kibana

Installation reference: https://www.digitalocean.com/community/tutorials/how-to-set-up-an-elasticsearch-fluentd-and-kibana-efk-logging-stack-on-kubernetes

6.1 Create the Namespace

$ mkdir efk && cd efk

$ cat > kube-logging.yaml <<EOF
kind: Namespace
apiVersion: v1
metadata:
  name: kube-logging
EOF

$ kubectl create -f kube-logging.yaml

$ kubectl get ns | grep kube-logging
kube-logging Active 6s

6.2 Elasticsearch

6.2.1 Create the Headless Service

$ cat > elasticsearch_svc.yaml <<EOF
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch
  namespace: kube-logging
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
  - port: 9200
    name: rest
  - port: 9300
    name: inter-node
EOF

$ kubectl create -f elasticsearch_svc.yaml

$ kubectl get services --namespace=kube-logging
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
elasticsearch ClusterIP None <none> 9200/TCP,9300/TCP 13s

6.2.2 Create the PV

$ cat > elasticsearch_pv.yaml <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfspv1
  namespace: kube-logging
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
  nfs:
    path: /nfs
    server: 192.168.31.200
EOF

$ kubectl create -f elasticsearch_pv.yaml

$ kubectl get pv -n kube-logging
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
nfspv1 1Gi RWO Retain Available nfs 18s

6.2.3 Install Elasticsearch

$ cat > elasticsearch_statefulset.yaml <<EOF
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster
  namespace: kube-logging
spec:
  serviceName: elasticsearch
  #replicas: 3
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
        resources:
          limits:
            #cpu: 1000m
            cpu: 400m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: rest
          protocol: TCP
        - containerPort: 9300
          name: inter-node
          protocol: TCP
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        env:
        - name: cluster.name
          value: k8s-logs
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: discovery.type # test-bed
          value: single-node
        #- name: discovery.seed_hosts
        #  value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
        #- name: cluster.initial_master_nodes
        #  value: "es-cluster-0,es-cluster-1,es-cluster-2"
        - name: ES_JAVA_OPTS
          #value: "-Xms512m -Xmx512m"
          value: "-Xms256m -Xmx256m"
      initContainers:
      - name: fix-permissions
        image: busybox
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      - name: increase-vm-max-map
        image: busybox
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      - name: increase-fd-ulimit
        image: busybox
        command: ["sh", "-c", "ulimit -n 65536"]
        securityContext:
          privileged: true
  volumeClaimTemplates:
  - metadata:
      name: data
      labels:
        app: elasticsearch
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: nfs
      resources:
        requests:
          storage: 1Gi
EOF

$ kubectl create -f elasticsearch_statefulset.yaml

# Watch the rollout
$ kubectl rollout status sts/es-cluster --namespace=kube-logging

$ kubectl get pod -n kube-logging
NAME READY STATUS RESTARTS AGE
es-cluster-0 1/1 Running 0 59s

# Follow the logs
$ kubectl logs -f es-cluster-0 -n kube-logging

# Port-forward locally and test the service
$ kubectl port-forward es-cluster-0 9200:9200 --namespace=kube-logging

$ curl http://localhost:9200/_cluster/state?pretty

6.3 Kibana

$ cat > kibana.yaml <<EOF
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: kube-logging
  labels:
    app: kibana
spec:
  type: NodePort
  ports:
  - port: 5601
  selector:
    app: kibana
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: kube-logging
  labels:
    app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana:7.2.0
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
        - name: ELASTICSEARCH_URL
          value: http://elasticsearch:9200
        ports:
        - containerPort: 5601
EOF

$ kubectl create -f kibana.yaml

$ kubectl rollout status deployment/kibana --namespace=kube-logging

$ kubectl get pod -n kube-logging
NAME READY STATUS RESTARTS AGE
es-cluster-0 1/1 Running 0 13m
kibana-5749b5778b-zvtwn 1/1 Running 0 4m33s

$ kubectl get svc -n kube-logging
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
elasticsearch ClusterIP None <none> 9200/TCP,9300/TCP 89m
kibana NodePort 10.106.103.244 <none> 5601:30750/TCP 8s

$ curl http://192.168.1.40:30750

6.4 Fluentd

$ cat > fluentd.yaml <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  labels:
    app: fluentd
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-logging
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1
        env:
        - name: FLUENT_ELASTICSEARCH_HOST
          value: "elasticsearch.kube-logging.svc.cluster.local"
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "9200"
        - name: FLUENT_ELASTICSEARCH_SCHEME
          value: "http"
        - name: FLUENTD_SYSTEMD_CONF
          value: disable
        resources:
          limits:
            #memory: 512Mi
            memory: 256Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
EOF

$ kubectl create -f fluentd.yaml

$ kubectl get ds -n kube-logging
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
fluentd 2 2 2 2 2 <none> 27s
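To confirm that log records are reaching Elasticsearch before opening Kibana, you can port-forward again and list the indices (this fluentd image writes logstash-* indices by default):

$ kubectl port-forward es-cluster-0 9200:9200 --namespace=kube-logging
$ curl http://localhost:9200/_cat/indices?v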

6.5 Kibana UI

(figures: Kibana screenshots)

6.6 Testing

Create a test Pod:

# Quote EOF so the local shell does not expand $i and $(date) while writing the file
$ cat > counter.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    args: [/bin/sh, -c,
      'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done']
EOF

$ kubectl create -f counter.yaml
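To verify that the pod is emitting log lines (which should then show up in Kibana when searching for counter):

$ kubectl logs counter --tail=5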

7. Appendix: Port Fields

ports in a Pod template:

  • containerPort: the port the container exposes

ports in a Service (see the example below):

  • port: the port the Service listens on, bound to the ClusterIP
  • targetPort: the Pod port that traffic is forwarded to; it should match a containerPort
  • nodePort: when the Service type is NodePort, the port bound on each node's IP; if omitted, a random port is assigned
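A minimal sketch tying the three Service port fields together (the name and port numbers are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: NodePort
  selector:
    app: web
  ports:
  - port: 8080       # ClusterIP:8080 inside the cluster
    targetPort: 80   # forwarded to the Pod's containerPort 80
    nodePort: 30080  # NodeIP:30080 from outside the cluster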