1. Helm

Helm makes application management (Deployments, Services, and so on) configurable and dynamic: it renders Kubernetes resource manifests (deployment.yaml, service.yaml) from templates and then calls kubectl to apply them automatically.

Helm is a package manager; it encapsulates the workflow of deploying an environment.

Two key Helm concepts:

chart: the collection of information needed to create an application, including templates of Kubernetes objects, parameter definitions, dependencies, documentation, and so on. A chart is the self-contained logical unit of an application deployment — the equivalent of a package in yum.

release: a running instance of a chart. When a chart is installed into Kubernetes, a release is created. The same chart can be installed into a cluster multiple times, and each installation produces a new release.

Helm (v2) consists of two components: the Helm client and the Tiller server.

Helm client: creates and manages charts and releases, and talks to Tiller.

Tiller server: runs inside the Kubernetes cluster, handles requests from the Helm client, and interacts with the API server.
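For orientation, a chart is simply a directory laid out in a conventional way; a minimal sketch (the `mychart` name and the exact file set are illustrative):

```
mychart/
├── Chart.yaml        # chart name, version, description
├── values.yaml       # default configuration values
├── charts/           # dependent sub-charts (optional)
└── templates/        # Go-templated Kubernetes manifests
    ├── deployment.yaml
    └── service.yaml
```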
1.1 Helm Deployment

Download the release tarball from: https://github.com/helm/helm/releases
```bash
wget https://get.helm.sh/helm-v2.16.10-linux-amd64.tar.gz
tar zxvf helm-v2.16.10-linux-amd64.tar.gz
cp linux-amd64/helm /usr/local/bin
```
Install Tiller: since RBAC is enabled on the Kubernetes API server, Tiller needs a dedicated service account (tiller) with an appropriate role bound to it.
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system
```

```bash
$ kubectl create -f tiller-rbac-config.yaml
$ helm init --service-account tiller --skip-refresh

$ kubectl get pod -n kube-system | grep tiller
NAME                             READY   STATUS    RESTARTS   AGE
tiller-deploy-6845b7d56c-2wk2x   1/1     Running   0          31s

$ helm version
Client: &version.Version{SemVer:"v2.16.10", GitCommit:"bceca24a91639f045f22ab0f41e47589a932cf5e", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.16.10", GitCommit:"bceca24a91639f045f22ab0f41e47589a932cf5e", GitTreeState:"clean"}
```
Deploy Helm v3.3:
```bash
$ wget https://get.helm.sh/helm-v3.3.1-linux-amd64.tar.gz
$ tar zxvf helm-v3.3.1-linux-amd64.tar.gz
$ cp linux-amd64/helm /usr/local/bin/helm

$ helm repo add stable https://kubernetes-charts.storage.googleapis.com/
$ helm repo update
```
Command summary (a typical lifecycle example follows the table):

| Command | Description |
| --- | --- |
| helm search hub xxx | Search for charts on the Helm Hub |
| helm search repo repo_name | Search for charts in locally configured repos |
| helm install release_name chart_reference | Install a chart; the chart can be referenced in five ways (repo/name, a local chart directory, a packaged .tgz archive, an absolute URL, or a chart name together with --repo) |
| helm list | List deployed releases |
| helm status release_name | Show information about a release |
| helm upgrade release_name chart_reference | Upgrade a release after changing the chart |
| helm history release_name | Show the revision history of a release |
| helm rollback release_name revision | Roll back to a previous revision |
| helm uninstall release_name | Uninstall a release |
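Taken together, a typical release lifecycle with these commands looks roughly like the sketch below (`myrepo`, `myapp`, `web`, and the repo URL are placeholders):

```bash
helm repo add myrepo https://example.com/charts    # register a repo (placeholder URL)
helm search repo myapp                             # find the chart
helm install web myrepo/myapp                      # create a release named "web"
helm upgrade web myrepo/myapp --set image.tag=v2   # roll out a change
helm history web                                   # inspect revisions
helm rollback web 1                                # revert to revision 1
helm uninstall web                                 # remove the release
```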
1.2 Helm Custom Templates

```bash
$ mkdir helm-demo && cd helm-demo

$ cat > Chart.yaml <<EOF
name: hello
version: 1.0.0
EOF

$ mkdir templates

$ cat > ./templates/deployment.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-world
        image: hub.elihe.io/library/nginx:v1
        ports:
        - containerPort: 80
          protocol: TCP
EOF

$ cat > ./templates/service.yaml <<EOF
apiVersion: v1
kind: Service
metadata:
  name: hello-world
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
  selector:
    app: hello-world
EOF

$ helm install hello .
NAME: hello
LAST DEPLOYED: Thu Oct 15 10:35:57 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None

$ helm list
NAME    NAMESPACE   REVISION   UPDATED                                    STATUS     CHART         APP VERSION
hello   default     1          2020-10-15 10:35:57.015330177 +0800 CST   deployed   hello-1.0.1
```
Using dynamic configuration values:
```bash
$ cat > values.yaml <<EOF
image:
  repository: hub.elihe.io/test/nginx
  tag: v2
EOF

$ cat > ./templates/deployment.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-world
        image: {{ .Values.image.repository }}:{{ .Values.image.tag }}
        ports:
        - containerPort: 80
          protocol: TCP
EOF

$ helm upgrade hello -f values.yaml .
$ helm upgrade --set image.tag='v3' hello .

$ helm history hello
REVISION   UPDATED                    STATUS       CHART         APP VERSION   DESCRIPTION
1          Thu Oct 15 10:35:57 2020   superseded   hello-1.0.1                 Install complete
2          Thu Oct 15 10:40:11 2020   superseded   hello-1.0.1                 Upgrade complete
3          Thu Oct 15 10:40:33 2020   deployed     hello-1.0.1                 Upgrade complete

$ helm rollback hello 2
Rollback was a success.

$ helm uninstall --keep-history hello
$ helm rollback hello 1

$ helm uninstall hello
```
Debugging:
```bash
$ helm install . --dry-run --debug --set image.tag=v2
```
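Helm 3 also provides `helm template`, which renders the chart entirely on the client without contacting the cluster; either form is handy for checking what a `--set` override actually produces. A sketch using the hello chart above:

```bash
# render the manifests locally only; nothing is installed
$ helm template hello . --set image.tag=v2

# server-side dry run of a real install, with verbose debug output
$ helm install hello . --dry-run --debug --set image.tag=v2
```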
2. Deploying the Dashboard

```bash
$ mkdir dashboard && cd dashboard

$ helm repo update
$ helm repo list
NAME     URL
stable   https://kubernetes-charts.storage.googleapis.com
local    http://127.0.0.1:8879/charts

$ helm fetch stable/kubernetes-dashboard
$ tar zxvf kubernetes-dashboard-1.11.1.tgz
$ cd kubernetes-dashboard

$ cat > kubernetes-dashboard.yaml <<EOF
image:
  repository: k8s.gcr.io/kubernetes-dashboard-amd64
  tag: v1.8.3
ingress:
  enabled: true
  hosts:
    - k8s.frognew.com
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
  tls:
    - secretName: frognew-com-tls-secret
      hosts:
      - k8s.frognew.com
rbac:
  clusterAdminRole: true
EOF

$ helm install kubernetes-dashboard . \
    --namespace kube-system \
    -f kubernetes-dashboard.yaml

$ kubectl get pod -n kube-system
NAME                                    READY   STATUS    RESTARTS   AGE
kubernetes-dashboard-7cfd66fc8b-8t79v   1/1     Running   0          37s

$ kubectl get svc -n kube-system
NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
kubernetes-dashboard   ClusterIP   10.98.142.181   <none>        443/TCP   66s

# change the service type to NodePort in the editor
$ kubectl edit svc kubernetes-dashboard -n kube-system
  type: NodePort

$ kubectl get svc -n kube-system
kubernetes-dashboard   NodePort   10.101.30.189   <none>   443:31667/TCP   3m11s

$ kubectl get secret -n kube-system | grep kubernetes-dashboard-token
kubernetes-dashboard-token-bbt69   kubernetes.io/service-account-token   3   3m12s

$ kubectl describe secret kubernetes-dashboard-token-bbt69 -n kube-system
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IlduNEdhTUJxOWtXbFhwdlhRSzhEMGFRemdJR0duQl9FNm9Rc2d0ekREQkEifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZC10b2tlbi1iYnQ2OSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjBlMTZlZjcyLWM5YjgtNDViMC05OTEzLThhNzY2NmY2ZDQzNyIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLXN5c3RlbTprdWJlcm5ldGVzLWRhc2hib2FyZCJ9.4FxXZN-Gc6mpd50sl7Wrm_ZjO5T53LrMa30MYMAHubIxOSgIh5HBvpdq5SxgQg2-XGTWZy8yZvxdmC53XOl5zqq-7RMKKjTv-Qa3O_KcHRPpnAOjj9aXvRbGdSlc5Y4D2nkysRKjWca8NjSrTXOzNHMFK0CHEIqVP-GFrKUMWmZRGYiwIoaBBKgTaS-KM3vF2Be94U2f1-ybFloOsAgEijqhUWrxpBgvXYfAmWjH4tdjCgo_1YEFPYUuUS9hq_VifdvWma9ZQthKbWplik9nuG2g-9o_xS0en5rnbxJQFfoAl5iypEi6zJiKgFoGwJsl5ScLFhpDaYN3QNhOnHhJrA
```
Deploy Dashboard 2.0:
```bash
$ helm uninstall kubernetes-dashboard --namespace kube-system

$ wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.4/aio/deploy/recommended.yaml
$ kubectl apply -f recommended.yaml

$ kubectl get pod -n kubernetes-dashboard
NAME                                         READY   STATUS    RESTARTS   AGE
dashboard-metrics-scraper-6b4884c9d5-8j778   1/1     Running   0          38s
kubernetes-dashboard-7d8574ffd9-wff2g        1/1     Running   0          38s

$ kubectl get svc -n kubernetes-dashboard
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
dashboard-metrics-scraper   ClusterIP   10.99.116.101    <none>        8000/TCP   115s
kubernetes-dashboard        ClusterIP   10.111.190.197   <none>        443/TCP    116s

# change the service type to NodePort in the editor
$ kubectl edit svc kubernetes-dashboard -n kubernetes-dashboard
  type: NodePort

$ kubectl get svc -n kubernetes-dashboard
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
dashboard-metrics-scraper   ClusterIP   10.99.116.101    <none>        8000/TCP        2m40s
kubernetes-dashboard        NodePort    10.111.190.197   <none>        443:32202/TCP   2m41s

$ cat > dashboard-admin.yaml <<EOF
# Creating a Service Account
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
# Creating a ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
EOF

$ kubectl apply -f dashboard-admin.yaml

$ kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}')
Name:         admin-user-token-zjkxs
Namespace:    kubernetes-dashboard
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: admin-user
              kubernetes.io/service-account.uid: af11f2f3-613e-4bc5-959b-4591e3ada6df

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1025 bytes
namespace:  20 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IlduNEdhTUJxOWtXbFhwdlhRSzhEMGFRemdJR0duQl9FNm9Rc2d0ekREQkEifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyLXRva2VuLXpqa3hzIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiJhZjExZjJmMy02MTNlLTRiYzUtOTU5Yi00NTkxZTNhZGE2ZGYiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZXJuZXRlcy1kYXNoYm9hcmQ6YWRtaW4tdXNlciJ9.NRMwYGUtsf0v8rL3aZQDmi1lTAFMp1m2xEvAO6zavtFFo6HJzbpF_ReSssgWeK5LLk6sbOXVUx19O0wnASSPKg7JXiXBBGyb_qHkMdD5p2yc5ggGJu_MjE_0kXS-0OvSMS20Dtv1BiZiWB-eNEy3xxTorivG2Zah8-ART5J1HtqHauxxyQr21pHfQ9XlmOlby3MQVelIbQ1e7-EZemOSggcQI0rlpWlU_OPiakksoJGEcwr0xK7kypLnxG4AjM9x9fgjIBft30c4tfwMDXzYiB5ZwwDP2cHRiYN6fnE9XdJmrGBVAL4SgTabXFz2DOfOFpsbWkcDNdOBBWsZHzvUww

# cleanup
$ kubectl delete -f dashboard-admin.yaml
$ kubectl delete -f recommended.yaml
```
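Instead of switching the service type in an interactive editor, the same change can be applied non-interactively with `kubectl patch` (a sketch; adjust the namespace to match the Dashboard version in use):

```bash
# switch the Dashboard service to NodePort without opening an editor
$ kubectl patch svc kubernetes-dashboard -n kubernetes-dashboard \
    -p '{"spec": {"type": "NodePort"}}'
```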
3. Prometheus

3.1 Components

MetricsServer: an aggregator of cluster resource usage data for in-cluster consumers such as kubectl, the HPA, and the scheduler (it enables `kubectl top node` and similar commands).

PrometheusOperator: a system monitoring and alerting toolkit used to manage and store monitoring data (see the ServiceMonitor sketch after this list).

NodeExporter: exposes key metrics about each node's state.

KubeStateMetrics: collects data about resource objects inside the Kubernetes cluster, which can be used to define alerting rules.

Prometheus: pulls metrics from the apiserver, scheduler, controller-manager, and kubelet over HTTP.

Grafana: a platform for visualizing monitoring data and statistics.
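With the Prometheus Operator, scrape targets are described declaratively through ServiceMonitor objects rather than a hand-written prometheus.yml. The sketch below is illustrative only (the `example-app` name and `web` port are placeholders), assuming a Service labelled `app: example-app` exposes metrics on a port named `web`:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app          # placeholder name
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: example-app       # match the Service to scrape
  endpoints:
  - port: web                # named port on that Service
    interval: 30s            # scrape interval
```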
3.2 Build Notes

```bash
$ mkdir prometheus && cd prometheus

$ git clone https://github.com/coreos/kube-prometheus.git
$ cd kube-prometheus/manifests
$ git checkout release-0.6
```
Edit grafana-service.yaml to expose Grafana via NodePort:
```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  type: NodePort
  ports:
  - name: http
    port: 3000
    targetPort: http
    nodePort: 30100
  selector:
    app: grafana
```
Edit prometheus-service.yaml to expose Prometheus via NodePort:
```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort
  ports:
  - name: web
    port: 9090
    targetPort: web
    nodePort: 30200
  selector:
    app: prometheus
    prometheus: k8s
  sessionAffinity: ClientIP
```
Edit alertmanager-service.yaml to expose Alertmanager via NodePort:
```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    alertmanager: main
  name: alertmanager-main
  namespace: monitoring
spec:
  type: NodePort
  ports:
  - name: web
    port: 9093
    targetPort: web
    nodePort: 30300
  selector:
    alertmanager: main
    app: alertmanager
  sessionAffinity: ClientIP
```
Pull the required images:
```bash
$ find . -type f | xargs grep 'image:' | awk '{print $3}' | sed '/^[ ]*$/d' | sort | uniq
directxman12/k8s-prometheus-adapter:v0.7.0
grafana/grafana:7.1.0
quay.io/coreos/kube-rbac-proxy:v0.4.1
quay.io/coreos/kube-state-metrics:v1.9.5
quay.io/coreos/prometheus-operator:v0.40.0
quay.io/prometheus/alertmanager:v0.21.0
quay.io/prometheus/node-exporter:v0.18.1
quay.io/prometheus/prometheus:v2.20.0

docker pull quay.io/coreos/kube-rbac-proxy:v0.4.1
docker pull quay.io/coreos/kube-state-metrics:v1.9.5
docker pull quay.io/coreos/prometheus-operator:v0.40.0
docker pull quay.io/prometheus/alertmanager:v0.21.0
docker pull quay.io/prometheus/node-exporter:v0.18.1
docker pull quay.io/prometheus/prometheus:v2.20.0
```
Run the installation:

```bash
$ kubectl create -f manifests/setup
$ until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
$ kubectl create -f manifests/

# to tear everything down again:
$ kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup
```
Post-installation checks:
```bash
$ kubectl get pod -n monitoring
NAME                                   READY   STATUS    RESTARTS   AGE
alertmanager-main-0                    2/2     Running   0          6m17s
alertmanager-main-1                    2/2     Running   0          6m16s
alertmanager-main-2                    2/2     Running   0          6m16s
grafana-67dfc5f687-vqfbh               1/1     Running   0          6m7s
kube-state-metrics-69d4c7c69d-2lmfl    3/3     Running   0          6m6s
node-exporter-j9nzx                    2/2     Running   0          6m4s
node-exporter-lwmkw                    2/2     Running   0          6m3s
node-exporter-p5sl8                    2/2     Running   0          6m3s
prometheus-adapter-66b855f564-qvs8x    1/1     Running   0          5m53s
prometheus-k8s-0                       3/3     Running   1          5m46s
prometheus-k8s-1                       3/3     Running   1          5m46s
prometheus-operator-75c98bcfd7-smmwd   2/2     Running   0          8m22s

$ kubectl top node
NAME         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
k8s-master   321m         16%    1329Mi          70%
k8s-node01   190m         9%     1062Mi          56%
k8s-node02   961m         48%    1011Mi          53%

$ kubectl top pod -n monitoring
NAME                                   CPU(cores)   MEMORY(bytes)
alertmanager-main-0                    7m           22Mi
alertmanager-main-1                    11m          23Mi
alertmanager-main-2                    9m           24Mi
grafana-67dfc5f687-vqfbh               25m          25Mi
kube-state-metrics-69d4c7c69d-2lmfl    2m           33Mi
node-exporter-j9nzx                    58m          19Mi
node-exporter-lwmkw                    5m           18Mi
node-exporter-p5sl8                    5m           13Mi
prometheus-adapter-66b855f564-qvs8x    4m           18Mi
prometheus-k8s-0                       31m          235Mi
prometheus-k8s-1                       26m          195Mi
prometheus-operator-75c98bcfd7-smmwd   1m           34Mi

$ kubectl get svc -n monitoring
NAME                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
alertmanager-main       NodePort    10.105.101.126   <none>        9093:30300/TCP               9m37s
alertmanager-operated   ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   9m37s
grafana                 NodePort    10.100.132.19    <none>        3000:30100/TCP               9m26s
kube-state-metrics      ClusterIP   None             <none>        8443/TCP,9443/TCP            9m25s
node-exporter           ClusterIP   None             <none>        9100/TCP                     9m25s
prometheus-adapter      ClusterIP   10.101.16.41     <none>        443/TCP                      9m12s
prometheus-k8s          NodePort    10.101.33.228    <none>        9090:30200/TCP               9m10s
prometheus-operated     ClusterIP   None             <none>        9090/TCP                     9m4s
prometheus-operator     ClusterIP   None             <none>        8443/TCP                     11m
```
Access Prometheus at http://192.168.31.40:30200 and try a query such as:
```
sum by (pod_name)(rate(container_cpu_usage_seconds_total{image!=""}[1m]))
```
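This query gives per-pod CPU usage (in cores) averaged over the last minute. A companion query for memory might look like the sketch below (note that on newer cAdvisor/kubelet versions the label is `pod` rather than `pod_name`):

```
sum by (pod_name)(container_memory_working_set_bytes{image!=""})
```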
Access Grafana at http://192.168.31.40:30100
4. Horizontal Pod Autoscaling

The HPA automatically scales the number of Pods in a ReplicationController, Deployment, or ReplicaSet based on CPU utilization.
```bash
$ cat > hpa.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  replicas: 1
  selector:
    matchLabels:
      app: apache
  template:           # Pod
    metadata:
      labels:
        app: apache
    spec:
      containers:
      - name: php-apache
        image: gcr.io/google_containers/hpa-example
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 0.1
            memory: 32Mi
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
spec:
  type: ClusterIP
  selector:
    app: apache
  ports:
  - name: http
    port: 80
    targetPort: 80
EOF

$ kubectl apply -f hpa.yaml

$ kubectl get pod
NAME                          READY   STATUS    RESTARTS   AGE
php-apache-86d4bcdcd9-wlvs5   1/1     Running   0          29m

$ kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10

$ kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         10        1          5m

# generate load from a second terminal, inside the load-generator shell:
$ kubectl run -i --tty load-generator --image=busybox /bin/sh
$ while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

$ kubectl get hpa -w
NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   <unknown>/50%   1         10        0          7s
php-apache   Deployment/php-apache   <unknown>/50%   1         10        1          15s
php-apache   Deployment/php-apache   0%/50%          1         10        1          4m3s
php-apache   Deployment/php-apache   0%/50%          1         10        1          5m19s
php-apache   Deployment/php-apache   1%/50%          1         10        1          19m
php-apache   Deployment/php-apache   0%/50%          1         10        1          20m
php-apache   Deployment/php-apache   0%/50%          1         10        1          25m
php-apache   Deployment/php-apache   378%/50%        1         10        1          28m
php-apache   Deployment/php-apache   378%/50%        1         10        4          28m
php-apache   Deployment/php-apache   467%/50%        1         10        8          28m

$ kubectl get pod -w
NAME                          READY   STATUS    RESTARTS   AGE
load-generator                1/1     Running   0          45m
php-apache-86d4bcdcd9-wlvs5   1/1     Running   0          29m
php-apache-86d4bcdcd9-7cjmm   0/1     Pending   0          0s
php-apache-86d4bcdcd9-7cjmm   0/1     Pending   0          0s
php-apache-86d4bcdcd9-dr2rg   0/1     Pending   0          0s
php-apache-86d4bcdcd9-9srl5   0/1     Pending   0          0s
php-apache-86d4bcdcd9-dr2rg   0/1     Pending   0          0s
php-apache-86d4bcdcd9-9srl5   0/1     Pending   0          0s
php-apache-86d4bcdcd9-dr2rg   0/1     ContainerCreating   0          0s
php-apache-86d4bcdcd9-9srl5   0/1     ContainerCreating   0          1s
php-apache-86d4bcdcd9-7cjmm   0/1     ContainerCreating   0          1s
php-apache-86d4bcdcd9-hzf8h   0/1     Pending             0          0s
php-apache-86d4bcdcd9-m4tp6   0/1     Pending             0          0s
php-apache-86d4bcdcd9-hzf8h   0/1     Pending             0          0s
php-apache-86d4bcdcd9-5bfp8   0/1     Pending             0          0s
php-apache-86d4bcdcd9-m4tp6   0/1     Pending             0          0s
php-apache-86d4bcdcd9-5bfp8   0/1     Pending             0          0s
php-apache-86d4bcdcd9-8scwl   0/1     Pending             0          0s
php-apache-86d4bcdcd9-8scwl   0/1     Pending             0          0s
php-apache-86d4bcdcd9-hzf8h   0/1     ContainerCreating   0          0s
php-apache-86d4bcdcd9-m4tp6   0/1     ContainerCreating   0          0s
php-apache-86d4bcdcd9-5bfp8   0/1     ContainerCreating   0          0s
php-apache-86d4bcdcd9-8scwl   0/1     ContainerCreating   0          0s
php-apache-86d4bcdcd9-rsg9f   0/1     Pending             0          0s
php-apache-86d4bcdcd9-z6qkt   0/1     Pending             0          0s
php-apache-86d4bcdcd9-rsg9f   0/1     Pending             0          0s
php-apache-86d4bcdcd9-z6qkt   0/1     Pending             0          0s
php-apache-86d4bcdcd9-rsg9f   0/1     ContainerCreating   0          1s
php-apache-86d4bcdcd9-z6qkt   0/1     ContainerCreating   0          3s
```
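The `kubectl autoscale` command above is equivalent to creating an HPA object declaratively; a minimal sketch using the autoscaling/v1 API:

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50   # scale out when average CPU exceeds 50% of the request
```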
5. Resource Limits

5.1 Pod

```yaml
spec:
  containers:
  - name: php-apache
    image: gcr.io/google_containers/hpa-example
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: 0.1
        memory: 32Mi
      limits:
        cpu: 200m
        memory: 100Mi
```
5.2 Namespace
Compute resource quota:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resource
  namespace: spark-cluster
spec:
  hard:
    pods: 20
    requests.cpu: 20
    requests.memory: 100Gi
    limits.cpu: 40
    limits.memory: 200Gi
```
Object count quota:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-counts
  namespace: spark-cluster
spec:
  hard:
    configmaps: 10
    persistentvolumeclaims: 4
    replicationcontrollers: 20
    secrets: 10
    services: 10
    services.loadbalancers: 2
```
CPU and memory LimitRange:
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: mem-limit-range
spec:
  limits:
  - default:
      memory: 50Gi
      cpu: 5
    defaultRequest:
      memory: 1Gi
      cpu: 1
    type: Container
```
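To put these objects into effect, apply them and then verify with `kubectl describe` (a sketch; the file names are placeholders, and the LimitRange above carries no namespace so one is supplied on the command line):

```bash
# apply the quotas and the limit range
$ kubectl apply -f compute-quota.yaml -f object-quota.yaml
$ kubectl apply -f limit-range.yaml -n spark-cluster

# check current usage against the quotas, and the default limits
$ kubectl describe resourcequota -n spark-cluster
$ kubectl describe limitrange mem-limit-range -n spark-cluster
```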
6. EFK Logging

EFK: Elasticsearch + Fluentd + Kibana
ELFK: Elasticsearch + Logstash + Filebeat + Kibana
Installation reference: https://www.digitalocean.com/community/tutorials/how-to-set-up-an-elasticsearch-fluentd-and-kibana-efk-logging-stack-on-kubernetes
6.1 Create the Namespace

```bash
$ mkdir efk && cd efk

$ cat > kube-logging.yaml <<EOF
kind: Namespace
apiVersion: v1
metadata:
  name: kube-logging
EOF

$ kubectl create -f kube-logging.yaml

$ kubectl get ns | grep kube-logging
kube-logging      Active   6s
```
6.2 Elasticsearch

6.2.1 Create the headless service

```bash
$ cat > elasticsearch_svc.yaml <<EOF
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch
  namespace: kube-logging
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
    - port: 9200
      name: rest
    - port: 9300
      name: inter-node
EOF

$ kubectl create -f elasticsearch_svc.yaml

$ kubectl get services --namespace=kube-logging
NAME            TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
elasticsearch   ClusterIP   None         <none>        9200/TCP,9300/TCP   13s
```
6.2.2 Create the PV

```bash
$ cat > elasticsearch_pv.yaml <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfspv1
  namespace: kube-logging
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
  nfs:
    path: /nfs
    server: 192.168.31.200
EOF

$ kubectl create -f elasticsearch_pv.yaml

$ kubectl get pv -n kube-logging
NAME     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
nfspv1   1Gi        RWO            Retain           Available           nfs                     18s
```
6.2.3 Install Elasticsearch

```bash
$ cat > elasticsearch_statefulset.yaml <<EOF
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster
  namespace: kube-logging
spec:
  serviceName: elasticsearch
  #replicas: 3
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
        resources:
          limits:
            #cpu: 1000m
            cpu: 400m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: rest
          protocol: TCP
        - containerPort: 9300
          name: inter-node
          protocol: TCP
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        env:
        - name: cluster.name
          value: k8s-logs
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: discovery.type      # test-bed
          value: single-node
        #- name: discovery.seed_hosts
        #  value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
        #- name: cluster.initial_master_nodes
        #  value: "es-cluster-0,es-cluster-1,es-cluster-2"
        - name: ES_JAVA_OPTS
          #value: "-Xms512m -Xmx512m"
          value: "-Xms256m -Xmx256m"
      initContainers:
      - name: fix-permissions
        image: busybox
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      - name: increase-vm-max-map
        image: busybox
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      - name: increase-fd-ulimit
        image: busybox
        command: ["sh", "-c", "ulimit -n 65536"]
        securityContext:
          privileged: true
  volumeClaimTemplates:
  - metadata:
      name: data
      labels:
        app: elasticsearch
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: nfs
      resources:
        requests:
          storage: 1Gi
EOF

$ kubectl create -f elasticsearch_statefulset.yaml
$ kubectl rollout status sts/es-cluster --namespace=kube-logging

$ kubectl get pod -n kube-logging
NAME           READY   STATUS    RESTARTS   AGE
es-cluster-0   1/1     Running   0          59s

$ kubectl logs -f es-cluster-0 -n kube-logging

$ kubectl port-forward es-cluster-0 9200:9200 --namespace=kube-logging
$ curl http://localhost:9200/_cluster/state?pretty
```
6.3 Kibana

```bash
$ cat > kibana.yaml <<EOF
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: kube-logging
  labels:
    app: kibana
spec:
  type: NodePort
  ports:
  - port: 5601
  selector:
    app: kibana
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: kube-logging
  labels:
    app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana:7.2.0
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
        - name: ELASTICSEARCH_URL
          value: http://elasticsearch:9200
        ports:
        - containerPort: 5601
EOF

$ kubectl create -f kibana.yaml
$ kubectl rollout status deployment/kibana --namespace=kube-logging

$ kubectl get pod -n kube-logging
NAME                      READY   STATUS    RESTARTS   AGE
es-cluster-0              1/1     Running   0          13m
kibana-5749b5778b-zvtwn   1/1     Running   0          4m33s

$ kubectl get svc -n kube-logging
NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
elasticsearch   ClusterIP   None             <none>        9200/TCP,9300/TCP   89m
kibana          NodePort    10.106.103.244   <none>        5601:30750/TCP      8s

$ curl http://192.168.1.40:30750
```
6.4 Fluentd

```bash
$ cat > fluentd.yaml <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  labels:
    app: fluentd
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-logging
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1
        env:
        - name: FLUENT_ELASTICSEARCH_HOST
          value: "elasticsearch.kube-logging.svc.cluster.local"
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "9200"
        - name: FLUENT_ELASTICSEARCH_SCHEME
          value: "http"
        - name: FLUENTD_SYSTEMD_CONF
          value: disable
        resources:
          limits:
            #memory: 512Mi
            memory: 256Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
EOF

$ kubectl create -f fluentd.yaml

$ kubectl get ds -n kube-logging
NAME      DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
fluentd   2         2         2       2            2           <none>          27s
```
6.5 Kibana UI
6.6 Test

Create a test Pod:
```bash
# note: the heredoc delimiter is quoted so that $i and $(date) are written literally
$ cat > counter.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    args: [/bin/sh, -c, 'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done']
EOF

$ kubectl create -f counter.yaml
```
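Once the counter Pod is running, its stdout should be visible both directly and through the EFK pipeline; a quick check (a sketch; the fluentd daemonset image ships logs under the default `logstash-*` index pattern unless configured otherwise):

```bash
# view the Pod's stdout directly
$ kubectl logs counter

# then, in Kibana (NodePort 30750 above), create an index pattern such as
# "logstash-*" and search for "counter" to see the same lines via Fluentd/ES
```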
7. Appendix: Ports Explained

ports in the Pod template:

containerPort: the port the application inside the container listens on

ports in a Service (an example tying these together follows the list):

port: the port the Service listens on for incoming requests, bound to the ClusterIP

targetPort: the receiving port on the Pod that traffic is forwarded to; it corresponds to the containerPort

nodePort: when the Service type is NodePort, the port bound on each node's IP; if not specified, a random port is assigned
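A minimal sketch showing the three together (placeholder names; traffic hitting nodeIP:30080 or clusterIP:80 is forwarded to containerPort 8080 on the Pods):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web             # placeholder name
spec:
  type: NodePort
  selector:
    app: web
  ports:
  - port: 80            # Service port on the ClusterIP
    targetPort: 8080    # containerPort of the backing Pods
    nodePort: 30080     # exposed on every node's IP
```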