Eli's Blog

1. Environment Preparation

1.1 Installation Plan

Role IP Components
k8s-master1 192.168.80.45 etcd, api-server, controller-manager, scheduler, docker
k8s-node01 192.168.80.46 etcd, kubelet, kube-proxy, docker
k8s-node02 192.168.80.47 etcd, kubelet, kube-proxy, docker

Software versions:

Software Version Notes
OS Ubuntu 16.04.6 LTS
Kubernetes 1.21.4
Etcd v3.5.0
Docker 19.03.9

1.2 System Settings

# 1. Set the hostname (run the matching line on each node)
hostnamectl set-hostname k8s-master1
hostnamectl set-hostname k8s-node01
hostnamectl set-hostname k8s-node02

# 2. Hostname resolution
cat >> /etc/hosts <<EOF
192.168.80.45 k8s-master1
192.168.80.46 k8s-node01
192.168.80.47 k8s-node02
EOF

# 3. Disable swap
swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

# 4. Pass bridged IPv4 traffic to the iptables chains
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system

# 5. Time synchronization
apt install ntpdate -y
ntpdate ntp1.aliyun.com

crontab -e
*/30 * * * * /usr/sbin/ntpdate -u ntp1.aliyun.com >> /var/log/ntpdate.log 2>&1
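
Before moving on, it is worth confirming that swap is really off and that the sysctl keys took effect; a minimal check (assuming the br_netfilter module is loaded — run modprobe br_netfilter first if the key is missing):

# Verify the settings above
free -h | grep -i swap                       # swap total should show 0B
sysctl net.bridge.bridge-nf-call-iptables    # should print 1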

2. Install Docker

mkdir -p $HOME/k8s-install && cd $HOME/k8s-install

# 1. Download the release package
wget https://download.docker.com/linux/static/stable/x86_64/docker-19.03.9.tgz
tar zxvf docker-19.03.9.tgz
mv docker/* /usr/bin
docker version

# 2. systemd unit (start on boot)
cat > /lib/systemd/system/docker.service << EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP \$MAINPID
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TimeoutStartSec=0
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target
EOF

# 3. Start the service
systemctl daemon-reload
systemctl start docker
systemctl status docker
systemctl enable docker
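
As an optional smoke test, ask the daemon for its version; the --format string is standard Docker CLI templating:

docker info --format '{{.ServerVersion}}'    # expect 19.03.9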

3. TLS Certificates

3.1 Certificate Tools

mkdir -p $HOME/k8s-install && cd $HOME/k8s-install

wget https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssl_1.5.0_linux_amd64
wget https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssljson_1.5.0_linux_amd64
wget https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssl-certinfo_1.5.0_linux_amd64

mv cfssl_1.5.0_linux_amd64 /usr/local/bin/cfssl
mv cfssljson_1.5.0_linux_amd64 /usr/local/bin/cfssljson
mv cfssl-certinfo_1.5.0_linux_amd64 /usr/local/bin/cfssl-certinfo
chmod +x /usr/local/bin/cfssl*

3.2 Certificate Overview

The certificates and key files to be generated are as follows:

Component Certificate Key Notes
etcd ca.pem、etcd.pem etcd-key.pem
apiserver ca.pem、apiserver.pem apiserver-key.pem
controller-manager ca.pem、kube-controller-manager.pem ca-key.pem、kube-controller-manager-key.pem kubeconfig
scheduler ca.pem、kube-scheduler.pem kube-scheduler-key.pem kubeconfig
kubelet ca.pem kubeconfig+token
kube-proxy ca.pem、kube-proxy.pem kube-proxy-key.pem kubeconfig
kubectl ca.pem、admin.pem admin-key.pem

3.3 CA Certificate

CA: Certificate Authority

mkdir -p /root/ssl && cd /root/ssl

# 1. CA configuration file
cat > ca-config.json <<EOF
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"kubernetes": {
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
],
"expiry": "87600h"
}
}
}
}
EOF

# 2. CA certificate signing request (CSR) file
cat > ca-csr.json <<EOF
{
"CN": "kubernetes",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "k8s",
"OU": "System"
}
],
"ca": {
"expiry": "87600h"
}
}
EOF

# 3. Generate the CA certificate and key
cfssl gencert -initca ca-csr.json | cfssljson -bare ca

ls ca*
ca-config.json ca.csr ca-csr.json ca-key.pem ca.pem
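
Optionally, the new CA can be decoded with cfssl-certinfo (installed in 3.1) to confirm its subject and ten-year validity before anything is signed with it:

cfssl-certinfo -cert ca.pem | grep -E '"common_name"|"not_after"'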

3.4 etcd Certificate

Note: the IP addresses in hosts are the host IPs of the etcd cluster members.

# 1. Certificate signing request file
cat > etcd-csr.json <<EOF
{
"CN": "etcd",
"hosts": [
"127.0.0.1",
"localhost",
"192.168.80.45",
"192.168.80.46",
"192.168.80.47"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "etcd",
"OU": "System"
}
]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes etcd-csr.json | cfssljson -bare etcd

3.5 kube-apiserver Certificate

Note: the IP addresses in hosts cover the Kubernetes master hosts and the cluster IP of the kubernetes service (normally the first IP of the service-cluster-ip-range passed to kube-apiserver, e.g. 10.96.0.1).

# 1. Certificate signing request file
cat > kube-apiserver-csr.json <<EOF
{
"CN": "kubernetes",
"hosts": [
"127.0.0.1",
"localhost",
"192.168.80.1",
"192.168.80.2",
"192.168.80.45",
"192.168.80.46",
"192.168.80.47",
"10.96.0.1",
"kubernetes",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-apiserver-csr.json | cfssljson -bare kube-apiserver

3.6 kube-controller-manager Certificate

# 1. Certificate signing request file
cat > kube-controller-manager-csr.json <<EOF
{
"CN": "system:kube-controller-manager",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "system:masters",
"OU": "System"
}
]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager

3.7 kube-scheduler Certificate

# 1. Certificate signing request file
cat > kube-scheduler-csr.json << EOF
{
"CN": "system:kube-scheduler",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "system:masters",
"OU": "System"
}
]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler

3.8 admin Certificate

  • kube-apiserver later uses RBAC to authorize requests from clients (such as kubelet, kube-proxy, and Pods);
  • kube-apiserver pre-defines a number of RoleBindings used by RBAC; for example, cluster-admin binds the Group system:masters to the ClusterRole cluster-admin, which grants permission to call every kube-apiserver API;
  • O sets this certificate's Group to system:masters. When a client presents this certificate to kube-apiserver, authentication succeeds because the certificate is signed by the CA, and because the certificate's group is the pre-authorized system:masters, it is granted access to all APIs;
# 1. Certificate signing request file
cat > admin-csr.json <<EOF
{
"CN": "admin",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "system:masters",
"OU": "System"
}
]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin

ls admin*
admin.csr admin-csr.json admin-key.pem admin.pem

After the cluster is up, running kubectl get clusterrolebinding cluster-admin -o yaml shows that the subject of the clusterrolebinding cluster-admin has kind Group and name system:masters, and that its roleRef is the ClusterRole cluster-admin. In other words, every user or serviceAccount in the system:masters group holds the cluster-admin role, which is why kubectl subsequently has full administrative rights over the cluster.

kubectl get clusterrolebinding cluster-admin -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: 2017-04-11T11:20:42Z
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: cluster-admin
  resourceVersion: "52"
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterrolebindings/cluster-admin
  uid: e61b97b2-1ea8-11e7-8cd7-f4e9d49f8ed0
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:masters

3.9 kube-proxy Certificate

  • CN sets the certificate's User to system:kube-proxy;
  • the RoleBinding system:node-proxier pre-defined by kube-apiserver binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the proxy-related kube-apiserver APIs;
# 1. Certificate signing request file
cat > kube-proxy-csr.json <<EOF
{
"CN": "system:kube-proxy",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy

3.10 Certificate Details

cfssl-certinfo -cert kube-apiserver.pem
{
"subject": {
"common_name": "kubernetes",
"country": "CN",
"organization": "k8s",
"organizational_unit": "System",
"locality": "BeiJing",
"province": "BeiJing",
"names": [
"CN",
"BeiJing",
"BeiJing",
"k8s",
"System",
"kubernetes"
]
},
"issuer": {
"common_name": "kubernetes",
"country": "CN",
"organization": "k8s",
"organizational_unit": "System",
"locality": "BeiJing",
"province": "BeiJing",
"names": [
"CN",
"BeiJing",
"BeiJing",
"k8s",
"System",
"kubernetes"
]
},
"serial_number": "275867496157961939649344217740970264800633176866",
"sans": [
"localhost",
"kubernetes",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local",
"127.0.0.1",
"192.168.80.1",
"192.168.80.2",
"192.168.80.45",
"192.168.80.46",
"192.168.80.47",
"10.96.0.1"
],
"not_before": "2021-06-09T05:20:00Z",
"not_after": "2031-06-07T05:20:00Z",
"sigalg": "SHA256WithRSA",
"authority_key_id": "",
"subject_key_id": "E3:84:0F:9C:00:07:4A:8F:5C:B2:35:45:A0:50:4D:3E:9D:C0:B4:D0",
"pem": "-----BEGIN CERTIFICATE-----\nMIIEezCCA2OgAwIBAgIUMFJTjEXe9sDDDpPXcAiUBt5+QyIwDQYJKoZIhvcNAQEL\nBQAwZTELMAkGA1UEBhMCQ04xEDAOBgNVBAgTB0JlaUppbmcxEDAOBgNVBAcTB0Jl\naUppbmcxDDAKBgNVBAoTA2s4czEPMA0GA1UECxMGU3lzdGVtMRMwEQYDVQQDEwpr\ndWJlcm5ldGVzMB4XDTIxMDYwOTA1MjAwMFoXDTMxMDYwNzA1MjAwMFowZTELMAkG\nA1UEBhMCQ04xEDAOBgNVBAgTB0JlaUppbmcxEDAOBgNVBAcTB0JlaUppbmcxDDAK\nBgNVBAoTA2s4czEPMA0GA1UECxMGU3lzdGVtMRMwEQYDVQQDEwprdWJlcm5ldGVz\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAw0BpjZQNEd6Oqu8ubEWG\nhbdwJecOTCfdbY+VLIKEm0Tys8ZBlu7OrtZ8Rj5OAZTXil0ZJz+hvHo8YTNJJ16g\njHV88VSpfoXD5DE59PITSFwfY1lWHVctC3Ddo9CM9cU9Ty+Kf29XcrLbc/VNGZTB\ncvKXoM3b6NkBKOdKphVjUvafhKC6ls2ac5uub3uqZTpPgBs/1PvINKNZkP5U6lUV\noTBMAT+qbQ9aggA+bA+WegL3jHU78ngo1XMnsb1HfAjwKDOf66smNJ/K+YjD+Cul\ngjpyqOQKGlz5xqXUcBgIMO9djI4f5hvaMsSje1aSJ/oh5AfQbxQsGjajlS80ED08\nxwIDAQABo4IBITCCAR0wDgYDVR0PAQH/BAQDAgWgMB0GA1UdJQQWMBQGCCsGAQUF\nBwMBBggrBgEFBQcDAjAMBgNVHRMBAf8EAjAAMB0GA1UdDgQWBBTjhA+cAAdKj1yy\nNUWgUE0+ncC00DCBvgYDVR0RBIG2MIGzgglsb2NhbGhvc3SCCmt1YmVybmV0ZXOC\nEmt1YmVybmV0ZXMuZGVmYXVsdIIWa3ViZXJuZXRlcy5kZWZhdWx0LnN2Y4Iea3Vi\nZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVygiRrdWJlcm5ldGVzLmRlZmF1bHQu\nc3ZjLmNsdXN0ZXIubG9jYWyHBH8AAAGHBMCoUAGHBMCoUAKHBMCoUC2HBMCoUC6H\nBMCoUC+HBAr+AAEwDQYJKoZIhvcNAQELBQADggEBAG+RUKp4cxz4EOqmAPiczkl2\nHciAg01RbCavoLoUWmoDDAQf7PIhQF2pLewFCwR5w6SwvCJAVdg+eHdefJ2MBtJr\nKQgbmEOBXd4Z5ZqBeSP6ViHvb1pKtRSldznZLfxjsVd0bN3na/JmS4TZ90SqLLtL\nN4CgGfTs2AfrtbtWIqewDMS9aWjBK8VePzLBmsdLddD4WYQOnl+QjdrX9bbqYRCG\nQo3CKvJ3JZqh6AJHcgKsm0702uMU/TCJwe1M8I8SpYrwA74uCBy3O9jXed1rZlrp\nRVURB6Ro7SMLjiadTJyf6AbLPMmZcPKHhZ1XG07q8Od2Kd+KVx1PxF3et6OOteE=\n-----END CERTIFICATE-----\n"
}

3.11 Distribute the Certificates

mkdir -p /etc/kubernetes/pki
cp *.pem /etc/kubernetes/pki
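
The node sections below distribute these files as part of tar-based clones, but if you prefer to copy the PKI directory to the other machines by hand now, a sketch following this guide's conventions (ubuntu user, /home/ubuntu as the staging directory) would be:

scp -r /etc/kubernetes/pki ubuntu@192.168.80.46:/home/ubuntu
scp -r /etc/kubernetes/pki ubuntu@192.168.80.47:/home/ubuntu
# then on each node:
sudo mkdir -p /etc/kubernetes && sudo mv /home/ubuntu/pki /etc/kubernetes/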

4. Install etcd (Cluster)

4.1 Node etcd-1

mkdir -p $HOME/k8s-install && cd $HOME/k8s-install

# 1. Download and install
wget https://github.com/etcd-io/etcd/releases/download/v3.5.0/etcd-v3.5.0-linux-amd64.tar.gz
tar zxvf etcd-v3.5.0-linux-amd64.tar.gz

mv etcd-v3.5.0-linux-amd64/{etcd,etcdctl} /usr/bin/

# 2. Configuration file
mkdir -p /etc/etcd
cat > /etc/etcd/etcd.conf.yml << EOF
# This is the configuration file for the etcd server.

# Human-readable name for this member.
name: 'etcd-1'

# Path to the data directory.
data-dir: /var/lib/etcd/default.etcd

# Path to the dedicated wal directory.
wal-dir:

# Number of committed transactions to trigger a snapshot to disk.
snapshot-count: 10000

# Time (in milliseconds) of a heartbeat interval.
heartbeat-interval: 100

# Time (in milliseconds) for an election to timeout.
election-timeout: 1000

# Raise alarms when backend size exceeds the given quota. 0 means use the
# default quota.
quota-backend-bytes: 0

# List of comma separated URLs to listen on for peer traffic.
listen-peer-urls: 'https://localhost:2380,https://192.168.80.45:2380'

# List of comma separated URLs to listen on for client traffic.
listen-client-urls: 'https://localhost:2379,https://192.168.80.45:2379'

# Maximum number of snapshot files to retain (0 is unlimited).
max-snapshots: 5

# Maximum number of wal files to retain (0 is unlimited).
max-wals: 5

# Comma-separated white list of origins for CORS (cross-origin resource sharing).
cors:

# List of this member's peer URLs to advertise to the rest of the cluster.
# The URLs needed to be a comma-separated list.
initial-advertise-peer-urls: 'https://localhost:2380,https://192.168.80.45:2380'

# List of this member's client URLs to advertise to the public.
# The URLs needed to be a comma-separated list.
advertise-client-urls: 'https://localhost:2379,https://192.168.80.45:2379'

# Discovery URL used to bootstrap the cluster.
discovery:

# Valid values include 'exit', 'proxy'
discovery-fallback: 'proxy'

# HTTP proxy to use for traffic to discovery service.
discovery-proxy:

# DNS domain used to bootstrap initial cluster.
discovery-srv:

# Initial cluster configuration for bootstrapping.
initial-cluster: 'etcd-1=https://192.168.80.45:2380,etcd-2=https://192.168.80.46:2380,etcd-3=https://192.168.80.47:2380'

# Initial cluster token for the etcd cluster during bootstrap.
initial-cluster-token: 'etcd-cluster'

# Initial cluster state ('new' or 'existing').
initial-cluster-state: 'new'

# Reject reconfiguration requests that would cause quorum loss.
strict-reconfig-check: false

# Accept etcd V2 client requests
enable-v2: true

# Enable runtime profiling data via HTTP server
enable-pprof: true

# Valid values include 'on', 'readonly', 'off'
proxy: 'off'

# Time (in milliseconds) an endpoint will be held in a failed state.
proxy-failure-wait: 5000

# Time (in milliseconds) of the endpoints refresh interval.
proxy-refresh-interval: 30000

# Time (in milliseconds) for a dial to timeout.
proxy-dial-timeout: 1000

# Time (in milliseconds) for a write to timeout.
proxy-write-timeout: 5000

# Time (in milliseconds) for a read to timeout.
proxy-read-timeout: 0

client-transport-security:
  # Path to the client server TLS cert file.
  cert-file: /etc/kubernetes/pki/etcd.pem

  # Path to the client server TLS key file.
  key-file: /etc/kubernetes/pki/etcd-key.pem

  # Enable client cert authentication.
  client-cert-auth: true

  # Path to the client server TLS trusted CA cert file.
  trusted-ca-file: /etc/kubernetes/pki/ca.pem

  # Client TLS using generated certificates
  auto-tls: true

peer-transport-security:
  # Path to the peer server TLS cert file.
  cert-file: /etc/kubernetes/pki/etcd.pem

  # Path to the peer server TLS key file.
  key-file: /etc/kubernetes/pki/etcd-key.pem

  # Enable peer client cert authentication.
  client-cert-auth: true

  # Path to the peer server TLS trusted CA cert file.
  trusted-ca-file: /etc/kubernetes/pki/ca.pem

  # Peer TLS using generated certificates.
  auto-tls: true

# Enable debug-level logging for etcd.
log-level: debug

logger: zap

# Specify 'stdout' or 'stderr' to skip journald logging even when running under systemd.
log-outputs: [stderr]

# Force to create a new one member cluster.
force-new-cluster: false

auto-compaction-mode: periodic
auto-compaction-retention: "1"
EOF

# 3. systemd unit (start on boot)
cat > /lib/systemd/system/etcd.service << EOF
[Unit]
Description=Etcd Server
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/bin/etcd --config-file=/etc/etcd/etcd.conf.yml
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
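
Section 4.2 below unpacks an etcd-clone.tar on the other nodes, but the original does not show how that archive is produced. By analogy with the worker-node clone in section 7.1, it is presumably created on etcd-1 with something like:

# On etcd-1 (presumed): package the binaries, config, unit file and certificates
tar cvf etcd-clone.tar /usr/bin/etcd /usr/bin/etcdctl /etc/etcd /lib/systemd/system/etcd.service /etc/kubernetes/pki

scp etcd-clone.tar ubuntu@192.168.80.46:/home/ubuntu
scp etcd-clone.tar ubuntu@192.168.80.47:/home/ubuntu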

4.2 Other Nodes

# 1. Unpack the cloned files (see the packaging sketch at the end of 4.1)
sudo -i
cd / && mv /home/ubuntu/etcd-clone.tar / && tar xvf etcd-clone.tar && rm -f etcd-clone.tar

# 2. Edit the configuration file; change the IPs and the name to each node's own values
vi /etc/etcd/etcd.conf.yml

4.3 Startup

# 1. Start and enable on boot
systemctl daemon-reload
systemctl start etcd
systemctl status etcd
systemctl enable etcd

# 2. Member list
etcdctl member list --cacert=/etc/kubernetes/pki/ca.pem --cert=/etc/kubernetes/pki/etcd.pem --key=/etc/kubernetes/pki/etcd-key.pem --write-out=table
+------------------+---------+--------+----------------------------+----------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+--------+----------------------------+----------------------------+------------+
| 46bc5ad35e418584 | started | etcd-1 | https://192.168.80.45:2380 | https://192.168.80.45:2379 | false |
| 8f347c1327049bc8 | started | etcd-3 | https://192.168.80.47:2380 | https://192.168.80.47:2379 | false |
| b01e7a29099f3eb8 | started | etcd-2 | https://192.168.80.46:2380 | https://192.168.80.46:2379 | false |
+------------------+---------+--------+----------------------------+----------------------------+------------+

# 3. Health status
etcdctl endpoint health --cacert=/etc/kubernetes/pki/ca.pem --cert=/etc/kubernetes/pki/etcd.pem --key=/etc/kubernetes/pki/etcd-key.pem --cluster --write-out=table
+----------------------------+--------+-------------+-------+
| ENDPOINT | HEALTH | TOOK | ERROR |
+----------------------------+--------+-------------+-------+
| https://192.168.80.47:2379 | true | 20.973639ms | |
| https://192.168.80.46:2379 | true | 29.842299ms | |
| https://192.168.80.45:2379 | true | 30.564766ms | |
+----------------------------+--------+-------------+-------+
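
A simple read/write round trip (the key name here is arbitrary) confirms that quorum writes work end to end:

etcdctl put /smoke-test ok --cacert=/etc/kubernetes/pki/ca.pem --cert=/etc/kubernetes/pki/etcd.pem --key=/etc/kubernetes/pki/etcd-key.pem
etcdctl get /smoke-test --cacert=/etc/kubernetes/pki/ca.pem --cert=/etc/kubernetes/pki/etcd.pem --key=/etc/kubernetes/pki/etcd-key.pem
etcdctl del /smoke-test --cacert=/etc/kubernetes/pki/ca.pem --cert=/etc/kubernetes/pki/etcd.pem --key=/etc/kubernetes/pki/etcd-key.pem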

4. Install etcd (Single Node — alternative to the cluster setup above)

4.1 Node etcd-1

mkdir -p $HOME/k8s-install && cd $HOME/k8s-install

# 1. Download and install
wget https://github.com/etcd-io/etcd/releases/download/v3.5.0/etcd-v3.5.0-linux-amd64.tar.gz
tar zxvf etcd-v3.5.0-linux-amd64.tar.gz

mv etcd-v3.5.0-linux-amd64/{etcd,etcdctl} /usr/bin/

# 2. Configuration file
mkdir -p /etc/etcd
cat > /etc/etcd/etcd.conf.yml << EOF
name: 'etcd-1'
data-dir: /var/lib/etcd/default.etcd
wal-dir:
snapshot-count: 10000
heartbeat-interval: 100
election-timeout: 1000
quota-backend-bytes: 0
listen-peer-urls: 'https://localhost:2380,https://192.168.80.45:2380'
listen-client-urls: 'https://localhost:2379,https://192.168.80.45:2379'
max-snapshots: 5
max-wals: 5
cors:
initial-advertise-peer-urls: 'https://localhost:2380,https://192.168.80.45:2380'
advertise-client-urls: 'https://localhost:2379,https://192.168.80.45:2379'
discovery:
discovery-fallback: 'proxy'
discovery-proxy:
discovery-srv:
strict-reconfig-check: false
enable-v2: true
enable-pprof: true
proxy: 'off'
proxy-failure-wait: 5000
proxy-refresh-interval: 30000
proxy-dial-timeout: 1000
proxy-write-timeout: 5000
proxy-read-timeout: 0

client-transport-security:
  cert-file: /etc/kubernetes/pki/etcd.pem
  key-file: /etc/kubernetes/pki/etcd-key.pem
  client-cert-auth: true
  trusted-ca-file: /etc/kubernetes/pki/ca.pem
  auto-tls: true

peer-transport-security:
  cert-file: /etc/kubernetes/pki/etcd.pem
  key-file: /etc/kubernetes/pki/etcd-key.pem
  client-cert-auth: true
  trusted-ca-file: /etc/kubernetes/pki/ca.pem
  auto-tls: true

log-level: debug
logger: zap
log-outputs: [stderr]
force-new-cluster: false
auto-compaction-mode: periodic
auto-compaction-retention: "1"
EOF

# 3. systemd unit (start on boot)
cat > /lib/systemd/system/etcd.service << EOF
[Unit]
Description=Etcd Server
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/bin/etcd --config-file=/etc/etcd/etcd.conf.yml
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

# 4. Start
systemctl daemon-reload
systemctl start etcd
systemctl status etcd
systemctl enable etcd

# 5. Member list
etcdctl member list --cacert=/etc/kubernetes/pki/ca.pem --cert=/etc/kubernetes/pki/etcd.pem --key=/etc/kubernetes/pki/etcd-key.pem --write-out=table

# 6. Health status
etcdctl endpoint health --cacert=/etc/kubernetes/pki/ca.pem --cert=/etc/kubernetes/pki/etcd.pem --key=/etc/kubernetes/pki/etcd-key.pem --cluster --write-out=table

5. Master Node

Kubernetes master node components:

  • kube-apiserver
  • kube-scheduler
  • kube-controller-manager
  • kubelet (optional on the master, but recommended)
  • kube-proxy (optional on the master, but recommended)

5.1 Preparation

https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md

mkdir -p $HOME/k8s-install && cd $HOME/k8s-install
wget https://dl.k8s.io/v1.21.4/kubernetes-server-linux-amd64.tar.gz
tar zxvf kubernetes-server-linux-amd64.tar.gz

cd kubernetes/server/bin
cp kube-apiserver kube-scheduler kube-controller-manager kubectl kubelet kube-proxy /usr/bin

5.2 apiserver

5.2.1 TLS Bootstrapping Token

Enable the TLS bootstrapping mechanism:

TLS bootstrapping: once the master apiserver has TLS authentication enabled, kubelet and kube-proxy on the worker nodes must present valid CA-signed certificates to talk to kube-apiserver. With many nodes, issuing these client certificates by hand is a lot of work and makes scaling the cluster harder. To simplify this, Kubernetes introduced TLS bootstrapping to issue client certificates automatically: kubelet connects to the apiserver as a low-privileged user and requests a certificate, and the apiserver signs the kubelet's certificate dynamically. This approach is strongly recommended for nodes; in this guide it is used for kubelet, while kube-proxy still gets a single statically issued certificate.

TLS bootstrapping workflow:

[Figure: TLS bootstrapping workflow]

BOOTSTRAP_TOKEN=$(head -c 16 /dev/urandom | od -An -t x | tr -d ' ')

# Format: token,user name,UID,user group
cat > /etc/kubernetes/token.csv <<EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:node-bootstrapper"
EOF

5.2.2 Startup

--service-cluster-ip-range=10.96.0.0/16: the Service IP range

KUBE_APISERVER_OPTS="--enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \
--anonymous-auth=false \
--bind-address=192.168.80.45 \
--secure-port=6443 \
--advertise-address=192.168.80.45 \
--authorization-mode=Node,RBAC \
--runtime-config=api/all=true \
--enable-bootstrap-token-auth \
--service-cluster-ip-range=10.96.0.0/16 \
--token-auth-file=/etc/kubernetes/token.csv \
--service-node-port-range=30000-50000 \
--tls-cert-file=/etc/kubernetes/pki/kube-apiserver.pem \
--tls-private-key-file=/etc/kubernetes/pki/kube-apiserver-key.pem \
--client-ca-file=/etc/kubernetes/pki/ca.pem \
--kubelet-client-certificate=/etc/kubernetes/pki/kube-apiserver.pem \
--kubelet-client-key=/etc/kubernetes/pki/kube-apiserver-key.pem \
--service-account-key-file=/etc/kubernetes/pki/ca-key.pem \
--service-account-signing-key-file=/etc/kubernetes/pki/ca-key.pem \
--service-account-issuer=https://kubernetes.default.svc.cluster.local \
--etcd-cafile=/etc/kubernetes/pki/ca.pem \
--etcd-certfile=/etc/kubernetes/pki/etcd.pem \
--etcd-keyfile=/etc/kubernetes/pki/etcd-key.pem \
--etcd-servers=https://192.168.80.45:2379 \
--allow-privileged=true \
--audit-log-maxage=30 \
--audit-log-maxbackup=3 \
--audit-log-maxsize=100 \
--audit-log-path=/var/log/kubernetes/kube-apiserver-audit.log \
--event-ttl=1h \
--alsologtostderr=true \
--logtostderr=false \
--log-dir=/var/log/kubernetes \
--v=2"

cat > /lib/systemd/system/kube-apiserver.service << EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/bin/kube-apiserver $KUBE_APISERVER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

# 2. Start
systemctl daemon-reload
systemctl start kube-apiserver
systemctl status kube-apiserver
systemctl enable kube-apiserver

5.2.3 Manage the Cluster with kubectl

mkdir -p /root/.kube

KUBE_CONFIG=/root/.kube/config
KUBE_APISERVER="https://192.168.80.45:6443"

kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/pki/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials cluster-admin \
--client-certificate=/etc/kubernetes/pki/admin.pem \
--client-key=/etc/kubernetes/pki/admin-key.pem \
--embed-certs=true \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
--cluster=kubernetes \
--user=cluster-admin \
--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
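
With the kubeconfig in place, kubectl should now reach the apiserver; a quick sanity check (output will vary) is:

kubectl cluster-info
kubectl get --raw='/healthz'    # should print ok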

5.2.4 Authorize the kubelet-bootstrap User to Request Certificates

This prevents the error: failed to run Kubelet: cannot create certificate signing request: certificatesigningrequests.certificates.k8s.io is forbidden: User "kubelet-bootstrap" cannot create resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope

kubectl create clusterrolebinding kubelet-bootstrap \
--clusterrole=system:node-bootstrapper \
--user=kubelet-bootstrap
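
The binding can be verified afterwards with a read-only query:

kubectl get clusterrolebinding kubelet-bootstrap -o wide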

5.2.5 Authorize apiserver to Access kubelet

mkdir -p $HOME/k8s-install && cd $HOME/k8s-install

cat > apiserver-to-kubelet-rbac.yaml << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kube-apiserver-to-kubelet
rules:
  - apiGroups:
      - ""
    resources:
      - nodes/proxy
      - nodes/stats
      - nodes/log
      - nodes/spec
      - nodes/metrics
      - pods/log
    verbs:
      - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:kube-apiserver
  namespace: ""
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-apiserver-to-kubelet
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: kubernetes
EOF

kubectl apply -f apiserver-to-kubelet-rbac.yaml

5.3 controller-manager

5.3.1 kubeconfig

KUBE_CONFIG="/etc/kubernetes/kube-controller-manager.kubeconfig"
KUBE_APISERVER="https://192.168.80.45:6443"

kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/pki/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials kube-controller-manager \
--client-certificate=/etc/kubernetes/pki/kube-controller-manager.pem \
--client-key=/etc/kubernetes/pki/kube-controller-manager-key.pem \
--embed-certs=true \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
--cluster=kubernetes \
--user=kube-controller-manager \
--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}

5.3.2 Startup

--cluster-cidr=10.244.0.0/16: the Pod IP range

--service-cluster-ip-range=10.96.0.0/16: the Service IP range

KUBE_CONTROLLER_MANAGER_OPTS="--logtostderr=false \
--v=2 \
--log-dir=/var/log/kubernetes \
--leader-elect=true \
--kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \
--bind-address=127.0.0.1 \
--allocate-node-cidrs=true \
--cluster-cidr=10.244.0.0/16 \
--service-cluster-ip-range=10.96.0.0/16 \
--cluster-signing-cert-file=/etc/kubernetes/pki/ca.pem \
--cluster-signing-key-file=/etc/kubernetes/pki/ca-key.pem \
--root-ca-file=/etc/kubernetes/pki/ca.pem \
--service-account-private-key-file=/etc/kubernetes/pki/ca-key.pem \
--cluster-signing-duration=87600h0m0s"


cat > /lib/systemd/system/kube-controller-manager.service << EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/bin/kube-controller-manager $KUBE_CONTROLLER_MANAGER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start kube-controller-manager
systemctl status kube-controller-manager
systemctl enable kube-controller-manager

5.4 scheduler

5.4.1 kubeconfig

KUBE_CONFIG="/etc/kubernetes/kube-scheduler.kubeconfig"
KUBE_APISERVER="https://192.168.80.45:6443"

kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/pki/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials kube-scheduler \
--client-certificate=/etc/kubernetes/pki/kube-scheduler.pem \
--client-key=/etc/kubernetes/pki/kube-scheduler-key.pem \
--embed-certs=true \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
--cluster=kubernetes \
--user=kube-scheduler \
--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}

5.4.2 Startup

KUBE_SCHEDULER_OPTS="--logtostderr=false \
--v=2 \
--log-dir=/var/log/kubernetes \
--leader-elect \
--kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \
--bind-address=127.0.0.1"


cat > /lib/systemd/system/kube-scheduler.service << EOF
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/bin/kube-scheduler $KUBE_SCHEDULER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start kube-scheduler
systemctl status kube-scheduler
systemctl enable kube-scheduler

5.5 kubelet

5.5.1 Configuration

cat > /etc/kubernetes/kubelet-config.yml << EOF
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
port: 10250
readOnlyPort: 10255
cgroupDriver: cgroupfs
clusterDNS:
- 10.96.0.2
clusterDomain: cluster.local
failSwapOn: false
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.pem
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%
maxOpenFiles: 1000000
maxPods: 110
EOF

5.5.2 kubeconfig

BOOTSTRAP_TOKEN=$(cat /etc/kubernetes/token.csv | awk -F, '{print $1}')

KUBE_CONFIG="/etc/kubernetes/bootstrap.kubeconfig"
KUBE_APISERVER="https://192.168.80.45:6443"

# Generate the kubelet bootstrap kubeconfig file
kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/pki/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials "kubelet-bootstrap" \
--token=${BOOTSTRAP_TOKEN} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
--cluster=kubernetes \
--user="kubelet-bootstrap" \
--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}

5.5.3 Startup

Note: the file referenced by --kubeconfig=/etc/kubernetes/kubelet.kubeconfig does not exist yet; it is generated automatically when the node joins the cluster.

KUBELET_OPTS="--logtostderr=false \
--v=2 \
--log-dir=/var/log/kubernetes \
--hostname-override=k8s-master1 \
--network-plugin=cni \
--kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
--bootstrap-kubeconfig=/etc/kubernetes/bootstrap.kubeconfig \
--config=/etc/kubernetes/kubelet-config.yml \
--cert-dir=/etc/kubernetes/pki \
--pod-infra-container-image=mirrorgooglecontainers/pause-amd64:3.1"

cat > /lib/systemd/system/kubelet.service << EOF
[Unit]
Description=Kubernetes Kubelet
After=docker.service

[Service]
ExecStart=/usr/bin/kubelet $KUBELET_OPTS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start kubelet
systemctl status kubelet
systemctl enable kubelet

5.5.4 Join the Cluster

# List kubelet certificate signing requests
kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
node-csr-ghWG-AWFM9sxJbr5A-BIq9puVIRxfFHrQlwDjYbHba8 25s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Pending

# Approve the request
kubectl certificate approve node-csr-ghWG-AWFM9sxJbr5A-BIq9puVIRxfFHrQlwDjYbHba8

# Check the CSR again
kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
node-csr-ghWG-AWFM9sxJbr5A-BIq9puVIRxfFHrQlwDjYbHba8 53m kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Approved,Issued

# List nodes (NotReady is expected until a network plugin is deployed)
kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master1 NotReady <none> 4m8s v1.21.4

5.6 kube-proxy

5.6.1 Configuration

clusterCIDR: 10.96.0.0/16: the Service IP range, consistent with --service-cluster-ip-range on apiserver and controller-manager

cat > /etc/kubernetes/kube-proxy-config.yml << EOF
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
metricsBindAddress: 0.0.0.0:10249
clientConnection:
  kubeconfig: /etc/kubernetes/kube-proxy.kubeconfig
hostnameOverride: k8s-master1
clusterCIDR: 10.96.0.0/16
EOF

5.6.2 kubeconfig File

KUBE_CONFIG="/etc/kubernetes/kube-proxy.kubeconfig"
KUBE_APISERVER="https://192.168.80.45:6443"

kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/pki/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials kube-proxy \
--client-certificate=/etc/kubernetes/pki/kube-proxy.pem \
--client-key=/etc/kubernetes/pki/kube-proxy-key.pem \
--embed-certs=true \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
--cluster=kubernetes \
--user=kube-proxy \
--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}

5.6.3 Startup

KUBE_PROXY_OPTS="--logtostderr=false \
--v=2 \
--proxy-mode=iptables \
--log-dir=/var/log/kubernetes \
--config=/etc/kubernetes/kube-proxy-config.yml"

cat > /lib/systemd/system/kube-proxy.service << EOF
[Unit]
Description=Kubernetes Proxy
After=network.target

[Service]
ExecStart=/usr/bin/kube-proxy $KUBE_PROXY_OPTS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start kube-proxy
systemctl status kube-proxy
systemctl enable kube-proxy

5.7 Cluster Management

5.7.1 Cluster Configuration

kubectl config view

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://192.168.80.45:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: cluster-admin
  name: default
current-context: default
kind: Config
preferences: {}
users:
- name: cluster-admin
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED

5.7.2 Cluster Status

kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-1 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
etcd-0 Healthy {"health":"true"}

5.8 Command Completion

apt install -y bash-completion
source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc

6. Network Plugins

The IP ranges used here must match --cluster-cidr configured on kube-controller-manager.

6.1 CNI Plugins

Run on every node.

mkdir -p $HOME/k8s-install/network && cd $_
wget https://github.com/containernetworking/plugins/releases/download/v0.9.1/cni-plugins-linux-amd64-v0.9.1.tgz

mkdir -p /opt/cni/bin
tar zxvf cni-plugins-linux-amd64-v0.9.1.tgz -C /opt/cni/bin
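
A quick check that the plugins landed where the CNI binary directory is expected (the exact file list depends on the release):

ls /opt/cni/bin    # expect binaries such as bridge, host-local, loopback, portmap, ...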

6.2 Calico

Calico is a pure layer-3 data center networking solution and currently the mainstream network choice for Kubernetes.

mkdir -p $HOME/k8s-install/network && cd $HOME/k8s-install/network

# 1. Download the manifest
wget https://docs.projectcalico.org/manifests/calico.yaml

# The CIDR value must match --cluster-cidr=10.244.0.0/16 on kube-controller-manager
vi calico.yaml
3680             # The default IPv4 pool to create on startup if none exists. Pod IPs will be
3681             # chosen from this range. Changing this value after installation will have
3682             # no effect. This should fall within `--cluster-cidr`.
3683             - name: CALICO_IPV4POOL_CIDR
3684               value: "10.244.0.0/16"

# 2. Install the network plugin
kubectl apply -f calico.yaml

# 3. Check that the pods are running
kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-7f4f5bf95d-tgklk 1/1 Running 0 2m7s
calico-node-fwv5x 1/1 Running 0 2m8s
calico-node-ttt2c 1/1 Running 0 2m8s
calico-node-xjvjf 1/1 Running 0 2m8s

# 4. Node status is now Ready
kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master1 Ready master 65m v1.21.4
k8s-node01 Ready node 20m v1.21.4
k8s-node02 Ready node 20m v1.21.4

6.3 Flannel

mkdir -p $HOME/k8s-install/network && cd $HOME/k8s-install/network

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

# Confirm the Network value matches --cluster-cidr before applying
vi kube-flannel.yml
"Network": "10.244.0.0/16",

kubectl apply -f kube-flannel.yml

kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
kube-flannel-ds-8qnnx 1/1 Running 0 10s
kube-flannel-ds-979lc 1/1 Running 0 16m
kube-flannel-ds-kgmgg 1/1 Running 0 16m

kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master1 Ready master 85m v1.21.4
k8s-node01 Ready node 40m v1.21.4
k8s-node02 Ready node 40m v1.21.4

6.4 ovs-cni

6.4.1 Install Open vSwitch

apt install openvswitch-switch

6.4.2 Install Multus

Multus provides the NetworkAttachmentDefinition CRD, which is used here to define the OVS network configuration.

wget https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/master/images/multus-daemonset.yml

kubectl apply -f multus-daemonset.yml

6.4.3 Install ovs-cni

wget https://github.com/k8snetworkplumbingwg/ovs-cni/blob/main/manifests/ovs-cni.yml.in

# Reference configuration (adapted from the upstream template; the quoted heredoc keeps $(NODE_NAME) literal)
cat > ovs-cni.yml <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ovs-cni-amd64
  namespace: kube-system
  labels:
    tier: node
    app: ovs-cni
spec:
  selector:
    matchLabels:
      app: ovs-cni
  template:
    metadata:
      labels:
        tier: node
        app: ovs-cni
      annotations:
        description: OVS CNI allows users to attach their Pods/VMs to Open vSwitch bridges available on nodes
    spec:
      serviceAccountName: ovs-cni-marker
      hostNetwork: true
      nodeSelector:
        beta.kubernetes.io/arch: amd64
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
      initContainers:
        - name: ovs-cni-plugin
          image: quay.io/kubevirt/ovs-cni-plugin:latest
          command: ['cp', '/ovs', '/host/opt/cni/bin/ovs']
          imagePullPolicy: IfNotPresent
          securityContext:
            privileged: true
          volumeMounts:
            - name: cnibin
              mountPath: /host/opt/cni/bin
      containers:
        - name: ovs-cni-marker
          image: quay.io/kubevirt/ovs-cni-marker:latest
          imagePullPolicy: IfNotPresent
          securityContext:
            privileged: true
          args:
            - -node-name
            - $(NODE_NAME)
            - -ovs-socket
            - /host/var/run/openvswitch/db.sock
          volumeMounts:
            - name: ovs-var-run
              mountPath: /host/var/run/openvswitch
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
      volumes:
        - name: cnibin
          hostPath:
            path: /opt/cni/bin
        - name: ovs-var-run
          hostPath:
            path: /var/run/openvswitch
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ovs-cni-marker-cr
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
      - nodes/status
    verbs:
      - get
      - update
      - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ovs-cni-marker-crb
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ovs-cni-marker-cr
subjects:
  - kind: ServiceAccount
    name: ovs-cni-marker
    namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ovs-cni-marker
  namespace: kube-system
EOF

# 2. Install ovs-cni
kubectl apply -f ovs-cni.yml

6.4.4 Bridge

# 1. Create the bridge
ovs-vsctl add-br br1
ovs-vsctl show

# 2. Check that the node reports the bridge as a resource
kubectl describe node k8s-master1
...
Capacity:
  ovs-cni.network.kubevirt.io/br1: 1k # hard-coded default of 1k
...
Allocatable:
  ovs-cni.network.kubevirt.io/br1: 1k # hard-coded default of 1k

6.4.5 Additional Network Definition

cat > ovs-ipam-net.yml <<EOF
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: ovs-ipam-net
annotations:
k8s.v1.cni.cncf.io/resourceName: ovs-cni.network.kubevirt.io/br1
spec:
config: '{
"cniVersion": "0.3.1",
"type": "ovs",
"bridge": "br1",
"vlan": 100,
"ipam": {
"type": "static"
}
}'
EOF

kubectl apply -f ovs-ipam-net.yml
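
Multus registers NetworkAttachmentDefinition as a namespaced CRD, so the new attachment can be listed to confirm it was accepted:

kubectl get network-attachment-definitions    # should list ovs-ipam-net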

6.4.6 Verification

# 1. Create the test pods
cat > ovs-test.yml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: samplepod-1
  annotations:
    k8s.v1.cni.cncf.io/networks: '[
      {
        "name": "ovs-ipam-net",
        "ips": ["10.10.10.1/24"]
      }
    ]'
spec:
  containers:
  - name: pod-1
    command: ["sleep", "99999"]
    image: alpine
---
apiVersion: v1
kind: Pod
metadata:
  name: samplepod-2
  annotations:
    k8s.v1.cni.cncf.io/networks: '[
      {
        "name": "ovs-ipam-net",
        "ips": ["10.10.10.2/24"]
      }
    ]'
spec:
  containers:
  - name: pod-2
    command: ["sleep", "99999"]
    image: alpine
EOF

# 2. Apply the manifest
kubectl apply -f ovs-test.yml

# 3. Connectivity test
kubectl exec -it samplepod-1 -- ping 10.10.10.2 -c 5
PING 10.10.10.2 (10.10.10.2): 56 data bytes
64 bytes from 10.10.10.2: seq=0 ttl=127 time=10.266 ms
64 bytes from 10.10.10.2: seq=1 ttl=127 time=7.423 ms
64 bytes from 10.10.10.2: seq=2 ttl=127 time=7.265 ms
64 bytes from 10.10.10.2: seq=3 ttl=127 time=14.498

7. Worker Nodes

Kubernetes worker node components:

  • kubelet
  • kube-proxy

7.1 Clone Preparation (run on the master)

mkdir -p $HOME/k8s-install && cd $HOME/k8s-install

tar cvf worker-node-clone.tar /usr/bin/{kubelet,kube-proxy} /lib/systemd/system/{kubelet,kube-proxy}.service /etc/kubernetes/kubelet* /etc/kubernetes/kube-proxy* /etc/kubernetes/pki /etc/kubernetes/bootstrap.kubeconfig

scp worker-node-clone.tar ubuntu@192.168.80.46:/home/ubuntu
scp worker-node-clone.tar ubuntu@192.168.80.47:/home/ubuntu

7.2 Clone onto the Worker Nodes

cd / && mv /home/ubuntu/worker-node-clone.tar / && tar xvf worker-node-clone.tar && rm -f worker-node-clone.tar

# Remove the files that were auto-generated after CSR approval; they will be regenerated
rm -f /etc/kubernetes/kubelet.kubeconfig
rm -f /etc/kubernetes/pki/kubelet*

# Log directory
mkdir -p /var/log/kubernetes

7.3 Adjust the Configuration

Change these to the actual node name.

# kubelet
vi /lib/systemd/system/kubelet.service
--hostname-override=k8s-node01

# kube-proxy
vi /etc/kubernetes/kube-proxy-config.yml
hostnameOverride: k8s-node01

7.4 Startup

systemctl daemon-reload
systemctl start kubelet kube-proxy
systemctl status kubelet kube-proxy
systemctl enable kubelet kube-proxy

7.5 Join the Cluster (run on the master)

# 1. Pending certificate requests
kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
node-csr-ghWG-AWFM9sxJbr5A-BIq9puVIRxfFHrQlwDjYbHba8 94m kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Approved,Issued
node-csr-r2GF_8R3zuUe9BCf6eHeijWnzyPDDy-6WQUFOrOAQjA 34s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Pending
node-csr-wvcKDHm38jQgjyaLiA_G2ycc2Qvmecf_iRRd9IqlSEw 97s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Pending

# 2. Approve the requests
kubectl certificate approve node-csr-r2GF_8R3zuUe9BCf6eHeijWnzyPDDy-6WQUFOrOAQjA
kubectl certificate approve node-csr-wvcKDHm38jQgjyaLiA_G2ycc2Qvmecf_iRRd9IqlSEw

# 3. Cluster nodes
kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master1 NotReady <none> 45m v1.21.4
k8s-node01 NotReady <none> 6s v1.21.4
k8s-node02 NotReady <none> 10s v1.21.4

# 4. Label the nodes, i.e. assign node roles
kubectl label node k8s-master1 node-role.kubernetes.io/master=
kubectl label node k8s-node01 node-role.kubernetes.io/node=
kubectl label node k8s-node02 node-role.kubernetes.io/node=

kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master1 NotReady master 49m v1.21.4
k8s-node01 NotReady node 3m45s v1.21.4
k8s-node02 NotReady node 3m49s v1.21.4

# 5. Taint the master so that regular pods are not scheduled on it
kubectl taint nodes k8s-master1 node-role.kubernetes.io/master=:NoSchedule

kubectl describe node k8s-master1
Taints: node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoSchedule

8. Addons

8.1 CoreDNS

CoreDNS provides Service name resolution inside the cluster.

mkdir -p $HOME/k8s-install/coredns && cd $HOME/k8s-install/coredns

wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/coredns.yaml.sed
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/deploy.sh

chmod +x deploy.sh

export CLUSTER_DNS_SVC_IP="10.96.0.2"
export CLUSTER_DNS_DOMAIN="cluster.local"

./deploy.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} | kubectl apply -f -

# Check status
kubectl get pods -n kube-system | grep coredns
coredns-746fcb4bc5-nts2k 1/1 Running 0 6m2s

# Verify (busybox 1.33.1 has DNS resolution issues; use 1.28.4)
kubectl run -it --rm dns-test --image=busybox:1.28.4 /bin/sh
If you don't see a command prompt, try pressing enter.
/ # nslookup kubernetes
Server: 10.96.0.2
Address: 10.96.0.2:53

Name: kubernetes.default.svc.cluster.local
Address: 10.0.0.1

DNS troubleshooting:

# dns service
kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.2 <none> 53/UDP,53/TCP,9153/TCP 13m

# Are the endpoints healthy?
kubectl get endpoints kube-dns -n kube-system
NAME ENDPOINTS AGE
kube-dns 10.244.85.194:53,10.244.85.194:53,10.244.85.194:9153 13m

# Add query logging to CoreDNS
CoreDNS configuration parameters:
errors: log errors to stdout.
health: health check for CoreDNS at http://localhost:8080/health; if unhealthy, the Pod is restarted.
ready: once all plugins have loaded, HTTP 200 is returned on port 8081 via the readiness endpoint.
kubernetes: CoreDNS answers DNS queries based on Kubernetes Service and Pod IPs.
prometheus: whether to expose CoreDNS metrics; when configured they are served at http://localhost:9153/metrics
forward: any query for a domain outside the Kubernetes cluster is forwarded to the predefined resolvers (/etc/resolv.conf).
cache: enable caching with a 30-second TTL.
loop: detect simple forwarding loops and stop the CoreDNS process if one is found.
reload: watch the CoreDNS configuration and reload it when it changes.
loadbalance: DNS load balancer, round_robin by default.

# Edit the coredns ConfigMap
kubectl edit configmap coredns -n kube-system
apiVersion: v1
data:
  Corefile: |
    .:53 {
        log    # newly added
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"Corefile":".:53 {\n errors\n health {\n lameduck 5s\n }\n ready\n kubernetes cluster.local in-addr.arpa ip6.arpa {\n fallthrough in-addr.arpa ip6.arpa\n }\n prometheus :9153\n forward . /etc/resolv.conf {\n max_concurrent 1000\n }\n cache 30\n loop\n reload\n loadbalance\n}\n"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"coredns","namespace":"kube-system"}}
  creationTimestamp: "2021-05-13T11:57:45Z"
  name: coredns
  namespace: kube-system
  resourceVersion: "38460"
  selfLink: /api/v1/namespaces/kube-system/configmaps/coredns
  uid: c62a856d-1fc3-4fe9-b5f1-3ca0dbeb39c1

Rollback:

wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/rollback.sh
chmod +x rollback.sh

export CLUSTER_DNS_SVC_IP="10.96.0.2"
export CLUSTER_DNS_DOMAIN="cluster.local"

./rollback.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} | kubectl apply -f -

kubectl delete --namespace=kube-system deployment coredns

8.2 Dashboard

mkdir -p $HOME/k8s-install/dashboard && cd $HOME/k8s-install/dashboard

# 1. Download and install
curl https://raw.githubusercontent.com/kubernetes/dashboard/v2.2.0/aio/deploy/recommended.yaml -o dashboard.yaml

kubectl apply -f dashboard.yaml

# 2. Check pod status
kubectl get pods -n kubernetes-dashboard -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
dashboard-metrics-scraper-79c5968bdc-xkm78 1/1 Running 0 23m 10.244.159.129 k8s-master1 <none> <none>
kubernetes-dashboard-9f9799597-d8g8t 1/1 Running 0 23m 10.244.58.193 k8s-node02 <none> <none>

# 3. Check service status
kubectl get svc -n kubernetes-dashboard -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
dashboard-metrics-scraper ClusterIP 10.96.14.1 <none> 8000/TCP 24m k8s-app=dashboard-metrics-scraper
kubernetes-dashboard ClusterIP 10.96.219.125 <none> 443/TCP 24m k8s-app=kubernetes-dashboard


# 4. Change the service type to NodePort
kubectl edit svc kubernetes-dashboard -n kubernetes-dashboard
type: ClusterIP => type: NodePort

kubectl get svc -n kubernetes-dashboard -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
dashboard-metrics-scraper ClusterIP 10.96.14.1 <none> 8000/TCP 3h30m k8s-app=dashboard-metrics-scraper
kubernetes-dashboard NodePort 10.96.219.125 <none> 443:31639/TCP 3h30m k8s-app=kubernetes-dashboard

# 5. Create a service account and bind it to the built-in cluster-admin cluster role:
kubectl create serviceaccount dashboard-admin -n kube-system
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin

# 6. Get the access token
kubectl describe secrets -n kube-system $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}')
Name: dashboard-admin-token-xwd72
Namespace: kube-system
Labels: <none>
Annotations: kubernetes.io/service-account.name: dashboard-admin
kubernetes.io/service-account.uid: 013e9f84-827f-4dc7-81b3-874a28bfebc6

Type: kubernetes.io/service-account-token

Data
====
ca.crt: 1310 bytes
namespace: 11 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6InNQRElCQTlPRUZ5SU54STQ1QWllLXlKMTFCcmZieG0wVTJnRlpzYlBNLXcifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4teHdkNzIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiMDEzZTlmODQtODI3Zi00ZGM3LTgxYjMtODc0YTI4YmZlYmM2Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.O-DI-0IlLFP2pDRKzQYJrZeDAnVvW1IjU-iVwGzvwID7BH0v6kXfWnti07qm8VkuGFJtpuQsmrf6v4sUeRDhr95kZlEVV8Rxnes6oixrkXdk3fR4xreh4lh6ZgCzbER6xI8pMG-j9KNjTRdY6gQPJuOThtI9ab13dpTT5AYpggA2O98DFfgcJ_DzD05hhk6TghOdoro00msHRSUrsEiH0CYa_3PiyPlkvmmY3MlJPsBTdO2pCDzcrjQ2L5EaJAvSh6OodkRY6ymOwfcbfPs3WwSocCEfwkogYOCAQhMC4NU3Jea_hoeFqzLdS1PK5R2rPT-wqemwjDKn0E6jUv6juw

# 7. Access the dashboard
https://192.168.80.45:31639

9. High Availability

Role IP Components Notes
k8s-master1 192.168.80.45 etcd, api-server, controller-manager, scheduler, kubelet, kube-proxy, docker
k8s-node01 192.168.80.46 etcd, kubelet, kube-proxy, docker
k8s-node02 192.168.80.47 etcd, kubelet, kube-proxy, docker
k8s-master2 192.168.80.49 etcd, api-server, controller-manager, scheduler, kubelet, kube-proxy, docker newly added node

9.1 Preparation (on Master-1)

9.1.1 Update the kube-apiserver Certificate

The following is only needed when the new node's IP is not already included in the certificate:
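
Whether regeneration is actually needed can be checked by decoding the SANs of the certificate currently in use, for example with cfssl-certinfo from section 3.1:

cfssl-certinfo -cert /etc/kubernetes/pki/kube-apiserver.pem | grep -A 20 '"sans"'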

mkdir -p /root/ssl && cd /root/ssl

# 1. Certificate signing request file
cat > kube-apiserver-csr.json <<EOF
{
"CN": "kubernetes",
"hosts": [
"127.0.0.1",
"localhost",
"192.168.80.1",
"192.168.80.2",
"192.168.80.3",
"192.168.80.45",
"192.168.80.46",
"192.168.80.47",
"192.168.80.48",
"192.168.80.49",
"10.96.0.1",
"kubernetes",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-apiserver-csr.json | cfssljson -bare kube-apiserver

# 3. Install the updated certificate (file names match the --tls-cert-file / --tls-private-key-file flags)
cp kube-apiserver*.pem /etc/kubernetes/pki
scp kube-apiserver*.pem ubuntu@192.168.80.46:/home/ubuntu
scp kube-apiserver*.pem ubuntu@192.168.80.47:/home/ubuntu

# 4. On the nodes, install the updated certificate
chown root:root /home/ubuntu/kube-apiserver*.pem
mv /home/ubuntu/kube-apiserver*.pem /etc/kubernetes/pki

# 5. Restart apiserver
systemctl restart kube-apiserver
systemctl status kube-apiserver

9.1.2 Add Host Entries

Run on k8s-master1, k8s-node01, and k8s-node02:

echo '192.168.80.49  k8s-master2' >> /etc/hosts

9.2 Add a Master

9.2.1 Initialization

# 1. Set the hostname
hostnamectl set-hostname k8s-master2

# 2. Hostname resolution
cat >> /etc/hosts <<EOF
192.168.80.45 k8s-master1
192.168.80.46 k8s-node01
192.168.80.47 k8s-node02
192.168.80.49 k8s-master2
EOF

# 3. Disable swap
swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

# 4. Pass bridged IPv4 traffic to the iptables chains
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system

# 5. DNS resolver
echo "nameserver 8.8.8.8" >> /etc/resolv.conf

# 6. Time synchronization
apt install ntpdate -y
ntpdate ntp1.aliyun.com

crontab -e
*/30 * * * * /usr/sbin/ntpdate -u ntp1.aliyun.com >> /var/log/ntpdate.log 2>&1

# 7. Log directory
mkdir -p /var/log/kubernetes

9.2.2 Clone

# 1. Run on k8s-master1
mkdir -p $HOME/k8s-install && cd $HOME/k8s-install
tar zcvf master-node-clone.tar.gz /usr/bin/kube* /lib/systemd/system/kube*.service /etc/kubernetes /root/.kube/config /usr/bin/docker* /usr/bin/runc /usr/bin/containerd* /usr/bin/ctr /etc/docker /lib/systemd/system/docker.service

scp master-node-clone.tar.gz ubuntu@192.168.80.49:/home/ubuntu

# 2. Run on k8s-master2
cd / && mv /home/ubuntu/master-node-clone.tar.gz / && tar zxvf master-node-clone.tar.gz && rm -f master-node-clone.tar.gz

rm -f /etc/kubernetes/kubelet.kubeconfig
rm -f /etc/kubernetes/pki/kubelet*

9.2.3 Update the Configuration

# In this setup the apiserver flags are embedded in the systemd unit, so edit them there
vi /lib/systemd/system/kube-apiserver.service
--bind-address=192.168.80.49 \
--advertise-address=192.168.80.49 \

# The kubelet's --hostname-override also lives in its unit file
sed -i 's#k8s-master1#k8s-master2#' /lib/systemd/system/kubelet.service

sed -i 's#k8s-master1#k8s-master2#' /etc/kubernetes/*

sed -i 's#192.168.80.45:6443#192.168.80.49:6443#' /etc/kubernetes/*

vi /root/.kube/config
server: https://192.168.80.49:6443

9.2.4 Startup

systemctl daemon-reload
systemctl start docker kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy
systemctl status docker kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy
systemctl enable docker kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy

9.2.5 Cluster Status

kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-2 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
etcd-0 Healthy {"health":"true"}

9.2.6 Join the Cluster

kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
node-csr-HfzAqSEc7sIIG9QFHip4vGFnFZhyZnYjBVGWQyGpz54 7m49s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Pending


# Approve the request
kubectl certificate approve node-csr-HfzAqSEc7sIIG9QFHip4vGFnFZhyZnYjBVGWQyGpz54

kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master1 Ready master 27h v1.21.4
k8s-master2 NotReady <none> 11s v1.21.4
k8s-node01 Ready node 27h v1.21.4
k8s-node02 Ready node 27h v1.21.4

9.2.7 Labels and Taints

# Label the node
kubectl label node k8s-master2 node-role.kubernetes.io/master=

# Taint the master so that regular pods are not scheduled on it
kubectl taint nodes k8s-master2 node-role.kubernetes.io/master=:NoSchedule

# Node details
kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
k8s-master1 Ready master 27h v1.21.4 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master1,kubernetes.io/os=linux,node-role.kubernetes.io/master=
k8s-master2 Ready master 2m13s v1.21.4 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master2,kubernetes.io/os=linux,node-role.kubernetes.io/master=
k8s-node01 Ready node 27h v1.21.4 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node01,kubernetes.io/os=linux,node-role.kubernetes.io/node=
k8s-node02 Ready node 27h v1.21.4 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node02,kubernetes.io/os=linux,node-role.kubernetes.io/node=

9.3 HA Load Balancing

[Figure: Nginx + Keepalived load-balancer topology]

Nginx: a mainstream web server and reverse proxy; here it load-balances the apiservers at layer 4 (TCP).

Keepalived: a mainstream high-availability tool that provides active/standby failover by binding a VIP. Keepalived decides whether to fail over (move the VIP) based on the state of Nginx: if the Nginx master node goes down, the VIP is automatically bound on the Nginx backup node, so the VIP stays reachable and Nginx remains highly available.

Server plan:

Role IP Components
k8s-master1 192.168.80.45 kube-apiserver
k8s-master2 192.168.80.49 kube-apiserver
k8s-loadbalancer1 192.168.80.2 nginx, keepalived
k8s-loadbalancer2 192.168.80.3 nginx, keepalived
VIP 192.168.80.1 virtual IP

9.3.1 Install Packages

apt install nginx keepalived -y

9.3.2 Configure Nginx

cat > /etc/nginx/nginx.conf << "EOF"
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

include /usr/share/nginx/modules/*.conf;

events {
worker_connections 1024;
}

stream {

log_format main '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';

access_log /var/log/nginx/k8s-access.log main;

upstream k8s-apiserver {
server 192.168.80.45:6443; # Master1 APISERVER IP:PORT
server 192.168.80.49:6443; # Master2 APISERVER IP:PORT
}

server {
listen 16443;
proxy_pass k8s-apiserver;
}
}

http {
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';

access_log /var/log/nginx/access.log main;

sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;

include /etc/nginx/mime.types;
default_type application/octet-stream;

server {
listen 80 default_server;
server_name _;

location / {
}
}
}
EOF

9.3.3 Keepalived Configuration (master)

cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id NGINX_MASTER
}

# Health-check script
vrrp_script check_nginx {
script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 {
state MASTER
interface ens33 # change to the actual NIC name
virtual_router_id 51 # VRRP router ID; must be unique per instance
priority 100 # priority; set the backup server to 90
advert_int 1 # VRRP advertisement (heartbeat) interval, default 1s
authentication {
auth_type PASS
auth_pass 1111
}
# Virtual IP
virtual_ipaddress {
192.168.80.1/24
}
track_script {
check_nginx
}
}
EOF

9.3.4 Keepalived Configuration (backup)

cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id NGINX_BACKUP
}

# Health-check script
vrrp_script check_nginx {
script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 {
state BACKUP
interface ens33 # change to the actual NIC name
virtual_router_id 51 # VRRP router ID; must be unique per instance
priority 90 # priority; the backup is set lower than the master's 100
advert_int 1 # VRRP advertisement (heartbeat) interval, default 1s
authentication {
auth_type PASS
auth_pass 1111
}
# Virtual IP
virtual_ipaddress {
192.168.80.1/24
}
track_script {
check_nginx
}
}
EOF

9.3.5 Keepalived Health-Check Script

cat > /etc/keepalived/check_nginx.sh  << "EOF"
#!/bin/bash
count=$(ss -antp |grep 16443 |egrep -cv "grep|$$")

if [ "$count" -eq 0 ];then
exit 1
else
exit 0
fi
EOF

chmod +x /etc/keepalived/check_nginx.sh
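
Failover can be rehearsed before relying on it: stop Nginx on the MASTER load balancer, the check script starts failing, and Keepalived should release the VIP to the backup. A rough test (ens33 and 192.168.80.1 follow this guide's values):

# On the MASTER load balancer
systemctl stop nginx
ip addr show ens33    # 192.168.80.1 should disappear here ...

# On the BACKUP load balancer
ip addr show ens33    # ... and appear here within a few seconds

# Restore
systemctl start nginx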

9.3.6 Start the Services

systemctl daemon-reload
systemctl start nginx keepalived
systemctl enable nginx keepalived

9.3.7 Status Check

ip addr

curl -k https://192.168.80.1:16443/version
{
"major": "1",
"minor": "19",
"gitVersion": "v1.21.4",
"gitCommit": "c6a2f08fc4378c5381dd948d9ad9d1080e3e6b33",
"gitTreeState": "clean",
"buildDate": "2021-05-12T12:19:22Z",
"goVersion": "go1.15.12",
"compiler": "gc",
"platform": "linux/amd64"
}

9.3.8 Point the Worker Nodes at the LB VIP

sed -i 's#192.168.80.45:6443#192.168.80.1:16443#' /etc/kubernetes/*
systemctl restart kubelet kube-proxy

kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master1 Ready master 3d17h v1.21.4
k8s-master2 Ready master 2d16h v1.21.4
k8s-node01 Ready node 3d15h v1.21.4
k8s-node02 Ready node 3d15h v1.21.4

10. Removing a Node

# 1. On k8s-master2, stop the kubelet
systemctl stop kubelet

# 2. Check that k8s-master2 has gone NotReady
kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master1 Ready master 40h v1.21.4
k8s-master2 NotReady master 12h v1.21.4
k8s-node01 Ready node 40h v1.21.4
k8s-node02 Ready node 40h v1.21.4

# 3. Drain the node
kubectl drain k8s-master2
node/k8s-master2 cordoned
error: unable to drain node "k8s-master2", aborting command...

There are pending nodes to be drained:
k8s-master2
error: cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): kube-system/calico-node-lwj2r

# 4. Drain again, ignoring DaemonSets
kubectl drain k8s-master2 --ignore-daemonsets
node/k8s-master2 already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-lwj2r
node/k8s-master2 drained

# 5. Drained state
kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master1 Ready master 40h v1.21.4
k8s-master2 Ready,SchedulingDisabled master 12h v1.21.4
k8s-node01 Ready node 39h v1.21.4
k8s-node02 Ready node 39h v1.21.4

# 6. Undo (if needed)
kubectl uncordon k8s-master2
node/k8s-master2 uncordoned

kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master1 Ready master 40h v1.21.4
k8s-master2 Ready master 12h v1.21.4
k8s-node01 Ready node 39h v1.21.4
k8s-node02 Ready node 39h v1.21.4

# 7. Delete the node entirely
kubectl delete node k8s-master2

kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master1 Ready master 41h v1.21.4
k8s-node01 Ready node 40h v1.21.4
k8s-node02 Ready node 40h v1.21.4

Z. Appendix

Component log levels

--v=0 Generally useful for this to ALWAYS be visible to an operator.
--v=1 A reasonable default log level if you don’t want verbosity.
--v=2 Useful steady state information about the service and important log messages that may correlate to significant changes in the system. This is the recommended default log level for most systems.
--v=3 Extended information about changes.
--v=4 Debug level verbosity.
--v=6 Display requested resources.
--v=7 Display HTTP request headers.
--v=8 Display HTTP request contents
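
In this setup the component flags are expanded directly into the systemd unit files, so raising a component's verbosity is a matter of editing its unit and restarting; a sketch for the kubelet (adjust the unit name for other components, and revert afterwards):

sed -i 's/--v=2/--v=4/' /lib/systemd/system/kubelet.service
systemctl daemon-reload && systemctl restart kubelet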