# 资源详细解释
$ kubectl explain svc
KIND:     Service
VERSION:  v1

DESCRIPTION:
     Service is a named abstraction of software service (for example, mysql)
     consisting of local port (for example 3306) that the proxy listens on, and
     the selector that determines which pods will answer requests sent through
     the proxy.

FIELDS:
   apiVersion   <string>
     APIVersion defines the versioned schema of this representation of an
     object. Servers should convert recognized schemas to the latest internal
     value, and may reject unrecognized values. More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources

   kind   <string>
     Kind is a string value representing the REST resource this object
     represents. Servers may infer this from the endpoint the client submits
     requests to. Cannot be updated. In CamelCase. More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds

   metadata   <Object>
     Standard object's metadata. More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata

   spec   <Object>
     Spec defines the behavior of a service.
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status

   status   <Object>
     Most recently observed status of the service. Populated by the system.
     Read-only. More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status

# 集群支持的API版本
$ kubectl api-versions
admissionregistration.k8s.io/v1
admissionregistration.k8s.io/v1beta1
apiextensions.k8s.io/v1
apiextensions.k8s.io/v1beta1
apiregistration.k8s.io/v1
apiregistration.k8s.io/v1beta1
apps/v1
authentication.k8s.io/v1
authentication.k8s.io/v1beta1
authorization.k8s.io/v1
authorization.k8s.io/v1beta1
autoscaling/v1
autoscaling/v2beta1
autoscaling/v2beta2
batch/v1
batch/v1beta1
certificates.k8s.io/v1
certificates.k8s.io/v1beta1
coordination.k8s.io/v1
coordination.k8s.io/v1beta1
discovery.k8s.io/v1beta1
events.k8s.io/v1
events.k8s.io/v1beta1
extensions/v1beta1
networking.k8s.io/v1
networking.k8s.io/v1beta1
node.k8s.io/v1beta1
policy/v1beta1
rbac.authorization.k8s.io/v1
rbac.authorization.k8s.io/v1beta1
scheduling.k8s.io/v1
scheduling.k8s.io/v1beta1
storage.k8s.io/v1
storage.k8s.io/v1beta1
v1
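kubectl explain 还支持用点号逐级展开嵌套字段,下面的字段路径仅作示例:

# 查看 Service 端口相关字段的说明
kubectl explain svc.spec.ports
# 继续下钻到某个具体子字段
kubectl explain svc.spec.ports.targetPort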
# 不满足依旧创建成功
kubectl get pod -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE        NOMINATED NODE   READINESS GATES
preferred-affinity-pod   1/1     Running   0          61s   10.244.2.18   k8s-node1   <none>           <none>
# 不满足无法创建成功
kubectl get pod -o wide
NAME                    READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
required-affinity-pod   0/1     Pending   0          18s   <none>   <none>   <none>           <none>
# 创建成功
kubectl get pod -o wide
NAME                    READY   STATUS    RESTARTS   AGE     IP            NODE        NOMINATED NODE   READINESS GATES
required-affinity-pod   1/1     Running   0          2m31s   10.244.2.19   k8s-node1   <none>           <none>
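preferred 与 required 两种节点亲和的区别,大致可以用下面的片段说明(标签键值均为假设,并非前面实验所用的原始清单):

# 示意:required 条件不满足时 Pod 一直 Pending,preferred 条件不满足时仍可被调度(标签键值为假设)
apiVersion: v1
kind: Pod
metadata:
  name: required-affinity-pod
spec:
  containers:
  - name: app
    image: nginx
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values: ["ssd"]
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values: ["ssd"]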
# 无法正常创建
kubectl get pod
NAME             READY   STATUS    RESTARTS   AGE
toleration-pod   0/1     Pending   0          5s
kubectl describe pod toleration-pod
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  26s (x2 over 26s)  default-scheduler  0/4 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) had taint {node-type: }, that the pod didn't tolerate.
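要让 Pod 能调度到带有上面报错中 node-type 污点的节点,需要为 Pod 增加对应的 tolerations,大致写法如下(effect 为假设,需与 kubectl describe node 中看到的实际污点一致):

# 示意:容忍 node-type 污点的 Pod(镜像与 effect 均为假设)
apiVersion: v1
kind: Pod
metadata:
  name: toleration-pod
spec:
  containers:
  - name: app
    image: nginx
  tolerations:
  - key: "node-type"
    operator: "Exists"
    effect: "NoSchedule"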
# 调度成功
kubectl get pod -o wide
NAME             READY   STATUS    RESTARTS   AGE   IP            NODE        NOMINATED NODE   READINESS GATES
toleration-pod   1/1     Running   0          11m   10.244.2.21   k8s-node1   <none>           <none>
4. Kubelet
每个Node节点上都运行一个 Kubelet 服务进程,默认监听 10250 端口,接收并执行 Master 发来的指令,管理 Pod 及 Pod 中的容器。每个 Kubelet 进程会在 API Server 上注册所在Node节点的信息,定期向 Master 节点汇报该节点的资源使用情况,并通过 cAdvisor 监控节点和容器的资源。可以把kubelet理解成【Server-Agent】架构中的agent,是Node上的Pod管家。
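可以在节点上用下面的命令确认 kubelet 正在运行并监听默认端口(命令仅作示例,端口以实际部署配置为准):

# 查看 kubelet 服务状态(systemd 托管的集群)
systemctl status kubelet
# 确认默认的 10250 端口处于监听状态
ss -lntp | grep 10250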
所有不经过 API Server、而是由 kubelet 直接管理的 Pod 都叫 Static Pod。Kubelet 会将 Static Pod 的状态汇报给 API Server,API Server 为其创建一个与之匹配的 Mirror Pod。Mirror Pod 的状态真实反映 Static Pod 的状态;当 Static Pod 被删除时,与之相对应的 Mirror Pod 也会被删除。
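下面是一个 Static Pod 的最小示例:把清单放进 kubelet 的静态 Pod 目录即可被自动创建(目录以 kubeadm 默认的 /etc/kubernetes/manifests 为例,文件名与镜像均为示意):

# /etc/kubernetes/manifests/static-web.yaml —— kubelet 监测到该文件后会直接创建 Pod
apiVersion: v1
kind: Pod
metadata:
  name: static-web
  labels:
    role: static
spec:
  containers:
  - name: web
    image: nginx
    ports:
    - containerPort: 80

通过 kubectl get pod 看到的是名为 static-web-<节点名> 的 Mirror Pod;删除 Mirror Pod 并不会停止 Static Pod,只有删除清单文件才会。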
4.3 cAdvisor 资源监控
资源监控级别:容器,Pod,Service,整个集群
Heapster:为 k8s 提供集群级别的监控平台,是集群级别的监控和事件数据聚合器(Aggregator)。它以 Pod 方式运行在集群中,通过 kubelet 发现集群中所有的节点,并从这些节点收集资源使用情况;kubelet 则通过 cAdvisor 获取其所在节点及节点上容器的数据。Heapster 将这些信息按与之关联的 Pod 标签分组,再推送到可配置的后端,用于存储和可视化展示。
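作为示例,可以通过 API Server 的节点代理接口直接查看某个节点上 kubelet 暴露的 cAdvisor 指标(节点名 k8s-node1 仅为示例):

# 通过 API Server 代理读取指定节点 kubelet 的 cAdvisor 指标
kubectl get --raw /api/v1/nodes/k8s-node1/proxy/metrics/cadvisor | head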
cat > kube-dns.yaml <<EOF
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "KubeDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.0.0.2
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  # replicas: not specified here:
  # 1. In order to make Addon Manager do not reconcile this replicas parameter.
  # 2. Default is 1.
  # 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
  strategy:
    rollingUpdate:
      maxSurge: 10%
      maxUnavailable: 0
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
      annotations:
        prometheus.io/port: "10054"
        prometheus.io/scrape: "true"
    spec:
      priorityClassName: system-cluster-critical
      securityContext:
        seccompProfile:
          type: RuntimeDefault
        supplementalGroups: [ 65534 ]
        fsGroup: 65534
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: k8s-app
                  operator: In
                  values: ["kube-dns"]
              topologyKey: kubernetes.io/hostname
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      volumes:
      - name: kube-dns-config
        configMap:
          name: kube-dns
          optional: true
      nodeSelector:
        kubernetes.io/os: linux
      containers:
      - name: kubedns
        image: k8s.gcr.io/dns/k8s-dns-kube-dns:1.17.3
        resources:
          # TODO: Set memory limits when we've profiled the container for large
          # clusters, then set request = limit to keep this container in
          # guaranteed class. Currently, this container falls into the
          # "burstable" category so the kubelet doesn't backoff from restarting it.
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        livenessProbe:
          httpGet:
            path: /healthcheck/kubedns
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /readiness
            port: 8081
            scheme: HTTP
          # we poll on pod startup for the Kubernetes master service and
          # only setup the /readiness HTTP server once that's available.
          initialDelaySeconds: 3
          timeoutSeconds: 5
        args:
        - --domain=cluster.local.
        - --dns-port=10053
        - --config-dir=/kube-dns-config
        - --v=2
        env:
        - name: PROMETHEUS_PORT
          value: "10055"
        ports:
        - containerPort: 10053
          name: dns-local
          protocol: UDP
        - containerPort: 10053
          name: dns-tcp-local
          protocol: TCP
        - containerPort: 10055
          name: metrics
          protocol: TCP
        volumeMounts:
        - name: kube-dns-config
          mountPath: /kube-dns-config
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsUser: 1001
          runAsGroup: 1001
      - name: dnsmasq
        image: k8s.gcr.io/dns/k8s-dns-dnsmasq-nanny:1.17.3
        livenessProbe:
          httpGet:
            path: /healthcheck/dnsmasq
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - -v=2
        - -logtostderr
        - -configDir=/etc/k8s/dns/dnsmasq-nanny
        - -restartDnsmasq=true
        - --
        - -k
        - --cache-size=1000
        - --no-negcache
        - --dns-loop-detect
        - --log-facility=-
        - --server=/cluster.local/127.0.0.1#10053
        - --server=/in-addr.arpa/127.0.0.1#10053
        - --server=/ip6.arpa/127.0.0.1#10053
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        # see: https://github.com/kubernetes/kubernetes/issues/29055 for details
        resources:
          requests:
            cpu: 150m
            memory: 20Mi
        volumeMounts:
        - name: kube-dns-config
          mountPath: /etc/k8s/dns/dnsmasq-nanny
        securityContext:
          capabilities:
            drop:
            - all
            add:
            - NET_BIND_SERVICE
            - SETGID
      - name: sidecar
        image: k8s.gcr.io/dns/k8s-dns-sidecar:1.17.3
        livenessProbe:
          httpGet:
            path: /metrics
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - --v=2
        - --logtostderr
        - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,SRV
        - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,SRV
        ports:
        - containerPort: 10054
          name: metrics
          protocol: TCP
        resources:
          requests:
            memory: 20Mi
            cpu: 10m
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsUser: 1001
          runAsGroup: 1001
      dnsPolicy: Default  # Don't use cluster DNS.
      serviceAccountName: kube-dns
EOF
kubectl apply -f kube-dns.yaml
kubectl get pod -n kube-system
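部署完成后,可以用一个临时 Pod 验证集群内 DNS 解析是否正常(busybox 镜像与被解析的 Service 名仅为常见示例):

# 在临时 Pod 中解析默认的 kubernetes Service,验证集群 DNS 是否可用
kubectl run dns-test --rm -it --image=busybox:1.28 --restart=Never -- nslookup kubernetes.default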
6.1.3 相关问题
1. 问题定位
# 发现问题
kubectl describe pod kube-dns-594c5b5cb5-mdxp6 -n kube-system
...
  Normal   Pulled     13m                    kubelet  Container image "k8s.gcr.io/dns/k8s-dns-kube-dns:1.17.3" already present on machine
  Warning  Unhealthy  12m (x2 over 13m)      kubelet  Liveness probe failed: HTTP probe failed with statuscode: 503
  Warning  Unhealthy  9m32s (x25 over 13m)   kubelet  Readiness probe failed: Get "http://10.244.2.28:8081/readiness": dial tcp 10.244.2.28:8081: connect: connection refused
  Warning  BackOff    4m30s (x19 over 10m)   kubelet  Back-off restarting failed container
# 查看容器日志
kubectl logs kube-dns-594c5b5cb5-mdxp6 kubedns -n kube-system
...
I0520 05:59:53.947378       1 server.go:195] Skydns metrics enabled (/metrics:10055)
I0520 05:59:53.947996       1 log.go:172] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0520 05:59:53.948005       1 log.go:172] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
E0520 05:59:53.957842       1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: Failed to list *v1.Service: services is forbidden: User "system:serviceaccount:kube-system:kube-dns" cannot list resource "services" in API group "" at the cluster scope: RBAC: clusterrole.rbac.authorization.k8s.io "system:kube-dns" not found
E0520 05:59:53.957894       1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: Failed to list *v1.Endpoints: endpoints is forbidden: User "system:serviceaccount:kube-system:kube-dns" cannot list resource "endpoints" in API group "" at the cluster scope: RBAC: clusterrole.rbac.authorization.k8s.io "system:kube-dns" not found
I0520 05:59:54.447988       1 dns.go:220] Waiting for [endpoints services] to be initialized from apiserver...
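从日志可以定位到问题:kube-dns 的 ServiceAccount 缺少名为 system:kube-dns 的 ClusterRole,无法 list/watch Service 和 Endpoints。一种修复思路大致如下(清单中的名称与规则依据上面的报错拟写,仅作示意;部分集群已自带该 ClusterRole,只需补建绑定即可):

# 示意:为 kube-system/kube-dns ServiceAccount 补充集群级读取权限
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:kube-dns
rules:
- apiGroups: [""]
  resources: ["endpoints", "services"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:kube-dns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-dns
subjects:
- kind: ServiceAccount
  name: kube-dns
  namespace: kube-system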
kubectl get pod -n kube-system -o wide | grep kube-dns
kube-dns-594c5b5cb5-6wttp   3/3     Running   0          13m   10.244.2.29   k8s-node1   <none>           <none>
kubectl describe pod kube-dns-594c5b5cb5-6wttp -n kube-system
...
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  48s   default-scheduler  Successfully assigned kube-system/kube-dns-594c5b5cb5-6wttp to k8s-node1
  Normal  Pulled     47s   kubelet            Container image "k8s.gcr.io/dns/k8s-dns-kube-dns:1.17.3" already present on machine
  Normal  Created    47s   kubelet            Created container kubedns
  Normal  Started    47s   kubelet            Started container kubedns
  Normal  Pulled     47s   kubelet            Container image "k8s.gcr.io/dns/k8s-dns-dnsmasq-nanny:1.17.3" already present on machine
  Normal  Created    47s   kubelet            Created container dnsmasq
  Normal  Started    47s   kubelet            Started container dnsmasq
  Normal  Pulled     47s   kubelet            Container image "k8s.gcr.io/dns/k8s-dns-sidecar:1.17.3" already present on machine
  Normal  Created    47s   kubelet            Created container sidecar
  Normal  Started    47s   kubelet            Started container sidecar
6.2 CoreDNS
CoreDNS 是 kube-dns 的升级版,效率更高,资源占用更小。
6.2.1 安装 coredns
wget https://github.com/coredns/deployment/archive/refs/tags/coredns-1.14.0.tar.gz
tar zxvf coredns-1.14.0.tar.gz
cd deployment-coredns-1.14.0/kubernetes
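解压后的 kubernetes 目录中包含 deploy.sh 脚本,可以用它生成 CoreDNS 清单再应用到集群(下面的 Service IP 与域名取自前文 kube-dns 的配置,仅作示例;脚本通常还依赖 jq):

# 用 deploy.sh 生成 CoreDNS 清单(-i 指定 DNS Service 的 ClusterIP,-d 指定集群域名)
./deploy.sh -i 10.0.0.2 -d cluster.local > coredns.yaml
kubectl apply -f coredns.yaml
kubectl get pod -n kube-system | grep coredns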