Kubernetes Troubleshooting Guide

Author: 张亚飞 | Read time: 1 minute | Published: 2019-10-28
Categories: Kubernetes | Tags: Kubernetes, Cilium

NodePort

Because of a public IP constraint, one Deployment in our k8s cluster was configured with both hostNetwork and nodeName:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: release-name-vloud-collections
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app.kubernetes.io/name: vloud-collections
      app.kubernetes.io/instance: release-name
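  template:
    metadata:
      labels:
        # Must match spec.selector.matchLabels, otherwise the Deployment is rejected
        app.kubernetes.io/name: vloud-collections
        app.kubernetes.io/instance: release-name
    spec:
      # The Pod shares the node's network namespace, so its ports bind on the host
      hostNetwork: true
      # Pins the Pod to this one node
      nodeName: bjy-idc-bdata-k8s-test01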
      containers:
        - name: vloud-collections
          image: "harbor.baijiayun.com/bdata/vloud-collections:release-test"
          imagePullPolicy: Always
          command:
            - "/app/vcollections"
            - "--env=test"

When we updated the image, a large number of Pods were created:

kubectl -n bdata set image deployments/vloud-collections vloud-collections=harbor.baijiayun.com/bdata/vloud-collections:release-test-bdda9f43

Listing the Pods across all namespaces shows the problem:

kubectl get pod --all-namespaces | grep vloud-c

bdata                          vloud-collections-69d8c56d48-b99bc                                1/1     Running     0                2d18h
bdata                          vloud-collections-796988d9fb-22dp5                                0/1     NodePorts   0                8m
bdata                          vloud-collections-796988d9fb-22vwh                                0/1     NodePorts   0                56s
bdata                          vloud-collections-796988d9fb-22w5n                                0/1     NodePorts   0                2m5s
bdata                          vloud-collections-796988d9fb-242jp                                0/1     NodePorts   0                6m6s
bdata                          vloud-collections-796988d9fb-24ztg                                0/1     NodePorts   0                3m15s
bdata                          vloud-collections-796988d9fb-25b8k                                0/1     NodePorts   0                5m39s
bdata                          vloud-collections-796988d9fb-25w58                                0/1     NodePorts   0                2m9s
bdata                          vloud-collections-796988d9fb-27445                                0/1     NodePorts   0                2m13s
bdata                          vloud-collections-796988d9fb-276pk                                0/1     NodePorts   0                2m31s
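
To see at a glance how many Pods are stuck, the listing can be tallied by its status column. The following is a minimal sketch, not part of the original write-up: `count_by_status` is a hypothetical helper name, and it assumes the six-column layout shown above, where the status is the fourth field.

```shell
# Hypothetical helper: tally Pods by the STATUS column, which is the 4th
# field of `kubectl get pod --all-namespaces` output.
count_by_status() {
  awk '{ count[$4]++ } END { for (s in count) print s, count[s] }' | sort
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pod --all-namespaces | grep vloud-c | count_by_status
```

For the reason behind a single rejected Pod, `kubectl -n bdata describe pod <pod-name>` prints its status and events.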

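The cause is the combination of hostNetwork, nodeName, and the update strategy, which defaults to RollingUpdate. With a single replica, a rolling update creates the new Pod before releasing the old one. Both Pods are pinned to the same node and use the host's ports, so the new Pod fails on a port conflict while the old Pod keeps running. The new Pod never comes up and the old one is never released, so k8s keeps creating replacement Pods.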

The fix is to set the update strategy to Recreate, which releases the old Pod first and then creates the new one:

spec:
  strategy:
    type: Recreate
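
If editing the manifest and redeploying is inconvenient, the same change can probably be applied to the live object with `kubectl patch`. This is a sketch under assumptions rather than the procedure used in the original incident: the function names are hypothetical, the namespace and Deployment name are taken from the `kubectl set image` command above, the rejected Pods are assumed to be in the Failed phase, and the label selector assumes the Pods carry the `app.kubernetes.io/name` label from the Deployment's selector.

```shell
# JSON merge patch: switch the strategy to Recreate and clear any
# rollingUpdate parameters, which are not allowed alongside type Recreate.
PATCH='{"spec":{"strategy":{"type":"Recreate","rollingUpdate":null}}}'

# Hypothetical helper. $1 = namespace, $2 = Deployment name.
apply_recreate() {
  kubectl -n "$1" patch deployment "$2" --type merge -p "$PATCH"
}

# Hypothetical helper. $1 = namespace, $2 = value of app.kubernetes.io/name.
# Deletes the leftover Pods, assuming they are in the Failed phase.
cleanup_failed() {
  kubectl -n "$1" delete pod \
    -l "app.kubernetes.io/name=$2" \
    --field-selector=status.phase=Failed
}

# Usage against a live cluster (requires kubectl access):
#   apply_recreate bdata vloud-collections
#   cleanup_failed bdata vloud-collections
```

Recreate trades a short window of downtime for a clean handover of the host's ports, which is usually acceptable for a single-replica workload pinned to one node.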
