• Installation and Deployment
  • After rebooting the cluster machines, the nodelocaldns and ks-apiserver services go into CrashLoopBackOff

OS information
Bare-metal machines, 108 cores / 128 GB RAM
Operating system:

Distributor ID:	Ubuntu
Description:	Ubuntu 20.04.6 LTS
Release:	20.04
Codename:	focal

Kubernetes version
Output of kubectl version is pasted below:

Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.12", GitCommit:"b058e1760c79f46a834ba59bd7a3486ecf28237d", GitTreeState:"clean", BuildDate:"2022-07-13T14:59:18Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.12", GitCommit:"b058e1760c79f46a834ba59bd7a3486ecf28237d", GitTreeState:"clean", BuildDate:"2022-07-13T14:53:39Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}

Container runtime
Output of docker version / crictl version / nerdctl version is pasted below:

Client:
 Version:           20.10.8
 API version:       1.41
 Go version:        go1.16.6
 Git commit:        3967b7d
 Built:             Fri Jul 30 19:50:40 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.8
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.6
  Git commit:       75249d8
  Built:            Fri Jul 30 19:55:09 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.4.9
  GitCommit:        e25210fe30a0a703442421b0f60afac609f950a3
 runc:
  Version:          1.0.1
  GitCommit:        v1.0.1-0-g4144b638
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
  • crictl version
    command not available
  • nerdctl version
    command not available

KubeSphere version
e.g. v2.1.1 / v3.0.0. Offline or online installation. Installed on an existing Kubernetes cluster or via kk.

kk version: &version.Info{Major:"3", Minor:"0", GitVersion:"v3.0.7", GitCommit:"e755baf67198d565689d7207378174f429b508ba", GitTreeState:"clean", BuildDate:"2023-01-18T01:57:24Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}

Offline installation

What is the problem
The offline installation completed successfully, but after rebooting the entire cluster I can no longer log in to the web console; it reports "request to http://ks-apiserver/oauth/token failed, reason: getaddrinfo EAI_AGAIN ks-apiserver".
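
EAI_AGAIN is a transient getaddrinfo DNS failure, i.e. ks-console cannot resolve the service name ks-apiserver. A quick way to confirm that DNS (rather than ks-apiserver itself) is the failing piece would be to resolve the name from inside one of the ks-console pods, which are still Running (hypothetical check; assumes the image ships nslookup):

# resolve the ks-apiserver service name from inside a ks-console pod
kubectl -n kubesphere-system exec -it ks-console-868887c49f-9ltmd -- nslookup ks-apiserver

The full pod list across all namespaces after the reboot: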

NAMESPACE                      NAME                                               READY   STATUS             RESTARTS        AGE
kube-system                    calico-kube-controllers-846ddd49bc-6szkv           1/1     Running            5 (64s ago)     28h
kube-system                    calico-node-8tkmg                                  1/1     Running            6 (112m ago)    28h
kube-system                    calico-node-mbh7m                                  1/1     Running            2 (116m ago)    28h
kube-system                    calico-node-mpvtq                                  1/1     Running            3 (116m ago)    28h
kube-system                    calico-node-sz7qm                                  1/1     Running            2 (116m ago)    28h
kube-system                    coredns-558b97598-fwhww                            1/1     Running            3 (116m ago)    28h
kube-system                    coredns-558b97598-s64mn                            1/1     Running            3 (116m ago)    28h
kube-system                    haproxy-worker-216-01                              1/1     Running            2 (116m ago)    28h
kube-system                    kube-apiserver-master-213-01                       1/1     Running            37 (115m ago)   28h
kube-system                    kube-apiserver-master-214-02                       1/1     Running            31 (116m ago)   28h
kube-system                    kube-apiserver-master-215-03                       1/1     Running            31 (116m ago)   28h
kube-system                    kube-controller-manager-master-213-01              1/1     Running            7 (116m ago)    28h
kube-system                    kube-controller-manager-master-214-02              1/1     Running            4 (116m ago)    28h
kube-system                    kube-controller-manager-master-215-03              1/1     Running            3 (116m ago)    28h
kube-system                    kube-proxy-4dpjk                                   1/1     Running            2 (116m ago)    28h
kube-system                    kube-proxy-dlfjk                                   1/1     Running            7 (112m ago)    28h
kube-system                    kube-proxy-mj6fp                                   1/1     Running            4 (116m ago)    28h
kube-system                    kube-proxy-x9nwz                                   1/1     Running            2 (116m ago)    28h
kube-system                    kube-scheduler-master-213-01                       1/1     Running            7 (116m ago)    28h
kube-system                    kube-scheduler-master-214-02                       1/1     Running            3 (116m ago)    28h
kube-system                    kube-scheduler-master-215-03                       1/1     Running            3 (116m ago)    28h
kube-system                    nodelocaldns-9tvcv                                 0/1     CrashLoopBackOff   80 (25s ago)    28h
kube-system                    nodelocaldns-ddvg2                                 0/1     CrashLoopBackOff   95 (35s ago)    28h
kube-system                    nodelocaldns-fmmrm                                 0/1     CrashLoopBackOff   61 (32s ago)    28h
kube-system                    nodelocaldns-g4f4x                                 0/1     CrashLoopBackOff   84 (48s ago)    28h
kube-system                    openebs-localpv-provisioner-6f54869bc7-6mn6b       0/1     Error              29              8h
kube-system                    snapshot-controller-0                              1/1     Running            1 (116m ago)    6h45m
kubesphere-controls-system     default-http-backend-59d5cf569f-4gsjb              0/1     Error              0               8h
kubesphere-controls-system     kubectl-admin-7ffdf4596b-82rfv                     1/1     Running            1 (116m ago)    8h
kubesphere-monitoring-system   alertmanager-main-0                                1/2     Running            2 (116m ago)    6h45m
kubesphere-monitoring-system   alertmanager-main-1                                0/2     Completed          0               6h45m
kubesphere-monitoring-system   alertmanager-main-2                                0/2     Completed          0               6h45m
kubesphere-monitoring-system   kube-state-metrics-5474f8f7b-sfjfc                 0/3     Completed          1               8h
kubesphere-monitoring-system   node-exporter-cq78v                                2/2     Running            4 (116m ago)    28h
kubesphere-monitoring-system   node-exporter-fs6lh                                2/2     Running            10 (112m ago)   28h
kubesphere-monitoring-system   node-exporter-svwtp                                2/2     Running            4 (116m ago)    28h
kubesphere-monitoring-system   node-exporter-wtbgp                                2/2     Running            8 (116m ago)    28h
kubesphere-monitoring-system   notification-manager-deployment-7b586bd8fb-j4g86   0/2     Error              2               8h
kubesphere-monitoring-system   notification-manager-deployment-7b586bd8fb-ljfz4   2/2     Running            4 (113m ago)    8h
kubesphere-monitoring-system   notification-manager-operator-64ff97cb98-j2tzf     0/2     Completed          30              8h
kubesphere-monitoring-system   prometheus-k8s-0                                   0/2     Error              0               6h45m
kubesphere-monitoring-system   prometheus-k8s-1                                   0/2     Error              0               6h45m
kubesphere-monitoring-system   prometheus-operator-64b7b4db85-qhhbn               0/2     Completed          1               8h
kubesphere-system              ks-apiserver-848bfd75fd-4tnbz                      0/1     CrashLoopBackOff   58 (23s ago)    28h
kubesphere-system              ks-apiserver-848bfd75fd-cjbwv                      0/1     Error              47 (57s ago)    6h49m
kubesphere-system              ks-apiserver-848bfd75fd-k52lc                      0/1     CrashLoopBackOff   59 (18s ago)    28h
kubesphere-system              ks-console-868887c49f-9ltmd                        1/1     Running            3 (116m ago)    28h
kubesphere-system              ks-console-868887c49f-lql7m                        1/1     Running            1 (112m ago)    6h50m
kubesphere-system              ks-console-868887c49f-vsn57                        1/1     Running            2 (116m ago)    28h
kubesphere-system              ks-controller-manager-67b896bb6d-2rrrb             1/1     Running            5 (43s ago)     28h
kubesphere-system              ks-controller-manager-67b896bb6d-9dx65             1/1     Running            1 (112m ago)    6h49m
kubesphere-system              ks-controller-manager-67b896bb6d-tlj8h             1/1     Running            3 (116m ago)    28h
kubesphere-system              ks-installer-5655f896fb-5k28b                      0/1     Completed          1               8h
kubesphere-system              redis-7cc8746478-g2p9c                             1/1     Running            5 (112m ago)    28h
pic-distribute                 mysql-v1-0                                         0/1     Completed          0               6h44m
  • Pods that did not start successfully, filtered out as follows:

    kubectl get pods -A | grep -v Running
    NAMESPACE                      NAME                                               READY   STATUS             RESTARTS         AGE
    kube-system                    nodelocaldns-9tvcv                                 0/1     CrashLoopBackOff   86 (35s ago)     28h
    kube-system                    nodelocaldns-ddvg2                                 0/1     CrashLoopBackOff   99 (3m4s ago)    28h
    kube-system                    nodelocaldns-fmmrm                                 0/1     CrashLoopBackOff   66 (2m23s ago)   28h
    kube-system                    nodelocaldns-g4f4x                                 0/1     CrashLoopBackOff   87 (4m43s ago)   28h
    kubesphere-system              ks-apiserver-848bfd75fd-4tnbz                      0/1     CrashLoopBackOff   64 (61s ago)     28h
    kubesphere-system              ks-apiserver-848bfd75fd-cjbwv                      0/1     CrashLoopBackOff   51 (2m25s ago)   7h6m
    kubesphere-system              ks-apiserver-848bfd75fd-k52lc                      0/1     CrashLoopBackOff   65 (56s ago)     28h
  • Description of nodelocaldns-9tvcv in kube-system:

    kubectl describe pod nodelocaldns-9tvcv -n kube-system
    Name:                 nodelocaldns-9tvcv
    Namespace:            kube-system
    Priority:             2000000000
    Priority Class Name:  system-cluster-critical
    Node:                 worker-216-01/192.168.50.216
    Start Time:           Thu, 16 Nov 2023 06:26:21 +0000
    Labels:               controller-revision-hash=5855b6bfd
                          k8s-app=nodelocaldns
                          pod-template-generation=1
    Annotations:          prometheus.io/port: 9253
                          prometheus.io/scrape: true
    Status:               Running
    IP:                   192.168.50.216
    IPs:
      IP:           192.168.50.216
    Controlled By:  DaemonSet/nodelocaldns
    Containers:
      node-cache:
        Container ID:  docker://bcdd43f3ea3658b7cd4aabc9afd5fc9897ab77a6835744d916abaf95275a20f8
        Image:         dockerhub.kubekey.local/kubesphereio/k8s-dns-node-cache:1.15.12
        Image ID:      docker-pullable://dockerhub.kubekey.local/kubesphereio/k8s-dns-node-cache@sha256:b6b9dc5cb4ab54aea6905ceeceb61e54791bc2acadecbea65db3641d99c7fe69
        Ports:         53/UDP, 53/TCP, 9253/TCP
        Host Ports:    53/UDP, 53/TCP, 9253/TCP
        Args:
          -localip
          169.254.25.10
          -conf
          /etc/coredns/Corefile
          -upstreamsvc
          coredns
        State:          Waiting
          Reason:       CrashLoopBackOff
        Last State:     Terminated
          Reason:       Error
          Exit Code:    1
          Started:      Fri, 17 Nov 2023 11:04:09 +0000
          Finished:     Fri, 17 Nov 2023 11:04:09 +0000
        Ready:          False
        Restart Count:  86
        Limits:
          memory:  170Mi
        Requests:
          cpu:        100m
          memory:     70Mi
        Liveness:     http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
        Readiness:    http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
        Environment:  <none>
        Mounts:
          /etc/coredns from config-volume (rw)
          /run/xtables.lock from xtables-lock (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vlnth (ro)
    Conditions:
      Type              Status
      Initialized       True 
      Ready             False 
      ContainersReady   False 
      PodScheduled      True 
    Volumes:
      config-volume:
        Type:      ConfigMap (a volume populated by a ConfigMap)
        Name:      nodelocaldns
        Optional:  false
      xtables-lock:
        Type:          HostPath (bare host directory volume)
        Path:          /run/xtables.lock
        HostPathType:  FileOrCreate
      kube-api-access-vlnth:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
    QoS Class:                   Burstable
    Node-Selectors:              <none>
    Tolerations:                 :NoSchedule op=Exists
                                 :NoExecute op=Exists
                                 CriticalAddonsOnly op=Exists
                                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                                 node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                                 node.kubernetes.io/not-ready:NoExecute op=Exists
                                 node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                                 node.kubernetes.io/unreachable:NoExecute op=Exists
                                 node.kubernetes.io/unschedulable:NoSchedule op=Exists
    Events:
      Type     Reason          Age                   From     Message
      ----     ------          ----                  ----     -------
      Normal   SandboxChanged  18m                   kubelet  Pod sandbox changed, it will be killed and re-created.
      Warning  Unhealthy       18m                   kubelet  Readiness probe failed: Get "http://169.254.25.10:9254/health": dial tcp 169.254.25.10:9254: connect: connection refused
      Normal   Started         17m (x3 over 18m)     kubelet  Started container node-cache
      Normal   Pulled          16m (x4 over 18m)     kubelet  Container image "dockerhub.kubekey.local/kubesphereio/k8s-dns-node-cache:1.15.12" already present on machine
      Normal   Created         16m (x4 over 18m)     kubelet  Created container node-cache
      Warning  BackOff         3m54s (x78 over 18m)  kubelet  Back-off restarting failed container
  • Logs of nodelocaldns-9tvcv:

    kubectl logs -f nodelocaldns-9tvcv -n kube-system
    2023/11/17 11:04:09 [INFO] Using Corefile /etc/coredns/Corefile
    2023/11/17 11:04:09 [ERROR] Failed to read node-cache coreFile /etc/coredns/Corefile.base - open /etc/coredns/Corefile.base: no such file or directory
    2023/11/17 11:04:09 [ERROR] Failed to sync kube-dns config directory /etc/kube-dns, err: lstat /etc/kube-dns: no such file or directory
    cluster.local.:53 on 169.254.25.10
    in-addr.arpa.:53 on 169.254.25.10
    ip6.arpa.:53 on 169.254.25.10
    .:53 on 169.254.25.10
    [INFO] plugin/reload: Running configuration MD5 = adf97d6b4504ff12113ebb35f0c6413e
    CoreDNS-1.6.7
    linux/amd64, go1.11.13, 
    [FATAL] plugin/loop: Loop (169.254.25.10:54381 -> 169.254.25.10:53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 2309482293177081995.1716274136328598708."
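  • The [FATAL] plugin/loop line means a query that node-cache forwarded upstream came straight back to 169.254.25.10. The default nodelocaldns Corefile typically forwards the "." zone to the resolvers in the host's /etc/resolv.conf, so on Ubuntu 20.04 a likely suspect after a reboot is systemd-resolved rewriting /etc/resolv.conf to its 127.0.0.53 stub (see the troubleshooting link in the log). A sketch of on-node checks, assuming a standard kubelet + systemd-resolved layout (not verified here):

    # check what the host's resolv.conf points at
    cat /etc/resolv.conf                    # a loopback nameserver (e.g. 127.0.0.53) would explain the loop
    cat /run/systemd/resolve/resolv.conf    # the real upstream resolvers maintained by systemd-resolved
    # check which resolv.conf kubelet hands to pods
    grep -i resolvConf /var/lib/kubelet/config.yaml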
  • Description of ks-apiserver-848bfd75fd-4tnbz:

    kubectl describe pod -n kubesphere-system ks-apiserver-848bfd75fd-4tnbz
    Name:         ks-apiserver-848bfd75fd-4tnbz
    Namespace:    kubesphere-system
    Priority:     0
    Node:         master-214-02/192.168.50.214
    Start Time:   Thu, 16 Nov 2023 06:28:52 +0000
    Labels:       app=ks-apiserver
                  pod-template-hash=848bfd75fd
                  tier=backend
    Annotations:  cni.projectcalico.org/containerID: 4e2209e4b5c0d712d66cab6b82f276cc3f5eabb2451ddd5ad37526995963239d
                  cni.projectcalico.org/podIP: 10.233.109.20/32
                  cni.projectcalico.org/podIPs: 10.233.109.20/32
    Status:       Running
    IP:           10.233.109.20
    IPs:
      IP:           10.233.109.20
    Controlled By:  ReplicaSet/ks-apiserver-848bfd75fd
    Containers:
      ks-apiserver:
        Container ID:  docker://a29abea9b50633e849ba87955f72e71d093be684691c3f1ab279aade83dacacc
        Image:         dockerhub.kubekey.local/kubesphereio/ks-apiserver:v3.3.2
        Image ID:      docker-pullable://dockerhub.kubekey.local/kubesphereio/ks-apiserver@sha256:78d856f371d0981f9acef156da3869cf8b0a609bedf93f7d6a6d98d77d40ecd8
        Port:          9090/TCP
        Host Port:     0/TCP
        Command:
          ks-apiserver
          --logtostderr=true
        State:          Waiting
          Reason:       CrashLoopBackOff
        Last State:     Terminated
          Reason:       Error
          Exit Code:    1
          Started:      Fri, 17 Nov 2023 11:08:56 +0000
          Finished:     Fri, 17 Nov 2023 11:09:01 +0000
        Ready:          False
        Restart Count:  65
        Limits:
          cpu:     1
          memory:  1Gi
        Requests:
          cpu:     20m
          memory:  100Mi
        Liveness:  http-get http://:9090/kapis/version delay=15s timeout=15s period=10s #success=1 #failure=8
        Environment:
          KUBESPHERE_CACHE_OPTIONS_PASSWORD:  <set to the key 'auth' in secret 'redis-secret'>  Optional: false
        Mounts:
          /etc/kubesphere/ from kubesphere-config (rw)
          /etc/kubesphere/ingress-controller from ks-router-config (rw)
          /etc/localtime from host-time (ro)
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-6lhcp (ro)
    Conditions:
      Type              Status
      Initialized       True 
      Ready             False 
      ContainersReady   False 
      PodScheduled      True 
    Volumes:
      ks-router-config:
        Type:      ConfigMap (a volume populated by a ConfigMap)
        Name:      ks-router-config
        Optional:  false
      kubesphere-config:
        Type:      ConfigMap (a volume populated by a ConfigMap)
        Name:      kubesphere-config
        Optional:  false
      host-time:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/localtime
        HostPathType:  
      kube-api-access-6lhcp:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
    QoS Class:                   Burstable
    Node-Selectors:              <none>
    Tolerations:                 CriticalAddonsOnly op=Exists
                                 node-role.kubernetes.io/master:NoSchedule
                                 node.kubernetes.io/not-ready:NoExecute op=Exists for 60s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 60s
    Events:
      Type     Reason          Age                   From     Message
      ----     ------          ----                  ----     -------
      Normal   SandboxChanged  24m                   kubelet  Pod sandbox changed, it will be killed and re-created.
      Warning  Unhealthy       23m (x3 over 23m)     kubelet  Liveness probe failed: Get "http://10.233.109.20:9090/kapis/version": dial tcp 10.233.109.20:9090: connect: connection refused
      Normal   Pulled          21m (x4 over 24m)     kubelet  Container image "dockerhub.kubekey.local/kubesphereio/ks-apiserver:v3.3.2" already present on machine
      Normal   Created         21m (x4 over 24m)     kubelet  Created container ks-apiserver
      Normal   Started         21m (x4 over 24m)     kubelet  Started container ks-apiserver
      Warning  BackOff         4m13s (x92 over 23m)  kubelet  Back-off restarting failed container
  • Logs of ks-apiserver-848bfd75fd-4tnbz:

    kubectl logs -f -n kubesphere-system ks-apiserver-848bfd75fd-4tnbz
    W1117 11:08:56.216916       1 client_config.go:615] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
    W1117 11:08:56.219836       1 client_config.go:615] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
    W1117 11:08:56.236662       1 metricsserver.go:238] Metrics API not available.
    E1117 11:09:01.237069       1 cache.go:76] failed to create cache, error: dial tcp: i/o timeout
    Error: failed to create cache, error: dial tcp: i/o timeout
    2023/11/17 11:09:01 failed to create cache, error: dial tcp: i/o timeout
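  • The "cache" here is the in-cluster Redis that ks-apiserver uses (note the KUBESPHERE_CACHE_OPTIONS_PASSWORD env var sourced from redis-secret in the describe output above), so the dial tcp: i/o timeout is consistent with the Redis service name failing to resolve while nodelocaldns is down. A possible cross-check (the service name redis is assumed from the pod list; in this offline setup the probe image would have to come from the local registry):

    # confirm the Redis service and its ClusterIP
    kubectl -n kubesphere-system get svc
    # try resolving the service name through cluster DNS from a throwaway pod
    kubectl run dns-probe --rm -it --restart=Never --image=busybox:1.28 -- \
      nslookup redis.kubesphere-system.svc.cluster.local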