• Installation & Deployment
  • v4.1.2 offline installation fails with "etcd health check failed"

  • Cauchy

cici

  1. Check with docker ps whether the kube-apiserver container has been created. If it has, run curl -k https://<master01 ip>:6443; if it has not, check the kubelet logs (see the sketch after this list).
  2. If you are using a load balancer, run curl -k https://<vip>:6443 to verify that the load balancer is reachable.
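  A minimal sketch of those checks (assuming <master01 ip> and <vip> are placeholders for your actual addresses):

    # check whether the kube-apiserver container exists (docker as the runtime)
    docker ps | grep kube-apiserver
    # if it exists, probe the API server directly
    curl -k https://<master01 ip>:6443
    # if it does not exist, follow the kubelet logs
    sudo journalctl -u kubelet -f
    # if a load balancer/VIP fronts the control plane, verify it forwards to port 6443
    curl -k https://<vip>:6443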
  • cici replied to this post

    Cauchy
    kubelet is not up; etcd is running normally now.

    h00283@coverity-ms:~$ sudo systemctl status kubelet
    Warning: The unit file, source configuration file or drop-ins of kubelet.service changed on disk. Run 'systemctl daemon-reload' to reload units.
    ○ kubelet.service - kubelet: The Kubernetes Node Agent
         Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: enabled)
         Active: inactive (dead)
           Docs: http://kubernetes.io/docs/
    
    Feb 10 14:13:29 coverity-ms kubelet[9858]: E0210 14:13:29.319649    9858 kubelet_node_status.go:92] "Unable to register node with API server" err="Post \"https://lb.kubesp>
    Feb 10 14:13:31 coverity-ms kubelet[9858]: E0210 14:13:31.987813    9858 event.go:289] Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, Obje>
    Feb 10 14:13:32 coverity-ms kubelet[9858]: E0210 14:13:32.614103    9858 eviction_manager.go:258] "Eviction manager: failed to get summary stats" err="failed to get node i>
    Feb 10 14:13:33 coverity-ms kubelet[9858]: W0210 14:13:33.068103    9858 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Service: Ge>
    Feb 10 14:13:33 coverity-ms kubelet[9858]: E0210 14:13:33.068169    9858 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Service: f>
    Feb 10 14:13:33 coverity-ms systemd[1]: Stopping kubelet: The Kubernetes Node Agent...
    Feb 10 14:13:33 coverity-ms kubelet[9858]: I0210 14:13:33.123346    9858 dynamic_cafile_content.go:171] "Shutting down controller" name="client-ca-bundle::/etc/kubernetes/>
    Feb 10 14:13:33 coverity-ms systemd[1]: kubelet.service: Deactivated successfully.
    Feb 10 14:13:33 coverity-ms systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
    Feb 10 14:13:33 coverity-ms systemd[1]: kubelet.service: Consumed 1.825s CPU time.
    h00283@coverity-ms:~/ks$ sudo journalctl -u kubelet -f
    [sudo] password for h00283: 
    Feb 10 15:07:16 coverity-ms kubelet[16512]: E0210 15:07:16.938712   16512 run.go:74] "command failed" err="failed to load kubelet config file, path: /var/lib/kubelet/config.yaml, error: failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file \"/var/lib/kubelet/config.yaml\", error: open /var/lib/kubelet/config.yaml: no such file or directory"
    Feb 10 15:07:16 coverity-ms systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
    Feb 10 15:07:16 coverity-ms systemd[1]: kubelet.service: Failed with result 'exit-code'.
    Feb 10 15:07:27 coverity-ms systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 90.
    Feb 10 15:07:27 coverity-ms systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
    Feb 10 15:07:27 coverity-ms systemd[1]: Started kubelet: The Kubernetes Node Agent.
    Feb 10 15:07:27 coverity-ms kubelet[16541]: E0210 15:07:27.189671   16541 run.go:74] "command failed" err="failed to load kubelet config file, path: /var/lib/kubelet/config.yaml, error: failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file \"/var/lib/kubelet/config.yaml\", error: open /var/lib/kubelet/config.yaml: no such file or directory"
    Feb 10 15:07:27 coverity-ms systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
    Feb 10 15:07:27 coverity-ms systemd[1]: kubelet.service: Failed with result 'exit-code'.
    Feb 10 15:07:33 coverity-ms systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
    h00283@coverity-ms:~/ks$ kubeadm config images list
    W0210 16:27:02.652703   20003 version.go:104] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get "https://dl.k8s.io/release/stable-1.txt": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    W0210 16:27:02.652766   20003 version.go:105] falling back to the local client version: v1.28.0
    registry.k8s.io/kube-apiserver:v1.28.0
    registry.k8s.io/kube-controller-manager:v1.28.0
    registry.k8s.io/kube-scheduler:v1.28.0
    registry.k8s.io/kube-proxy:v1.28.0
      • Cauchy


      cici
      You can check the kubelet logs while init is running, or after it has completely failed and exited.

      Also, is the container runtime containerd? If so, check the sandbox image setting in the containerd config and verify that the image can actually be pulled (a quick sketch follows).
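      For reference, a sketch of what to look at; the config path follows the containerd default, and the pause tag matches the registry.k8s.io/pause:3.8 image seen in the logs later in this thread (adjust for an offline/private registry):

        # sandbox image setting in the containerd CRI plugin config
        grep -n "sandbox_image" /etc/containerd/config.toml
        # e.g.  sandbox_image = "registry.k8s.io/pause:3.8"
        # verify the runtime can actually pull that image
        sudo crictl pull registry.k8s.io/pause:3.8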

      • cici replied to this post

        Cauchy
        I used the defaults and did not change anything else.

        h00283@coverity-ms:~/ks$ sudo journalctl -u kubelet -f
        Feb 10 16:33:37 coverity-ms kubelet[20129]: E0210 16:33:37.612226   20129 kubelet_node_status.go:92] "Unable to register node with API server" err="Post \"https://lb.kubesphere.local:6443/api/v1/nodes\": dial tcp 172.1.30.21:6443: connect: connection refused" node="coverity-ms"
        Feb 10 16:33:37 coverity-ms kubelet[20129]: E0210 16:33:37.974136   20129 remote_runtime.go:193] "RunPodSandbox from runtime service failed" err="rpc error: code = DeadlineExceeded desc = failed to get sandbox image \"registry.k8s.io/pause:3.8\": failed to pull image \"registry.k8s.io/pause:3.8\": failed to pull and unpack image \"registry.k8s.io/pause:3.8\": failed to resolve reference \"registry.k8s.io/pause:3.8\": failed to do request: Head \"https://registry.k8s.io/v2/pause/manifests/3.8\": dial tcp 34.96.108.209:443: i/o timeout"
        Feb 10 16:33:37 coverity-ms kubelet[20129]: E0210 16:33:37.974191   20129 kuberuntime_sandbox.go:72] "Failed to create sandbox for pod" err="rpc error: code = DeadlineExceeded desc = failed to get sandbox image \"registry.k8s.io/pause:3.8\": failed to pull image \"registry.k8s.io/pause:3.8\": failed to pull and unpack image \"registry.k8s.io/pause:3.8\": failed to resolve reference \"registry.k8s.io/pause:3.8\": failed to do request: Head \"https://registry.k8s.io/v2/pause/manifests/3.8\": dial tcp 34.96.108.209:443: i/o timeout" pod="kube-system/kube-scheduler-coverity-ms"
        Feb 10 16:33:37 coverity-ms kubelet[20129]: E0210 16:33:37.974223   20129 kuberuntime_manager.go:1119] "CreatePodSandbox for pod failed" err="rpc error: code = DeadlineExceeded desc = failed to get sandbox image \"registry.k8s.io/pause:3.8\": failed to pull image \"registry.k8s.io/pause:3.8\": failed to pull and unpack image \"registry.k8s.io/pause:3.8\": failed to resolve reference \"registry.k8s.io/pause:3.8\": failed to do request: Head \"https://registry.k8s.io/v2/pause/manifests/3.8\": dial tcp 34.96.108.209:443: i/o timeout" pod="kube-system/kube-scheduler-coverity-ms"
        Feb 10 16:33:37 coverity-ms kubelet[20129]: E0210 16:33:37.974304   20129 pod_workers.go:1300] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"kube-scheduler-coverity-ms_kube-system(b68b9e35fcab51848c5f2ecaf37ba14d)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"kube-scheduler-coverity-ms_kube-system(b68b9e35fcab51848c5f2ecaf37ba14d)\\\": rpc error: code = DeadlineExceeded desc = failed to get sandbox image \\\"registry.k8s.io/pause:3.8\\\": failed to pull image \\\"registry.k8s.io/pause:3.8\\\": failed to pull and unpack image \\\"registry.k8s.io/pause:3.8\\\": failed to resolve reference \\\"registry.k8s.io/pause:3.8\\\": failed to do request: Head \\\"https://registry.k8s.io/v2/pause/manifests/3.8\\\": dial tcp 34.96.108.209:443: i/o timeout\"" pod="kube-system/kube-scheduler-coverity-ms" podUID="b68b9e35fcab51848c5f2ecaf37ba14d"
        Feb 10 16:33:38 coverity-ms kubelet[20129]: I0210 16:33:38.768345   20129 dynamic_cafile_content.go:171] "Shutting down controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
        Feb 10 16:33:38 coverity-ms systemd[1]: Stopping kubelet: The Kubernetes Node Agent...
        Feb 10 16:33:38 coverity-ms systemd[1]: kubelet.service: Deactivated successfully.
        Feb 10 16:33:38 coverity-ms systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
        Feb 10 16:33:38 coverity-ms systemd[1]: kubelet.service: Consumed 1.265s CPU time.

        Logs captured during init

          • Cauchy

          cici
          failed to pull image \"registry.k8s.io/pause:3.8\"
          Copy /etc/containerd/config.toml from another node to this node, then run systemctl restart containerd, and then uninstall and reinstall (a sketch follows).
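          A sketch of that step, assuming <healthy-node> is a placeholder for a node where containerd already works:

            # copy the working containerd config over and restart the runtime
            scp <healthy-node>:/etc/containerd/config.toml /tmp/config.toml
            sudo cp /tmp/config.toml /etc/containerd/config.toml
            sudo systemctl restart containerd
            sudo systemctl status containerd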

          • cici replied to this post

            Cauchy

            After following those steps:

            h00283@coverity-ms:~/ks$ sudo journalctl -u kubelet -f
            [sudo] password for h00283: 
            Feb 10 17:30:49 coverity-ms kubelet[23146]: W0210 17:30:49.347662   23146 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Node: Get "https://lb.kubesphere.local:6443/api/v1/nodes?fieldSelector=metadata.name%3Dcoverity-ms&limit=500&resourceVersion=0": dial tcp 172.1.30.21:6443: connect: connection refused
            Feb 10 17:30:49 coverity-ms kubelet[23146]: E0210 17:30:49.347716   23146 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://lb.kubesphere.local:6443/api/v1/nodes?fieldSelector=metadata.name%3Dcoverity-ms&limit=500&resourceVersion=0": dial tcp 172.1.30.21:6443: connect: connection refused
            Feb 10 17:30:49 coverity-ms kubelet[23146]: W0210 17:30:49.512521   23146 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Service: Get "https://lb.kubesphere.local:6443/api/v1/services?limit=500&resourceVersion=0": dial tcp 172.1.30.21:6443: connect: connection refused
            Feb 10 17:30:49 coverity-ms kubelet[23146]: E0210 17:30:49.512577   23146 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://lb.kubesphere.local:6443/api/v1/services?limit=500&resourceVersion=0": dial tcp 172.1.30.21:6443: connect: connection refused
            Feb 10 17:30:52 coverity-ms kubelet[23146]: E0210 17:30:52.186152   23146 eviction_manager.go:258] "Eviction manager: failed to get summary stats" err="failed to get node info: node \"coverity-ms\" not found"
            Feb 10 17:30:52 coverity-ms kubelet[23146]: I0210 17:30:52.327813   23146 dynamic_cafile_content.go:171] "Shutting down controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
            Feb 10 17:30:52 coverity-ms systemd[1]: Stopping kubelet: The Kubernetes Node Agent...
            Feb 10 17:30:52 coverity-ms systemd[1]: kubelet.service: Deactivated successfully.
            Feb 10 17:30:52 coverity-ms systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
            Feb 10 17:30:52 coverity-ms systemd[1]: kubelet.service: Consumed 2.138s CPU time.
            h00283@coverity-ms:~/ks$ sudo systemctl status kubelet
            Warning: The unit file, source configuration file or drop-ins of kubelet.service changed on disk. Run 'systemctl daemon-reload' to reload units.
            ○ kubelet.service - kubelet: The Kubernetes Node Agent
                 Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: enabled)
                 Active: inactive (dead)
                   Docs: http://kubernetes.io/docs/
            
            Feb 10 17:30:49 coverity-ms kubelet[23146]: W0210 17:30:49.347662   23146 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Node: Get >
            Feb 10 17:30:49 coverity-ms kubelet[23146]: E0210 17:30:49.347716   23146 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Node: fai>
            Feb 10 17:30:49 coverity-ms kubelet[23146]: W0210 17:30:49.512521   23146 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Service: G>
            Feb 10 17:30:49 coverity-ms kubelet[23146]: E0210 17:30:49.512577   23146 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Service: >
            Feb 10 17:30:52 coverity-ms kubelet[23146]: E0210 17:30:52.186152   23146 eviction_manager.go:258] "Eviction manager: failed to get summary stats" err="failed to get node >
            Feb 10 17:30:52 coverity-ms kubelet[23146]: I0210 17:30:52.327813   23146 dynamic_cafile_content.go:171] "Shutting down controller" name="client-ca-bundle::/etc/kubernetes>
            Feb 10 17:30:52 coverity-ms systemd[1]: Stopping kubelet: The Kubernetes Node Agent...
            Feb 10 17:30:52 coverity-ms systemd[1]: kubelet.service: Deactivated successfully.
            Feb 10 17:30:52 coverity-ms systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
            Feb 10 17:30:52 coverity-ms systemd[1]: kubelet.service: Consumed 2.138s CPU time.
            lines 1-16/16 (END)
              • Cauchy

              cici
              Run crictl ps and see whether kube-apiserver is there. If it is not, keep looking in the kubelet logs for why that container was never created. Ignore the 6443 connection refused messages for now; they will disappear once kube-apiserver is up. (See the sketch below.)
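              A sketch of that check (<container-id> is a placeholder):

                # is there a kube-apiserver container at all?
                sudo crictl ps -a | grep kube-apiserver
                # if one exists but keeps exiting, read its logs
                sudo crictl logs <container-id>
                # otherwise keep watching the kubelet logs for the creation failure
                sudo journalctl -u kubelet -f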

              • cici replied to this post

                Cauchy I found that many docker images had failed to pull, which left kubelet, kubeadm, containerd, and so on not running.

                It has succeeded now!

                Process:
                First remove all the related installed tools, and their directories as well.

                Findings:

                1. etcd and containerd had already been installed on the system, which caused version problems => uninstall them

                2. kubelet and kubectl had not finished installing and their directories were missing the relevant configuration => delete the directories completely

                3. sudo ./kk delete cluster -f config-sample.yaml

                4. Remove the kubekey directory

                5. sudo ./kk create cluster -f config-sample.yaml -a kubesphere.tar.gz --with-local-storage

                6. When execution reaches [init cluster using kubeadm]:
                  (1) Check the etcd status with sudo systemctl status etcd => make sure there are no error messages

                  (2) Check the containerd status with sudo systemctl status containerd

                    <If you hit the image pull failure>
                    Check whether /etc/containerd/config.toml exists => if it does not, copy it over from a node that has it, then repeat the delete cluster steps above. If errors still appear, keep resolving them; once nothing errors out, create cluster should go through successfully.

                7. Once create cluster finishes, KubeSphere can be installed (a condensed command sketch of this sequence follows).
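                A condensed sketch of that sequence (the kubekey work directory is assumed to sit next to the kk binary in the current directory):

                  # tear down the failed attempt and clean the kubekey work dir
                  sudo ./kk delete cluster -f config-sample.yaml
                  sudo rm -rf ./kubekey
                  # re-create the cluster from the offline artifact
                  sudo ./kk create cluster -f config-sample.yaml -a kubesphere.tar.gz --with-local-storage
                  # while [init cluster using kubeadm] runs, check the runtimes for errors
                  sudo systemctl status etcd
                  sudo systemctl status containerd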

                  13 days later

                  cici Hi, may I ask how you solved the problem of kubelet not starting? Which directories exactly did you mean when you said to delete the directories? Could you please give me some pointers?

                  • cici replied to this post

                    zxcccom
                    Hi, I followed this article: https://blog.csdn.net/qq_40184595/article/details/129439402

                    and cleaned things up as it describes.

                    Run which kubelet and delete what it points to as well (I did the same for kubectl and the others).

                    Then run kk delete cluster, remove the kubekey directory, and run create cluster again.

                    -

                    Also note that during kubeadm init you can check whether etcd, containerd, kubelet, and kubelet.service report any errors!

                    In my case /etc/containerd/config.toml was not created automatically; after adding it, create cluster also went through fine. You can check that too (a cleanup sketch follows).
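                    A rough sketch of that cleanup (the paths removed are whatever which reports on your host; treat them as assumptions):

                      # locate and remove leftover binaries (repeat for kubectl, kubeadm, etc.)
                      which kubelet && sudo rm -f "$(which kubelet)"
                      # make sure the containerd CRI config exists before re-creating the cluster
                      ls -l /etc/containerd/config.toml
                      # then: kk delete cluster, remove the kubekey directory, and kk create cluster again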

                      cici Hi, I checked everything and kubelet can run now, but kubelet logs an "Attempting to register node" node="node1" message and I can't make heads or tails of it.