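After upgrading a KubeSphere cluster with multiple master nodes and multiple worker nodes to v2.1.1, master0 went NotReady in the early morning. On logging in, I found that the kubelet service would not start, and restarting it also failed. The output is below: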

[root@master0 conf]# systemctl status kubelet.service -l
● kubelet.service - Kubernetes Kubelet Server
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Fri 2020-11-06 12:05:20 CST; 7s ago
     Docs: https://github.com/GoogleCloudPlatform/kubernetes
  Process: 10669 ExecStart=/usr/local/bin/kubelet $KUBE_LOGTOSTDERR $KUBE_LOG_LEVEL $KUBELET_API_SERVER $KUBELET_ADDRESS $KUBELET_PORT $KUBELET_HOSTNAME $KUBELET_ARGS $DOCKER_SOCKET $KUBELET_NETWORK_PLUGIN $KUBELET_VOLUME_PLUGIN $KUBELET_CLOUDPROVIDER (code=exited, status=255)
 Main PID: 10669 (code=exited, status=255)

Nov 06 12:05:20 master0 kubelet[10669]: I1106 12:05:20.774709   10669 feature_gate.go:216] feature gates: &{map[CSINodeInfo:true ExpandCSIVolumes:true RotateKubeletClientCertificate:true VolumeSnapshotDataSource:true]}
Nov 06 12:05:20 master0 kubelet[10669]: I1106 12:05:20.781280   10669 mount_linux.go:168] Detected OS with systemd
Nov 06 12:05:20 master0 kubelet[10669]: I1106 12:05:20.781490   10669 server.go:410] Version: v1.16.7
Nov 06 12:05:20 master0 kubelet[10669]: I1106 12:05:20.781708   10669 feature_gate.go:216] feature gates: &{map[CSINodeInfo:true ExpandCSIVolumes:true RotateKubeletClientCertificate:true VolumeSnapshotDataSource:true]}
Nov 06 12:05:20 master0 kubelet[10669]: I1106 12:05:20.781908   10669 feature_gate.go:216] feature gates: &{map[CSINodeInfo:true ExpandCSIVolumes:true RotateKubeletClientCertificate:true VolumeSnapshotDataSource:true]}
Nov 06 12:05:20 master0 kubelet[10669]: I1106 12:05:20.782068   10669 plugins.go:100] No cloud provider specified.
Nov 06 12:05:20 master0 kubelet[10669]: I1106 12:05:20.782113   10669 server.go:526] No cloud provider specified: "" from the config file: ""
Nov 06 12:05:20 master0 kubelet[10669]: I1106 12:05:20.782142   10669 server.go:773] Client rotation is on, will bootstrap in background
Nov 06 12:05:20 master0 kubelet[10669]: E1106 12:05:20.786649   10669 bootstrap.go:265] part of the existing bootstrap client certificate is expired: 2020-10-24 09:40:40 +0000 UTC
Nov 06 12:05:20 master0 kubelet[10669]: F1106 12:05:20.786706   10669 server.go:271] failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory

After rebooting the server, the kubelet service still fails to start.

@Feynman could you help take a look at what is causing this?
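For readers following along: the last two log lines appear to be related. The `E` line reports that the client certificate kubelet already holds is past its expiry date, and the `F` line shows kubelet then trying to fall back to a bootstrap kubeconfig that does not exist on this node. Below is a minimal shell sketch for pulling the reported expiry timestamp out of the log. The function name `bootstrap_cert_expiry` is made up for illustration and is not part of any tool.

```shell
# Hypothetical helper: read kubelet log lines on stdin and print the
# timestamp that follows "bootstrap client certificate is expired: ".
bootstrap_cert_expiry() {
  # -n suppresses default output; the trailing p prints only matching lines.
  sed -n 's/.*bootstrap client certificate is expired: \(.*\)$/\1/p'
}
```

For example, `journalctl -u kubelet --no-pager | bootstrap_cert_expiry | tail -n 1` would print the most recently reported expiry, which can then be compared with the current date on the node.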

    yuswift

    We only upgraded to v2.1.1 in October. Wasn't the upgrade supposed to fix the certificate expiry problem?

    Can you tell which certificate has not been renewed yet?

    yuswift
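    I found an expired certificate: /etc/kubernetes/ssl/expired/apiserver.crt
    I don't understand why this certificate has such a large impact on one cluster. Another cluster's certificate has also expired, but we have not noticed any impact on its services.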
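One way to see which certificates are still stale is to print the end date of every certificate file in the directory. This is a rough sketch, assuming `openssl` is installed and that the certificates live under `/etc/kubernetes/ssl` (the path quoted above; other installs may use `/etc/kubernetes/pki`). The helper name `list_cert_expiry` is hypothetical.

```shell
# Hypothetical helper: print "<file><TAB>notAfter=<date>" for every *.crt
# file under a directory. The default directory is an assumption taken
# from the path quoted in this thread.
list_cert_expiry() {
  local dir="${1:-/etc/kubernetes/ssl}"
  local crt
  find "$dir" -type f -name '*.crt' | sort | while IFS= read -r crt; do
    # -enddate prints "notAfter=<date>"; -noout suppresses the PEM dump.
    # Files that openssl cannot parse are reported as "unreadable".
    printf '%s\t%s\n' "$crt" \
      "$(openssl x509 -noout -enddate -in "$crt" 2>/dev/null || echo unreadable)"
  done
}
```

Because `find` recurses, a call such as `list_cert_expiry /etc/kubernetes/ssl` also lists files in subdirectories like `expired/`, so an old copy kept there shows up next to the certificates actually in use.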

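      zhangyan If master0 is NotReady, the other masters keep serving requests, so the cluster stays usable for the time being. Were the workloads you mentioned deployed on master0, and is that how they were affected?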

        yuswift
        master0 has no workloads deployed on it. It suddenly went NotReady on its own a little after 5 a.m.

        4 months later

        yuswift, in a single-master environment, after the certificates expired I ran the following steps:

        cd /etc/kubernetes
        kubeadm alpha certs renew apiserver
        kubeadm alpha certs renew apiserver-kubelet-client
        kubeadm alpha certs renew front-proxy-client
        kubeadm alpha certs renew admin.conf
        kubeadm alpha certs renew controller-manager.conf
        kubeadm alpha certs renew scheduler.conf
        docker ps -af name=k8s_kube-apiserver* -q | xargs --no-run-if-empty docker rm -f
        docker ps -af name=k8s_kube-scheduler* -q | xargs --no-run-if-empty docker rm -f
        docker ps -af name=k8s_kube-controller-manager* -q | xargs --no-run-if-empty docker rm -f
        systemctl restart kubelet

        After that, the kubelet service can no longer start. What is going on here?

        [root@master kubernetes]# systemctl status kubelet
        ● kubelet.service - Kubernetes Kubelet Server
           Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
           Active: activating (auto-restart) (Result: exit-code) since Tue 2021-02-23 22:41:29 CST; 3s ago
             Docs: https://github.com/GoogleCloudPlatform/kubernetes
          Process: 10283 ExecStart=/usr/local/bin/kubelet $KUBE_LOGTOSTDERR $KUBE_LOG_LEVEL $KUBELET_API_SERVER $KUBELET_ADDRESS $KUBELET_PORT $KUBELET_HOSTNAME $KUBELET_ARGS $DOCKER_SOCKET $KUBELET_NETWORK_PLUGIN $KUBELET_VOLUME_PLUGIN $KUBELET_CLOUDPROVIDER (code=exited, status=255)
         Main PID: 10283 (code=exited, status=255)
        
        Feb 23 22:41:29 master kubelet[10283]: I0223 22:41:29.235783   10283 mount_linux.go:168] Detected OS with systemd
        Feb 23 22:41:29 master kubelet[10283]: I0223 22:41:29.242181   10283 server.go:410] Version: v1.16.7
        Feb 23 22:41:29 master kubelet[10283]: I0223 22:41:29.242285   10283 feature_gate.go:216] feature gates: &{map[CSINodeInfo:true ExpandCSIVolumes:true RotateKubeletClientCertificate:true Volume...aSource:true]}
        Feb 23 22:41:29 master kubelet[10283]: I0223 22:41:29.242357   10283 feature_gate.go:216] feature gates: &{map[CSINodeInfo:true ExpandCSIVolumes:true RotateKubeletClientCertificate:true Volume...aSource:true]}
        Feb 23 22:41:29 master kubelet[10283]: I0223 22:41:29.242590   10283 plugins.go:100] No cloud provider specified.
        Feb 23 22:41:29 master kubelet[10283]: I0223 22:41:29.242622   10283 server.go:526] No cloud provider specified: "" from the config file: ""
        Feb 23 22:41:29 master kubelet[10283]: I0223 22:41:29.242642   10283 server.go:773] Client rotation is on, will bootstrap in background
        Feb 23 22:41:29 master kubelet[10283]: E0223 22:41:29.245192   10283 bootstrap.go:265] part of the existing bootstrap client certificate is expired: 2020-12-03 08:55:59 +0000 UTC
        Feb 23 22:41:29 master kubelet[10283]: F0223 22:41:29.245269   10283 server.go:271] failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no ...e or directory
        Feb 23 22:41:29 master systemd[1]: kubelet.service failed.
        Hint: Some lines were ellipsized, use -l to show in full.

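          rysinal There are plenty of posts online about Kubernetes problems; try searching for the error message. For certificate renewal, refer to the official Kubernetes documentation.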

            Jeff I did follow the post at https://kubesphere.com.cn/forum/d/3177-kubesphere-v211k8s/5, but it had no effect. I have also tried other fixes found online, and kubelet still reports: part of the existing bootstrap client certificate is expired: 2020-12-03 08:55:59 +0000 UTC

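            The expiry made the cluster unavailable on 2021-02-24. After renewing, I checked the certificates and they all look correct, so I would like to ask whether some part has been overlooked: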

            [root@master ~]# kubeadm alpha certs check-expiration --config /etc/kubernetes/kubeadm-config.yaml
            CERTIFICATE                EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
            admin.conf                 Feb 23, 2022 14:35 UTC   363d            no
            apiserver                  Feb 23, 2022 14:35 UTC   363d            no
            apiserver-kubelet-client   Feb 23, 2022 14:35 UTC   363d            no
            controller-manager.conf    Feb 23, 2022 14:37 UTC   363d            no
            front-proxy-client         Feb 23, 2022 14:35 UTC   363d            no
            scheduler.conf             Feb 23, 2022 14:37 UTC   363d            no
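The table above does not list `kubelet.conf`, while the error message is about the client certificate kubelet itself uses, so it may be worth checking that certificate separately. This is a minimal sketch, assuming the kubeconfig is at `/etc/kubernetes/kubelet.conf` and stores its certificate either inline (`client-certificate-data`) or as a file path (`client-certificate`), both unquoted. The helper name `kubeconfig_client_cert` is hypothetical.

```shell
# Hypothetical helper: print the PEM client certificate that a kubeconfig
# file uses, whether it is embedded as base64 or referenced by path.
kubeconfig_client_cert() {
  local conf="${1:-/etc/kubernetes/kubelet.conf}"
  local data path
  # Inline form: "client-certificate-data: <base64 PEM>"
  data=$(awk '$1 == "client-certificate-data:" {print $2; exit}' "$conf")
  if [ -n "$data" ]; then
    printf '%s' "$data" | base64 -d
    return
  fi
  # File form: "client-certificate: /path/to/cert.pem"
  path=$(awk '$1 == "client-certificate:" {print $2; exit}' "$conf")
  [ -n "$path" ] && cat "$path"
}
```

Piping the result into openssl, for example `kubeconfig_client_cert /etc/kubernetes/kubelet.conf | openssl x509 -noout -subject -enddate`, shows the subject and end date of the certificate kubelet presents. If that date matches the one in the error message, the kubeconfig was not refreshed by the renewal.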
              5 months later
              4 days later

              zhangyan

              # Back up the existing certificates and kubeconfig files first.
              mkdir /etc/kubernetes.bak
              cp -r /etc/kubernetes/pki/ /etc/kubernetes.bak
              cp /etc/kubernetes/*.conf /etc/kubernetes.bak
              
              # Renew all certificates managed by kubeadm.
              kubeadm alpha certs renew all --config /etc/kubernetes/kubeadm-config.yaml
              
              # Regenerate the kubeconfig files. Each command prints a kubeconfig to stdout;
              # the first one has no redirect, so its output is only displayed.
              kubeadm alpha kubeconfig user --client-name=admin
              kubeadm alpha kubeconfig user --org system:masters --client-name kubernetes-admin  > /etc/kubernetes/admin.conf
              kubeadm alpha kubeconfig user --client-name system:kube-controller-manager > /etc/kubernetes/controller-manager.conf
              # kubelet.conf does not appear in the check-expiration output earlier in the thread, so regenerate it explicitly.
              kubeadm alpha kubeconfig user --org system:nodes --client-name system:node:$(hostname) > /etc/kubernetes/kubelet.conf
              kubeadm alpha kubeconfig user --client-name system:kube-scheduler > /etc/kubernetes/scheduler.conf
              
              kubeadm alpha certs renew apiserver-kubelet-client --config /etc/kubernetes/kubeadm-config.yaml
              kubeadm alpha certs renew front-proxy-client --config /etc/kubernetes/kubeadm-config.yaml
              
              # Remove the control-plane containers so they are recreated with the new certificates, then restart kubelet.
              docker ps -af name=k8s_kube-apiserver* -q | xargs --no-run-if-empty docker rm -f
              docker ps -af name=k8s_kube-scheduler* -q | xargs --no-run-if-empty docker rm -f
              docker ps -af name=k8s_kube-controller-manager* -q | xargs --no-run-if-empty docker rm -f
              systemctl restart kubelet
              
              # Refresh the kubectl config for the current user.
              kubeadm alpha kubeconfig user --org system:masters --client-name kubernetes-admin > ~/.kube/config

              Run this procedure on each master node separately.
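After running the procedure on a master, the `check-expiration` output can be scanned for anything still close to expiry. This small sketch reads that table from stdin, assuming the column layout shown earlier in this thread (kubeadm v1.16; other versions may format it differently). `certs_expiring_soon` is a hypothetical helper name.

```shell
# Hypothetical helper: read a check-expiration table on stdin and print
# "<name> <residual>" for every entry with fewer than N days left.
certs_expiring_soon() {
  local min_days="${1:-30}"
  awk -v min="$min_days" '
    $1 == "CERTIFICATE" { next }   # skip the header row
    NF < 7 { next }                # skip blank or unrelated lines
    {
      r = $7                       # RESIDUAL TIME column, e.g. "363d"
      if (r ~ /^[0-9]+y$/) next                          # years left
      if (r ~ /^[0-9]+d$/ && r + 0 >= min + 0) next      # enough days left
      print $1, r                  # anything else is reported
    }'
}
```

A call such as `kubeadm alpha certs check-expiration --config /etc/kubernetes/kubeadm-config.yaml | certs_expiring_soon 30` printing nothing would mean every listed certificate has at least 30 days left. The table in this thread does not include `kubelet.conf`, so that file still needs its own check.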