hongming
@johnniang
@DehaoCheng
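Many thanks to the three of you for helping with this. The root cause turned out to be kube-controller-manager.
My cluster runs in high-availability mode with three master nodes. Three days ago I upgraded it: KubeSphere from v3.1.1 to v3.2.1, and Kubernetes from v1.18.8 to v1.19.8. I just checked the kube-controller-manager logs and found that the kube-controller-manager on the master03 node had been failing with an error at startup ever since the upgrade. The log output is as follows: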
Flag --experimental-cluster-signing-duration has been deprecated, use --cluster-signing-duration
Flag --port has been deprecated, see --secure-port instead.
I1221 21:41:55.367887 1 serving.go:331] Generated self-signed cert in-memory
I1221 21:41:55.695667 1 controllermanager.go:175] Version: v1.19.8
I1221 21:41:55.697103 1 secure_serving.go:197] Serving securely on [::]:10257
I1221 21:41:55.698631 1 leaderelection.go:243] attempting to acquire leader lease kube-system/kube-controller-manager...
I1221 21:41:55.702499 1 dynamic_cafile_content.go:167] Starting request-header::/etc/kubernetes/pki/front-proxy-ca.crt
I1221 21:41:55.702570 1 dynamic_cafile_content.go:167] Starting client-ca-bundle::/etc/kubernetes/pki/ca.crt
I1221 21:41:55.702666 1 tlsconfig.go:240] Starting DynamicServingCertificateController
E1221 21:41:58.492775 1 leaderelection.go:325] error retrieving resource lock kube-system/kube-controller-manager: endpoints "kube-controller-manager" is forbidden: User "system:kube-controller-manager" cannot get resource "endpoints" in API group "" in the namespace "kube-system"
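For anyone who wants to spot this kind of failure quickly, here is a minimal Python sketch that pulls the error-level lines out of a kube-controller-manager log. It assumes the usual klog header layout seen in the output above (severity letter, date, time, a process or thread id, then file:line]). The names KLOG_LINE, LogEntry and find_errors are made up for this illustration and are not part of any Kubernetes tooling.

```python
import re
from typing import List, NamedTuple

# Assumed klog header layout, based on the log lines quoted in this post:
#   <severity><MMDD> <hh:mm:ss.micros> <id> <file>:<line>] <message>
KLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<pid>\d+) "
    r"(?P<source>[^\s:]+):(?P<line>\d+)\] "
    r"(?P<message>.*)$"
)


class LogEntry(NamedTuple):
    """One parsed log line (hypothetical helper type for this sketch)."""
    severity: str
    timestamp: str
    source: str
    message: str


def find_errors(log_text: str) -> List[LogEntry]:
    """Return every line whose severity is E (error) or F (fatal).

    Lines that do not match the header layout, such as the
    'Flag --port has been deprecated' notices, are skipped.
    """
    errors = []
    for raw in log_text.splitlines():
        match = KLOG_LINE.match(raw.strip())
        if match is None:
            continue
        if match.group("severity") in ("E", "F"):
            errors.append(
                LogEntry(
                    severity=match.group("severity"),
                    timestamp=match.group("date") + " " + match.group("time"),
                    source=match.group("source"),
                    message=match.group("message"),
                )
            )
    return errors


if __name__ == "__main__":
    # Two lines copied from the log in this post.
    sample = (
        "I1221 21:41:55.695667 1 controllermanager.go:175] Version: v1.19.8\n"
        "E1221 21:41:58.492775 1 leaderelection.go:325] error retrieving "
        "resource lock kube-system/kube-controller-manager: endpoints "
        '"kube-controller-manager" is forbidden: User '
        '"system:kube-controller-manager" cannot get resource "endpoints" '
        'in API group "" in the namespace "kube-system"\n'
    )
    for entry in find_errors(sample):
        print(entry.timestamp, entry.source, entry.message)
```

To use it on a live cluster, save the output of kubectl -n kube-system logs kube-controller-manager-master03 to a file and pass its contents to find_errors. A leaderelection.go entry containing "forbidden" would point to the same symptom described here.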
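I tried to restart the container with kubectl delete pods kube-controller-manager-master03, but the logs showed that it had not actually been restarted. In the end I logged in to the master03 node and restarted it with docker restart, which worked, and the startup log no longer shows any errors: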
Flag --experimental-cluster-signing-duration has been deprecated, use --cluster-signing-duration
Flag --port has been deprecated, see --secure-port instead.
I1224 16:12:27.777365 1 serving.go:331] Generated self-signed cert in-memory
I1224 16:12:27.985002 1 controllermanager.go:175] Version: v1.19.8
I1224 16:12:27.985851 1 dynamic_cafile_content.go:167] Starting request-header::/etc/kubernetes/pki/front-proxy-ca.crt
I1224 16:12:27.985908 1 dynamic_cafile_content.go:167] Starting client-ca-bundle::/etc/kubernetes/pki/ca.crt
I1224 16:12:27.986231 1 secure_serving.go:197] Serving securely on [::]:10257
I1224 16:12:27.986279 1 leaderelection.go:243] attempting to acquire leader lease kube-system/kube-controller-manager...
I1224 16:12:27.986303 1 tlsconfig.go:240] Starting DynamicServingCertificateController
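The backlog of CSRs has also been re-issued in full. Running kubectl get certificatesigningrequests.certificates.k8s.io no longer shows the earlier CSRs.
I also checked the kubeconfig file of the newly created user: the users field now has complete content, and this kubeconfig can be used to access the K8S kube-apiserver normally.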
