KubeSphere v3.3.2 with Kubernetes v1.22.12: full online installation fails

kidd-u · June 5, 2023

Operating system information
e.g. physical machine, Ubuntu 22.04, 8C/16G

Kubernetes version information
e.g. v1.22.12, multi-node.

Container runtime
docker 20.10.13

KubeSphere version information

v3.3.2

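Problem: the installation reaches the final "Please wait for the installation to complete: >>--->" step, then keeps waiting until it fails after 2 hours.

The KubeSphere Console web UI cannot be opened either.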

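1. Checked the nodes: they are all Ready and look normal.

2. Found that some workloads cannot start.

3. Checked the log of the ks-installer workload and found that connections to 10.233.0.1:443 time out.

4. On the physical machine, ping and telnet to 10.233.0.1:443 both succeed, no problem there.

5. Ran wget https://10.233.0.1:443/api?timeout=32s directly and found that the certificate issued by 'CN=kubernetes' apparently cannot be verified.

6. Manually ran wget https://10.233.0.1:443/api?timeout=32s --no-check-certificate, which returned 403.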
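
For anyone retracing steps 2 and 3, here is a rough sketch of the kubectl calls involved. The function names are made up for illustration; the label selector for the installer pod is the one I recall from the KubeSphere installation docs, so treat it as an assumption if your deployment labels differ.

```shell
# Step 2 (hypothetical helper): list pods in every namespace that are not
# in the Running phase. Completed jobs (phase Succeeded) will also show up.
list_unhealthy_pods() {
  kubectl get pods --all-namespaces '--field-selector=status.phase!=Running'
}

# Step 3 (hypothetical helper): print the ks-installer log.
# Assumption: the installer pod carries the label app=ks-install or
# app=ks-installer in the kubesphere-system namespace.
ks_installer_logs() {
  local pod
  pod="$(kubectl get pod -n kubesphere-system \
    -l 'app in (ks-install, ks-installer)' \
    -o jsonpath='{.items[0].metadata.name}')"
  kubectl logs -n kubesphere-system "$pod"
}
```

Both need a working kubectl context on the master node. Note that steps 4 to 6 were run on the host, while the timeouts are reported from inside pods, so a successful test on the host does not rule out a problem in the pod network.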

Could anyone tell me whether the installation failure is related to this, and how to fix it? Any help would be appreciated.

kidd-u · June 5, 2023

I checked, and every pod that fails to start does so because of 10.233.0.1:443: i/o timeout. But the firewalls are all off, and ping and telnet work fine. How can I fix this?

gary · June 7, 2023

Ran into the same problem: it got stuck here and failed after 2 hours.

Please wait for the installation to complete: >>--->

16:41:30 CST failed: [master]

error: Pipeline[CreateClusterPipeline] execute failed: Module[CheckResultModule] exec failed:

failed: [master] execute task timeout, Timeout=2h

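gary · June 9, 2023

gary It was a server environment problem. Tencent Cloud servers kept hitting this error; after switching to an Alibaba Cloud server the installation went through normally.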

jiangguilong2000 · July 8, 2023

I solved it. Mine is also a Tencent Cloud machine, and the fix is simple. Here is the answer.

Check the Kubernetes events:

kubectl get events -n kubesphere-system

LAST SEEN TYPE REASON OBJECT MESSAGE

3m20s Warning FailedScheduling pod/ks-installer-645578d874-wjtct 0/1 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.

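What does this tell us?

The only node available to the default scheduler is the master node. By default, for safety reasons, Kubernetes does not allow ordinary pods to be scheduled onto a master node: the node carries the taint node-role.kubernetes.io/master, the ks-installer pod does not tolerate it, and so scheduling fails with this error.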
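
If the event list is long, a small filter can pick out the pods that are stuck for this reason. This is only a sketch: pods_blocked_by_taint is a made-up helper name, and it assumes the default column layout of kubectl get events shown above.

```shell
# Hypothetical helper: reads `kubectl get events` output on stdin and prints
# the objects whose scheduling failed because of an untolerated taint.
# Assumes the default columns: LAST SEEN, TYPE, REASON, OBJECT, MESSAGE.
pods_blocked_by_taint() {
  awk '$2 == "Warning" && $3 == "FailedScheduling" && /had taint/ { print $4 }'
}

# Usage (needs a working kubectl context):
#   kubectl get events -n kubesphere-system | pods_blocked_by_taint
```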

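Allowing pods to be scheduled onto the master node with a command

1) For a single-machine setup, which is usually a standalone test environment, we can let pods be scheduled onto the master node with the following command:

kubectl taint nodes --all node-role.kubernetes.io/master-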

node/vm-0-11-centos untainted
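
To double-check that the taint is really gone, the taint keys of every node can be listed. A minimal sketch, where node_taints is just an illustrative name:

```shell
# Hypothetical helper: print each node together with the keys of its taints.
# A node without taints shows <none> in the TAINTS column.
node_taints() {
  kubectl get nodes \
    -o 'custom-columns=NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}
```

If node-role.kubernetes.io/master no longer appears for the node, the ks-installer pod should become schedulable there.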

Then go back and run the installation again, and it works:

./kk create cluster --with-kubernetes v1.22.12 --with-kubesphere v3.3.2

loveplxf · November 15, 2023

Let me share my problem as well.

Surface error:

Please wait for the installation to complete: >>--->

16:41:30 CST failed: [master]

error: Pipeline[CreateClusterPipeline] execute failed: Module[CheckResultModule] exec failed:

failed: [master] execute task timeout, Timeout=2h

Root cause analysis:

Run: journalctl -fu kubelet

Errors: 11月 15 11:28:50 tmaster2 kubelet[4106]: I1115 11:28:50.675831 4106 cni.go:239] "Unable to update cni config" err="no networks found in /etc/cni/net.d"

11月 15 11:28:52 tmaster2 kubelet[4106]: E1115 11:28:52.125041 4106 kubelet.go:2381] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"

11月 15 11:28:55 tmaster2 kubelet[4106]: I1115 11:28:55.676297 4106 cni.go:239] "Unable to update cni config" err="no networks found in /etc/cni/net.d"

11月 15 11:28:57 tmaster2 kubelet[4106]: E1115 11:28:57.134508 4106 kubelet.go:2381] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"

11月 15 11:28:57 tmaster2 kubelet[4106]: E1115 11:28:57.690914 4106 kubelet.go:1751] "Unable to attach or mount volumes for pod; skipping pod" err="unmounted volumes=[bpffs], unattached volumes=[kube-api-access-2grvf cni-net-dir policysync cni-bin-dir var-run-calico lib-modules cni-log-dir host-local-net-dir xtables-lock var-lib-calico bpffs sys-fs nodeproc]: timed out waiting for the condition" pod="kube-system/calico-node-tsqpc"

11月 15 11:28:57 tmaster2 kubelet[4106]: E1115 11:28:57.690970 4106 pod_workers.go:951] "Error syncing pod, skipping" err="unmounted volumes=[bpffs], unattached volumes=[kube-api-access-2grvf cni-net-dir policysync cni-bin-dir var-run-calico lib-modules cni-log-dir host-local-net-dir xtables-lock var-lib-calico bpffs sys-fs nodeproc]: timed out waiting for the condition" pod="kube-system/calico-node-tsqpc" podUID=cad158b0-d3d2-4ea2-b9a6-9a4b31d1404f
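
The "no networks found in /etc/cni/net.d" message means kubelet found no CNI configuration file in that directory, which fits with calico-node never starting. A quick way to confirm this on a node might look like the sketch below; cni_config_present is a made-up helper name, and the extensions checked (.conf, .conflist, .json) are the ones the CNI config loader looks for as far as I know.

```shell
# Hypothetical helper: succeed if the directory holds at least one CNI
# config file, fail otherwise. Defaults to the path from the kubelet log.
cni_config_present() {
  local dir="${1:-/etc/cni/net.d}" f
  for f in "$dir"/*.conf "$dir"/*.conflist "$dir"/*.json; do
    # An unmatched glob stays literal, so test that the file really exists.
    [ -f "$f" ] && return 0
  done
  return 1
}

# Usage on the affected node:
#   cni_config_present && echo "CNI config found" || echo "CNI config missing"
```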

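Analysis: Kubernetes 1.22.12 is too new for my system. My operating system is CentOS 7.5, whose kernel is too old, so installing Calico failed. The kernel needs to be upgraded to 4.4 or later.

uname -r shows my version: 3.10.0-862.el7.x86_64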
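
To check a machine against that threshold before installing, something like the following could be used. It is only a sketch: kernel_at_least is a made-up helper name, and the 4.4 minimum is the figure from my analysis above, not an official requirement I have verified.

```shell
# Hypothetical helper: succeed if a kernel release string (as printed by
# `uname -r`) is at least MAJOR.MINOR. Only the first two numeric fields
# of the release are compared.
kernel_at_least() {
  local release="$1" want_major="$2" want_minor="$3"
  local major rest minor
  major="${release%%.*}"        # text before the first dot, e.g. 3
  rest="${release#*.}"          # text after the first dot
  minor="${rest%%.*}"           # second field, e.g. 10
  minor="${minor%%[!0-9]*}"     # drop any non-numeric suffix
  [ "$major" -gt "$want_major" ] && return 0
  [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]
}

# Usage:
#   kernel_at_least "$(uname -r)" 4 4 || echo "kernel older than 4.4"
```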

loveplxf · November 15, 2023

loveplxf For upgrading the kernel, see this article:

https://www.cnblogs.com/Pigs-Will-Fly/p/17592712.html