Operating system information:
VM, CentOS 7.5
Offline deployment of k8s. A private Docker registry has already been set up, and all required images have been pulled into it. Docker version is v20.10.23, the cgroup driver has been switched to systemd, and swap is disabled.
The command executed:
./kk create cluster -f config-sample.yaml
The relevant parts of the config file:
  controlPlaneEndpoint:
    domain: lb.kubesphere.local
    address: "82.202.16.197"
    port: 6443
  kubernetes:
    version: v1.20.6
    imageRepo: kubesphere
    clusterName: test
    maxPods: 200
    nodeCidrMaskSize: 24
  network:
    plugin: calico
    calico:
      ipipMode: Always
      vxlanMode: Never
      vethMTU: 1440
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
  registry:
    registryMirrors: ["https://82.202.38.138"]
    insecureRegistries: []
    privateRegistry: 82.202.38.138
  addons: []
While installing k8s v1.20.6 with kk, I got the following error:
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher: Process exited with status 1
Following the error message, I checked the kubelet status; its log shows:
reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://lb.kubesphere.local:6443/api/v1/nodes?fieldSelector=metadata.name%3Dt10ax0k8smaster01&limit=500&resourceVersion=0": dial tcp 82.202.16.197:6443: i/o timeout
kubelet.go:449] kubelet nodes not sync
kubelet.go:449] kubelet nodes not sync
kubelet.go:449] kubelet nodes not sync
kubelet.go:2188] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
trace.go:205] Trace[1220375059]: "Reflector ListAndWatch" name:k8s.io/client-go/informers/factory.go:134 (11-Mar-2023 18:09:54.905) (total time: 30000ms):
Mar 11 18:10:24 t10ax0k8smaster01 kubelet[26315]: Trace[1220375059]: [30.000373997s] [30.000373997s] END
reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://lb.kubesphere.local:6443/api/v1/services?limit=500&resourceVersion=0": dial tcp 82.202.16.197:6443: i/o timeout
kubelet.go:449] kubelet nodes not sync
kubelet.go:449] kubelet nodes not sync
certificate_manager.go:437] Failed while requesting a signed certificate from the master: cannot create certificate signing request: Post "https://lb.kubesphere.local:6443/apis/certificates.k8s.io/v1/certificatesigningrequests": dial tcp 82.202.16.197:6443: i/o timeout
kubelet.go:449] kubelet nodes not sync
kubelet.go:449] kubelet nodes not sync
cni.go:239] Unable to update cni config: no networks found in /etc/cni/net.d
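The repeated "dial tcp 82.202.16.197:6443: i/o timeout" lines suggest the kubelet cannot reach the control-plane endpoint at all. A minimal reachability check that could be run on the master node (host and port taken from controlPlaneEndpoint in the config above; check_endpoint is just an illustrative helper name):

```shell
# Sketch: verify the control-plane endpoint is reachable from this node.
# Host/port come from controlPlaneEndpoint in config-sample.yaml.
check_endpoint() {
  local host="$1" port="$2"
  # /dev/tcp is a bash pseudo-device; timeout bounds the connection attempt.
  if timeout 3 bash -c "echo > /dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "reachable"
  else
    echo "unreachable"
  fi
}

check_endpoint 82.202.16.197 6443
getent hosts lb.kubesphere.local || echo "lb.kubesphere.local does not resolve"
```

If the port is unreachable or the domain does not resolve to 82.202.16.197 (e.g. a missing /etc/hosts entry), the kubelet can never reach the apiserver and kubeadm will time out exactly as shown.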
The reported reason is NetworkPluginNotReady, but I don't know at which stage the network plugin gets installed. I also checked the Docker containers; three are currently running, and their logs show no errors:
kube-scheduler
kube-controller-manager
kube-apiserver
The CNI plugins archive is located at /opt/kubekey/cni/v0.9.1/amd64/cni-plugins-linux-amd64-v0.9.1.tgz
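To rule out missing CNI binaries on the node, one could unpack that archive into the directory the kubelet conventionally reads plugins from (a sketch; /opt/cni/bin is the standard location, and install_cni is just an illustrative helper). Note, though, that the "cni config uninitialized" message is generally expected until the control plane is up, since Calico is only applied after kubeadm init succeeds, so this is unlikely to be the root cause here:

```shell
# Sketch: unpack a CNI plugin tarball into a target bin directory and
# list what was installed.
install_cni() {
  local tgz="$1" dest="$2"
  mkdir -p "$dest" && tar -xzf "$tgz" -C "$dest" && ls "$dest"
}

# On the failing node (archive path from the post; /opt/cni/bin is the
# conventional kubelet default):
# install_cni /opt/kubekey/cni/v0.9.1/amd64/cni-plugins-linux-amd64-v0.9.1.tgz /opt/cni/bin
```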
Could anyone advise how to resolve this issue? Many thanks.