When creating a deployment issue, please follow the template below. The more information you provide, the easier it is to get a timely answer.
A question you spent only one minute writing cannot expect someone else to spend half an hour answering.
Before posting, click the Preview (👀) button to the right of the Post Topic button to make sure the post is formatted correctly.
Operating system: CentOS 7.9
Kubernetes version: v1.23.7, multi-node
KubeSphere version: v3.3.0
What is the problem:
A multi-node online installation with kk fails with the error: [RestartETCD] exec failed after 3 retires: start etcd failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd" Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.: Process exited with status 1
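For clarity, the exact command KubeKey runs on each etcd node (quoted in the error above), and the diagnostics the message points to:

# command that times out on node1/node2/node3
sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd"
# diagnostics suggested by the error message (journalctl can be narrowed to the etcd unit with -u etcd)
systemctl status etcd.service
journalctl -xe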
The installation log is as follows:
[root@VM-8-6-centos kk]# ./kk create cluster -f config-sample.yaml
 _   __      _          _   __
| | / /     | |        | | / /
| |/ /  _   _| |__   ___| |/ /  ___ _   _
|    \| | | | '_ \ / _ \    \ / _ \ | | |
| |\  \ |_| | |_) |  __/ |\  \  __/ |_| |
\_| \_/\__,_|_.__/ \___\_| \_/\___|\__, |
                                    __/ |
                                   |___/
22:31:39 CST [GreetingsModule] Greetings
yes
22:31:43 CST message: [node3]
Greetings, KubeKey!
22:31:46 CST message: [plane]
Greetings, KubeKey!
22:31:50 CST message: [node1]
Greetings, KubeKey!
22:31:54 CST message: [node2]
Greetings, KubeKey!
22:31:54 CST success: [node3]
22:31:54 CST success: [plane]
22:31:54 CST success: [node1]
22:31:54 CST success: [node2]
22:31:54 CST [NodePreCheckModule] A pre-check on nodes
22:32:10 CST success: [plane]
22:32:10 CST success: [node2]
22:32:10 CST success: [node3]
22:32:10 CST success: [node1]
22:32:10 CST [ConfirmModule] Display confirmation form
+-------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+
| name  | sudo | curl | openssl | ebtables | socat | ipset | ipvsadm | conntrack | chrony | docker | containerd | nfs client | ceph client | glusterfs client | time         |
+-------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+
| plane | y    | y    | y       | y        | y     | y     |         | y         | y      |        |            |            |             |                  | EDT 10:32:06 |
| node1 | y    | y    | y       | y        | y     | y     |         | y         | y      |        |            |            |             |                  | EDT 10:32:10 |
| node2 | y    | y    | y       | y        | y     | y     |         | y         | y      |        |            |            |             |                  | EDT 10:32:09 |
| node3 | y    | y    | y       | y        | y     | y     |         | y         | y      |        |            |            |             |                  | EDT 10:32:10 |
+-------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+
This is a simple check of your environment.
Before installation, ensure that your machines meet all requirements specified at
https://github.com/kubesphere/kubekey#requirements-and-recommendations
Continue this installation? [yes/no]: 22:32:10 CST success: [LocalHost]
22:32:10 CST [NodeBinariesModule] Download installation binaries
22:32:10 CST message: [localhost]
downloading amd64 kubeadm v1.23.7 ...
22:32:10 CST message: [localhost]
kubeadm is existed
22:32:10 CST message: [localhost]
downloading amd64 kubelet v1.23.7 ...
22:32:11 CST message: [localhost]
kubelet is existed
22:32:11 CST message: [localhost]
downloading amd64 kubectl v1.23.7 ...
22:32:12 CST message: [localhost]
kubectl is existed
22:32:12 CST message: [localhost]
downloading amd64 helm v3.6.3 ...
22:32:12 CST message: [localhost]
helm is existed
22:32:12 CST message: [localhost]
downloading amd64 kubecni v0.9.1 ...
22:32:12 CST message: [localhost]
kubecni is existed
22:32:12 CST message: [localhost]
downloading amd64 crictl v1.24.0 ...
22:32:12 CST message: [localhost]
crictl is existed
22:32:12 CST message: [localhost]
downloading amd64 etcd v3.4.13 ...
22:32:12 CST message: [localhost]
etcd is existed
22:32:12 CST message: [localhost]
downloading amd64 docker 20.10.8 ...
22:32:13 CST message: [localhost]
docker is existed
22:32:13 CST success: [LocalHost]
22:32:13 CST [ConfigureOSModule] Prepare to init OS
22:32:34 CST success: [plane]
22:32:34 CST success: [node2]
22:32:34 CST success: [node1]
22:32:34 CST success: [node3]
22:32:41 CST [ConfigureOSModule] Generate init os script
22:32:41 CST success: [plane]
22:32:41 CST success: [node3]
22:32:41 CST success: [node1]
22:32:41 CST success: [node2]
22:32:41 CST [ConfigureOSModule] Exec init os script
22:32:43 CST stdout: [node2]
Permissive
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
kernel.pid_max = 65535
22:32:43 CST stdout: [node1]
Permissive
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
kernel.pid_max = 65535
22:32:44 CST stdout: [node3]
Permissive
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
kernel.pid_max = 65535
22:32:48 CST stdout: [plane]
Permissive
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
kernel.pid_max = 65535
22:32:48 CST success: [node2]
22:32:48 CST success: [node1]
22:32:48 CST success: [node3]
22:32:48 CST success: [plane]
22:32:48 CST [ConfigureOSModule] configure the ntp server for each node
22:32:48 CST skipped: [node3]
22:32:48 CST skipped: [plane]
22:32:48 CST skipped: [node1]
22:32:48 CST skipped: [node2]
22:32:48 CST [KubernetesStatusModule] Get kubernetes cluster status
22:32:50 CST success: [plane]
22:32:50 CST [InstallContainerModule] Sync docker binaries
22:53:47 CST success: [plane]
22:53:47 CST success: [node2]
22:53:47 CST success: [node3]
22:53:47 CST success: [node1]
22:53:47 CST [InstallContainerModule] Generate docker service
22:53:55 CST success: [plane]
22:53:55 CST success: [node2]
22:53:55 CST success: [node3]
22:53:55 CST success: [node1]
22:53:55 CST [InstallContainerModule] Generate docker config
22:54:03 CST success: [plane]
22:54:03 CST success: [node2]
22:54:03 CST success: [node1]
22:54:03 CST success: [node3]
22:54:03 CST [InstallContainerModule] Enable docker
22:54:08 CST success: [node2]
22:54:08 CST success: [node1]
22:54:08 CST success: [node3]
22:54:08 CST success: [plane]
22:54:08 CST [InstallContainerModule] Add auths to container runtime
22:54:09 CST skipped: [plane]
22:54:09 CST skipped: [node2]
22:54:09 CST skipped: [node3]
22:54:09 CST skipped: [node1]
22:54:09 CST [PullModule] Start to pull images on all nodes
22:54:09 CST message: [node3]
downloading image: kubesphere/pause:3.6
22:54:09 CST message: [plane]
downloading image: kubesphere/pause:3.6
22:54:09 CST message: [node1]
downloading image: kubesphere/pause:3.6
22:54:09 CST message: [node2]
downloading image: kubesphere/pause:3.6
22:54:11 CST message: [plane]
downloading image: kubesphere/kube-apiserver:v1.23.7
22:54:15 CST message: [node2]
downloading image: kubesphere/kube-proxy:v1.23.7
22:54:15 CST message: [node3]
downloading image: kubesphere/kube-proxy:v1.23.7
22:54:15 CST message: [node1]
downloading image: kubesphere/kube-proxy:v1.23.7
22:54:17 CST message: [plane]
downloading image: kubesphere/kube-controller-manager:v1.23.7
22:54:23 CST message: [plane]
downloading image: kubesphere/kube-scheduler:v1.23.7
22:54:27 CST message: [plane]
downloading image: kubesphere/kube-proxy:v1.23.7
22:54:34 CST message: [plane]
downloading image: coredns/coredns:1.8.6
22:54:35 CST message: [node1]
downloading image: coredns/coredns:1.8.6
22:54:37 CST message: [node2]
downloading image: coredns/coredns:1.8.6
22:54:37 CST message: [node3]
downloading image: coredns/coredns:1.8.6
22:54:38 CST message: [plane]
downloading image: kubesphere/k8s-dns-node-cache:1.15.12
22:54:44 CST message: [node1]
downloading image: kubesphere/k8s-dns-node-cache:1.15.12
22:54:45 CST message: [plane]
downloading image: calico/kube-controllers:v3.20.0
22:54:46 CST message: [node2]
downloading image: kubesphere/k8s-dns-node-cache:1.15.12
22:54:49 CST message: [node3]
downloading image: kubesphere/k8s-dns-node-cache:1.15.12
22:54:51 CST message: [plane]
downloading image: calico/cni:v3.20.0
22:55:01 CST message: [plane]
downloading image: calico/node:v3.20.0
22:55:05 CST message: [node2]
downloading image: calico/kube-controllers:v3.20.0
22:55:08 CST message: [node1]
downloading image: calico/kube-controllers:v3.20.0
22:55:11 CST message: [node3]
downloading image: calico/kube-controllers:v3.20.0
22:55:12 CST message: [plane]
downloading image: calico/pod2daemon-flexvol:v3.20.0
22:55:19 CST message: [node2]
downloading image: calico/cni:v3.20.0
22:55:22 CST message: [node1]
downloading image: calico/cni:v3.20.0
22:55:26 CST message: [node3]
downloading image: calico/cni:v3.20.0
22:55:42 CST message: [node2]
downloading image: calico/node:v3.20.0
22:55:46 CST message: [node1]
downloading image: calico/node:v3.20.0
22:55:51 CST message: [node3]
downloading image: calico/node:v3.20.0
22:56:09 CST message: [node2]
downloading image: calico/pod2daemon-flexvol:v3.20.0
22:56:15 CST message: [node1]
downloading image: calico/pod2daemon-flexvol:v3.20.0
22:56:23 CST message: [node3]
downloading image: calico/pod2daemon-flexvol:v3.20.0
22:56:31 CST success: [plane]
22:56:31 CST success: [node2]
22:56:31 CST success: [node1]
22:56:31 CST success: [node3]
22:56:31 CST [ETCDPreCheckModule] Get etcd status
22:56:34 CST success: [node1]
22:56:34 CST success: [node2]
22:56:34 CST success: [node3]
22:56:34 CST [CertsModule] Fetch etcd certs
22:56:34 CST success: [node1]
22:56:34 CST skipped: [node2]
22:56:34 CST skipped: [node3]
22:56:34 CST [CertsModule] Generate etcd Certs
[certs] Using existing ca certificate authority
[certs] Using existing node-plane certificate and key on disk
[certs] Using existing admin-node1 certificate and key on disk
[certs] Using existing member-node1 certificate and key on disk
[certs] Using existing admin-node2 certificate and key on disk
[certs] Using existing member-node2 certificate and key on disk
[certs] Using existing admin-node3 certificate and key on disk
[certs] Using existing member-node3 certificate and key on disk
22:56:34 CST success: [LocalHost]
22:56:34 CST [CertsModule] Synchronize certs file
22:58:29 CST success: [node2]
22:58:29 CST success: [node3]
22:58:29 CST success: [node1]
22:58:29 CST [CertsModule] Synchronize certs file to master
22:59:50 CST success: [plane]
22:59:50 CST [InstallETCDBinaryModule] Install etcd using binary
23:05:50 CST success: [node2]
23:05:50 CST success: [node1]
23:05:50 CST success: [node3]
23:05:50 CST [InstallETCDBinaryModule] Generate etcd service
23:05:57 CST success: [node2]
23:05:57 CST success: [node3]
23:05:57 CST success: [node1]
23:05:57 CST [InstallETCDBinaryModule] Generate access address
23:05:57 CST skipped: [node3]
23:05:57 CST success: [node1]
23:05:57 CST skipped: [node2]
23:05:57 CST [ETCDConfigureModule] Health check on exist etcd
23:05:57 CST skipped: [node3]
23:05:57 CST skipped: [node1]
23:05:57 CST skipped: [node2]
23:05:57 CST [ETCDConfigureModule] Generate etcd.env config on new etcd
23:06:18 CST success: [node1]
23:06:18 CST success: [node2]
23:06:18 CST success: [node3]
23:06:39 CST [ETCDConfigureModule] Refresh etcd.env config on all etcd
23:06:39 CST success: [node1]
23:06:39 CST success: [node2]
23:06:39 CST success: [node3]
23:06:39 CST [ETCDConfigureModule] Restart etcd
23:08:10 CST stdout: [node2]
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.
23:08:10 CST message: [node2]
start etcd failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd"
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.: Process exited with status 1
23:08:10 CST retry: [node2]
23:08:10 CST stdout: [node1]
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.
23:08:10 CST message: [node1]
start etcd failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd"
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.: Process exited with status 1
23:08:10 CST retry: [node1]
23:08:11 CST stdout: [node3]
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.
23:08:11 CST message: [node3]
start etcd failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd"
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.: Process exited with status 1
23:08:11 CST retry: [node3]
23:09:47 CST stdout: [node2]
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.
23:09:47 CST message: [node2]
start etcd failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd"
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.: Process exited with status 1
23:09:47 CST retry: [node2]
23:09:47 CST stdout: [node1]
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.
23:09:47 CST message: [node1]
start etcd failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd"
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.: Process exited with status 1
23:09:47 CST retry: [node1]
23:09:47 CST stdout: [node3]
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.
23:09:47 CST message: [node3]
start etcd failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd"
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.: Process exited with status 1
23:09:47 CST retry: [node3]
23:11:23 CST stdout: [node2]
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.
23:11:23 CST message: [node2]
start etcd failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd"
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.: Process exited with status 1
23:11:23 CST stdout: [node1]
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.
23:11:23 CST message: [node1]
start etcd failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd"
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.: Process exited with status 1
23:11:23 CST stdout: [node3]
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.
23:11:23 CST message: [node3]
start etcd failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd"
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.: Process exited with status 1
23:11:23 CST failed: [node2]
23:11:23 CST failed: [node1]
23:11:23 CST failed: [node3]
error: Pipeline[CreateClusterPipeline] execute failed: Module[ETCDConfigureModule] exec failed:
failed: [node2] [RestartETCD] exec failed after 3 retires: start etcd failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd"
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.: Process exited with status 1
failed: [node1] [RestartETCD] exec failed after 3 retires: start etcd failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd"
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.: Process exited with status 1
failed: [node3] [RestartETCD] exec failed after 3 retires: start etcd failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd"
Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.: Process exited with status 1
node1, node2, and node3 are the etcd nodes. Running journalctl -xe on them shows the following:
etcd[40808]: rejected connection from "100.118.47.12:38784"
etcd[40808]: rejected connection from "100.118.6.168:34562"
etcd[40808]: rejected connection from "100.118.6.168:34570"
etcd[40808]: rejected connection from "100.118.6.168:34574"
etcd[40808]: rejected connection from "100.118.47.12:38800"
etcd[40808]: rejected connection from "100.118.47.12:38802"
etcd[40808]: rejected connection from "100.118.6.168:34582"
The network between the nodes is reachable.
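For reference, this is the kind of manual health check that can be run on one of the etcd nodes after the failed attempt. It is a sketch only: the endpoint IP is a placeholder, and the certificate paths and the /etc/etcd.env location are assumptions based on KubeKey's default layout (the names match the admin-node1/member-node1 certs shown in the log), so adjust them to whatever the generated etcd.env actually references:

# sketch: query etcd health directly on node1 (paths and names assumed, not verified)
export ETCDCTL_API=3
etcdctl --endpoints=https://<node1-ip>:2379 \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  --cert=/etc/ssl/etcd/ssl/admin-node1.pem \
  --key=/etc/ssl/etcd/ssl/admin-node1-key.pem \
  endpoint health
# check which peer/client addresses etcd was configured with
grep -E 'LISTEN|ADVERTISE|INITIAL_CLUSTER' /etc/etcd.env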