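Operating system info (e.g. virtual machine/physical machine): CentOS 8.4, 16C/64G
Kubernetes version: v1.20.4. Multi-node (3 masters + N nodes).
KubeSphere version: v3.1.1. Online installation.
Problem: adding nodes cannot continue.
# Command executed. I had added a single node this way before; this time I am adding new nodes again, and the run gets stuck on one machine and cannot proceed. I don't know how to troubleshoot it.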
./kk add nodes -f whbj-update.yaml
# Error output:
INFO[09:17:22 CST] **Configuring operating system ...**
.......
......
......
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
ERRO[09:03:40 CST] Failed to override hostname: Failed to exec command: sudo -E /bin/sh -c "hostnamectl set-hostname whbj002 && sed -i '/^127.0.1.1/s/.*/127.0.1.1 whbj002/g' /etc/hosts"
Could not set property: Failed to activate service 'org.freedesktop.hostname1': timed out (service_start_timeout=25000ms): Process exited with status 1 node=172.18.44.25
WARN[09:03:40 CST] Task failed ...
WARN[09:03:40 CST] error: interrupted by error
Error: Failed to init OS: interrupted by error
Usage:
kk add nodes [flags]
PS: The node where the problem occurs, whbj002, happens to be the very node on which I ran this command...

On whbj002 I checked how the relevant commands behave, and found that the hostname service (systemd-hostnamed) fails to start:
[root@whbj002 kubesphere]# systemctl status systemd-hostnamed
● systemd-hostnamed.service - Hostname Service
Loaded: loaded (/usr/lib/systemd/system/systemd-hostnamed.service; static; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2021-09-14 09:51:32 CST; 2min 26s ago
Docs: man:systemd-hostnamed.service(8)
man:hostname(5)
man:machine-info(5)
https://www.freedesktop.org/wiki/Software/systemd/hostnamed
Process: 1220905 ExecStart=/usr/lib/systemd/systemd-hostnamed (code=exited, status=226/NAMESPACE)
Main PID: 1220905 (code=exited, status=226/NAMESPACE)
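
For reference, exit status 226/NAMESPACE is documented in systemd.exec(5) as a failure to set up mount namespacing for the service, so the unit's sandboxing settings and its journal are the next things worth looking at. Below is a minimal, read-only sketch for collecting that information without restarting anything. The function names `extract_exit_status` and `diagnose_unit` are hypothetical helpers made up for illustration, not part of kk or systemd, and the sketch assumes the `status=NNN/NAME` format shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read `systemctl status` text on stdin and print the
# first "NNN/NAME" exit status it finds (e.g. 226/NAMESPACE).
extract_exit_status() {
    sed -n 's/.*(code=exited, status=\([0-9]*\/[A-Z_]*\)).*/\1/p' | head -n 1
}

# Hypothetical helper: gather read-only diagnostics for one unit.
# It only queries systemd and the journal; it does not restart or modify anything.
diagnose_unit() {
    unit="$1"
    echo "exit status: $(systemctl status "$unit" --no-pager 2>/dev/null | extract_exit_status)"
    # Recent log lines for the unit, which usually name the path or option that failed.
    journalctl -u "$unit" -n 50 --no-pager
    # The effective unit file, including sandboxing options such as PrivateTmp= or ProtectSystem=.
    systemctl cat "$unit"
}
```

On the affected node this would be run as `diagnose_unit systemd-hostnamed`. It only narrows down where the namespace setup fails; it is not a fix.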
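Current progress:
Found a related issue on GitHub (awkwardly, it did not actually help): kubernetes/minikube#727
PS: This machine is still running some services and cannot be rebooted casually, so I am still thinking it over. If anyone knows what is going on, please let me know. Thanks.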