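Kubernetes v1.22.12 with KubeSphere v3.4.1 (--with-kubesphere v3.4.1), installed online with kk.
On the Cluster Status page under Monitoring & Alerting, etcd shows a yellow indicator. Does that mean something is wrong?
How should I troubleshoot and fix it? Thanks.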
Use the etcdctl tool to check the cluster's health:
cat /etc/etcd.env
ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 --cacert=/etc/ssl/etcd/ssl/ca.crt --cert=/etc/ssl/etcd/ssl/server.crt --key=/etc/ssl/etcd/ssl/server.key endpoint health
redscholar
root@kk-master1-41:/data# cat /etc/etcd.env
# Environment file for etcd v3.5.13
ETCD_DATA_DIR=/var/lib/etcd
ETCD_ADVERTISE_CLIENT_URLS=https://172.16.78.41:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://172.16.78.41:2380
ETCD_INITIAL_CLUSTER_STATE=existing
ETCD_METRICS=basic
ETCD_LISTEN_CLIENT_URLS=https://172.16.78.41:2379,https://127.0.0.1:2379
ETCD_INITIAL_CLUSTER_TOKEN=k8s_etcd
ETCD_LISTEN_PEER_URLS=https://172.16.78.41:2380
ETCD_NAME=etcd-kk-master1-41
ETCD_PROXY=off
ETCD_ENABLE_V2=true
ETCD_INITIAL_CLUSTER=etcd-kk-master1-41=https://172.16.78.41:2380,etcd-kk-master2-42=https://172.16.78.42:2380,etcd-kk-master3-43=https://172.16.78.43:2380
ETCD_ELECTION_TIMEOUT=5000
ETCD_HEARTBEAT_INTERVAL=250
ETCD_AUTO_COMPACTION_RETENTION=8
ETCD_SNAPSHOT_COUNT=10000
# TLS settings
ETCD_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.pem
ETCD_CERT_FILE=/etc/ssl/etcd/ssl/member-kk-master1-41.pem
ETCD_KEY_FILE=/etc/ssl/etcd/ssl/member-kk-master1-41-key.pem
ETCD_CLIENT_CERT_AUTH=true
ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.pem
ETCD_PEER_CERT_FILE=/etc/ssl/etcd/ssl/member-kk-master1-41.pem
ETCD_PEER_KEY_FILE=/etc/ssl/etcd/ssl/member-kk-master1-41-key.pem
ETCD_PEER_CLIENT_CERT_AUTH=true
# CLI settings
ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
ETCDCTL_CACERT=/etc/ssl/etcd/ssl/ca.pem
ETCDCTL_KEY=/etc/ssl/etcd/ssl/admin-kk-master1-41-key.pem
ETCDCTL_CERT=/etc/ssl/etcd/ssl/admin-kk-master1-41.pem
root@kk-master1-41:/data#
root@kk-master1-41:/data#
root@kk-master1-41:/data# ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 --cacert=/etc/ssl/etcd/ssl/ca.crt --cert=/etc/ssl/etcd/ssl/server.crt --key=/etc/ssl/etcd/ssl/server.key endpoint health
Error: open /etc/ssl/etcd/ssl/server.crt: no such file or directory
root@kk-master1-41:/data#
It says the file cannot be found:
Error: open /etc/ssl/etcd/ssl/server.crt: no such file or directory
What is the cause? Thanks.
The certificates configured in your /etc/etcd.env are:
# CLI settings
ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
ETCDCTL_CACERT=/etc/ssl/etcd/ssl/ca.pem
ETCDCTL_KEY=/etc/ssl/etcd/ssl/admin-kk-master1-41-key.pem
ETCDCTL_CERT=/etc/ssl/etcd/ssl/admin-kk-master1-41.pem
Replace the certificate paths in the command with these.
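As a rough sketch of the same idea, the ETCDCTL_* lines in /etc/etcd.env can be loaded straight into the environment so the paths do not have to be retyped. etcdctl reads ETCDCTL_-prefixed environment variables as flag values; the helper names etcdctl_settings and etcd_health are made up for this example, and it assumes the env file contains only plain KEY=VALUE lines like the one posted above.

```shell
#!/bin/sh
# Print only the ETCDCTL_* client settings from an etcd env file.
etcdctl_settings() {
    grep '^ETCDCTL_' "$1"
}

# Run "etcdctl endpoint health" with the client settings from the env file.
# Hypothetical helper; the default path is the one shown in this thread.
etcd_health() {
    env_file="${1:-/etc/etcd.env}"
    (
        # Export every variable assigned while loading the settings.
        set -a
        eval "$(etcdctl_settings "$env_file")"
        set +a
        # No --endpoints/--cacert/--cert/--key flags here: passing both a flag
        # and the matching ETCDCTL_* variable makes etcdctl report a conflict.
        ETCDCTL_API=3 etcdctl endpoint health
    )
}
```

Running etcd_health on a node would then use the https endpoint and the admin certificate already listed under "CLI settings".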
redscholar
root@kk-master1-41:/# ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 --cacert=/etc/ssl/etcd/ssl/ca.pem --cert=/etc/ssl/etcd/ssl/admin-kk-master1-41.pem --key=/etc/ssl/etcd/ssl/admin-kk-master1-41-key.pem endpoint health
{"level":"warn","ts":"2025-05-20T12:23:56.447465+0800","logger":"client","caller":"v3@v3.5.13/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00015e000/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: read tcp 127.0.0.1:47868->127.0.0.1:2379: read: connection reset by peer\""}
http://127.0.0.1:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Error: unhealthy cluster
root@kk-master1-41:/#
Now it reports this error.
Thanks for your guidance.
root@kk-master1-41:/var/lib/etcd/member# ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/ssl/etcd/ssl/ca.pem \
--cert=/etc/ssl/etcd/ssl/admin-kk-master1-41.pem \
--key=/etc/ssl/etcd/ssl/admin-kk-master1-41-key.pem \
endpoint health
https://127.0.0.1:2379 is healthy: successfully committed proposal: took = 16.209597ms
root@kk-master1-41:/var/lib/etcd/member#
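The command above works. Does that mean it has to be https? What should I change? Thanks.
Two issues. First, in the command you gave me, one of the two hyphens in each flag was filtered out, so it did not work as posted. Second, http does not work, but https does.
However, the KubeSphere console still shows etcd in yellow, so its status is presumably still not normal.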
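On the http versus https question: in the /etc/etcd.env posted earlier, ETCD_LISTEN_CLIENT_URLS lists only https:// URLs, so this etcd serves clients over TLS only, and a plain http:// endpoint ends in the "connection reset by peer" seen above. A small sketch for confirming the scheme on a node (the function name listen_client_schemes is made up for this example):

```shell
#!/bin/sh
# Print the distinct URL scheme(s) etcd listens on for client traffic,
# read from an etcd env file (default path taken from this thread).
listen_client_schemes() {
    env_file="${1:-/etc/etcd.env}"
    grep '^ETCD_LISTEN_CLIENT_URLS=' "$env_file" \
        | cut -d= -f2- \
        | tr ',' '\n' \
        | sed 's|://.*||' \
        | sort -u
}
```

If this prints only https, nothing needs changing on the etcd side; the etcdctl command just has to use https://127.0.0.1:2379 as its endpoint.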
yay Check all three nodes.
redscholar
https://127.0.0.1:2379 is healthy: successfully committed proposal: took = 14.778182ms
All three nodes report this:
root@kk-master1-41:/var/lib/etcd/member# ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/ssl/etcd/ssl/ca.pem \
--cert=/etc/ssl/etcd/ssl/admin-kk-master1-41.pem \
--key=/etc/ssl/etcd/ssl/admin-kk-master1-41-key.pem \
endpoint health
https://127.0.0.1:2379 is healthy: successfully committed proposal: took = 14.778182ms
root@kk-master1-41:/var/lib/etcd/member#
root@kk-master2-42:/# ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/ssl/etcd/ssl/ca.pem \
--cert=/etc/ssl/etcd/ssl/admin-kk-master2-42.pem \
--key=/etc/ssl/etcd/ssl/admin-kk-master2-42-key.pem \
endpoint health
https://127.0.0.1:2379 is healthy: successfully committed proposal: took = 5.56972ms
root@kk-master2-42:/#
root@kk-master3-43:/# ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/ssl/etcd/ssl/ca.pem \
--cert=/etc/ssl/etcd/ssl/admin-kk-master3-43.pem \
--key=/etc/ssl/etcd/ssl/admin-kk-master3-43-key.pem \
endpoint health
https://127.0.0.1:2379 is healthy: successfully committed proposal: took = 15.389317ms
root@kk-master3-43:/#
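Instead of logging in to each node, all three members could probably be checked from one node by passing every client endpoint to etcdctl at once. The sketch below derives the client endpoints from the ETCD_INITIAL_CLUSTER value in /etc/etcd.env. It assumes every member serves clients on port 2379 (only confirmed for master1 in this thread) and that the master1 admin certificate is accepted by the other members; client_endpoints is a made-up helper name.

```shell
#!/bin/sh
# Turn an ETCD_INITIAL_CLUSTER value (name=peer-url pairs on port 2380)
# into a comma-separated list of client endpoints on port 2379.
# Assumption: each member's client port is 2379.
client_endpoints() {
    printf '%s\n' "$1" \
        | tr ',' '\n' \
        | sed -e 's/^[^=]*=//' -e 's/:2380$/:2379/' \
        | paste -s -d, -
}

# Example use on kk-master1-41 (not executed here):
#   . /etc/etcd.env
#   ETCDCTL_API=3 etcdctl \
#     --endpoints="$(client_endpoints "$ETCD_INITIAL_CLUSTER")" \
#     --cacert=/etc/ssl/etcd/ssl/ca.pem \
#     --cert=/etc/ssl/etcd/ssl/admin-kk-master1-41.pem \
#     --key=/etc/ssl/etcd/ssl/admin-kk-master1-41-key.pem \
#     endpoint health
```

Replacing "endpoint health" with "endpoint status --write-out=table" would additionally show which member is the leader and each member's DB size.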
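Your etcd cluster is healthy. External etcd is not monitored by default.
You can create a ServiceMonitor named etcd in the kubesphere-monitoring-system namespace to monitor the external etcd,
or enable etcd monitoring in the cc (the ks-installer ClusterConfiguration):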
https://github.com/kubesphere/ks-installer/blob/cdd3d677f2bbdcd8c97186bf569cbc9c1deda82d/deploy/cluster-configuration.yaml#L17-L21
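redscholar Do you mean that, by default, KubeSphere does not monitor the etcd that kk installs by default?
With my configuration as it is, it is not monitored either.
How can I make kk install etcd the traditional way, inside the kube-system namespace, rather than as a system service?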
yay Set monitoring to true and change endpointIps to the actual IPs.
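One possible way to apply that change without editing by hand is a merge patch on the ClusterConfiguration. This is only a sketch: it assumes the object is named ks-installer in the kubesphere-system namespace (the ks-installer default), that endpointIps takes a comma-separated list, and that the three IPs are the ones from ETCD_INITIAL_CLUSTER above. The helper name etcd_monitoring_patch is made up.

```shell
#!/bin/sh
# Build a JSON merge patch that turns on etcd monitoring and sets the
# etcd endpoint IPs (comma-separated) in the ClusterConfiguration spec.
etcd_monitoring_patch() {
    printf '{"spec":{"etcd":{"monitoring":true,"endpointIps":"%s"}}}\n' "$1"
}

# Example use (not executed here; assumes the default ks-installer names):
#   kubectl -n kubesphere-system patch cc ks-installer --type merge \
#     -p "$(etcd_monitoring_patch 172.16.78.41,172.16.78.42,172.16.78.43)"
```

The same two fields can also be changed interactively with kubectl -n kubesphere-system edit cc ks-installer.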