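After the installation had been working normally for a while, login started failing with an error about obtaining the ks-apiserver token. After recreating some pods from the backend, login works again, but the console reports 500 Internal Server Error.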

lldhsds


Operating system information
Virtual machines, CentOS 7.9, 8 CPU cores, 16 GB RAM

Kubernetes version information

[root@kubesphere1 ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.15", GitCommit:"b84cb8ab29366daa1bba65bc67f54de2f6c34848", GitTreeState:"clean", BuildDate:"2022-12-08T10:49:13Z", GoVersion:"go1.17.13", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.15", GitCommit:"b84cb8ab29366daa1bba65bc67f54de2f6c34848", GitTreeState:"clean", BuildDate:"2022-12-08T10:42:57Z", GoVersion:"go1.17.13", Compiler:"gc", Platform:"linux/amd64"}

Container runtime
Docker 24.0.6 (taken from the CONTAINER-RUNTIME column of the `kubectl get node -o wide` output in the operation log)

KubeSphere version information
KubeSphere version: v3.4.1

Offline installation on Linux using kk.

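What is the problem
Error 1: failed, reason: getaddrinfo EAI_AGAIN ks-apiserver.kubesphere-system

In similar reports, the usual cause is that CoreDNS is unavailable, which breaks the connection from the console to the apiserver. Here, however, the CoreDNS pods all report a normal status.

Error 2: the console web UI shows "500 Internal Server Error"

Request URL: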

http://10.210.10.231:30880/kapis/resources.kubesphere.io/v1alpha3/namespaces?labelSelector=%21kubesphere.io%2Fdevopsproject&sortBy=createTime&limit=10

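ks-apiserver error: dial tcp: lookup redis.kubesphere-system.svc: i/o timeout

ks-controller-manager error: LDAP Result Code 200 "Network Error": dial tcp: lookup openldap.kubesphere-system.svc on 169.254.25.10:53: read udp 10.233.76.135:42640->169.254.25.10:53: i/o timeout

The ks-apiserver and ks-controller-manager errors come from pods on node kubesphere3; the same components on nodes kubesphere1 and kubesphere2 run normally.

Some of the troubleshooting steps are recorded below: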

# Environment information
[root@kubesphere1 ~]# kubectl get node -o wide
NAME          STATUS   ROLES                         AGE     VERSION    INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION           CONTAINER-RUNTIME
kubesphere1   Ready    control-plane,master,worker   6d19h   v1.23.15   10.210.10.231   <none>        CentOS Linux 7 (Core)   3.10.0-1160.el7.x86_64   docker://24.0.6
kubesphere2   Ready    control-plane,master,worker   6d19h   v1.23.15   10.210.10.232   <none>        CentOS Linux 7 (Core)   3.10.0-1160.el7.x86_64   docker://24.0.6
kubesphere3   Ready    control-plane,master,worker   6d19h   v1.23.15   10.210.10.233   <none>        CentOS Linux 7 (Core)   3.10.0-1160.el7.x86_64   docker://24.0.6
[root@kubesphere1 ~]# kubectl top node
NAME          CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
kubesphere1   664m         8%     9311Mi          63%
kubesphere2   728m         9%     9423Mi          64%
kubesphere3   680m         8%     6391Mi          43%


# ks-apiserver and ks-controller-manager on node kubesphere3 are unhealthy; their logs show timeouts when connecting to redis and ldap.
[root@kubesphere1 ~]# kubectl get pod -n kubesphere-system -o wide
NAME                                     READY   STATUS             RESTARTS        AGE     IP               NODE          NOMINATED NODE   READINESS GATES
ks-apiserver-6f65d5c8b-cb8pc             1/1     Running            7 (14h ago)     14h     10.233.124.92    kubesphere2   <none>           <none>
ks-apiserver-6f65d5c8b-g2748             1/1     Running            20 (14h ago)    6d18h   10.233.107.141   kubesphere1   <none>           <none>
ks-apiserver-6f65d5c8b-gtgtt             0/1     CrashLoopBackOff   7 (2m2s ago)    14m     10.233.76.136    kubesphere3   <none>           <none>
ks-console-844747bfd6-6rgbg              1/1     Running            4 (14h ago)     6d18h   10.233.107.140   kubesphere1   <none>           <none>
ks-console-844747bfd6-72xns              1/1     Running            4 (14h ago)     6d18h   10.233.76.110    kubesphere3   <none>           <none>
ks-console-844747bfd6-xq6nd              1/1     Running            2 (14h ago)     5d19h   10.233.124.58    kubesphere2   <none>           <none>
ks-controller-manager-56758c7878-6dn6v   1/1     Running            16 (14h ago)    6d18h   10.233.107.127   kubesphere1   <none>           <none>
ks-controller-manager-56758c7878-plt7d   0/1     CrashLoopBackOff   6 (3m22s ago)   14m     10.233.76.135    kubesphere3   <none>           <none>
ks-controller-manager-56758c7878-stxcf   1/1     Running            6 (14h ago)     14h     10.233.124.85    kubesphere2   <none>           <none>
ks-installer-b7b88cb58-67828             1/1     Running            5 (14h ago)     6d18h   10.233.107.132   kubesphere1   <none>           <none>
minio-676f77b998-47jk9                   1/1     Running            2 (14h ago)     5d19h   10.233.124.69    kubesphere2   <none>           <none>
openldap-0                               1/1     Running            5 (14h ago)     6d12h   10.233.76.111    kubesphere3   <none>           <none>
openldap-1                               1/1     Running            5 (14h ago)     6d12h   10.233.107.145   kubesphere1   <none>           <none>
openpitrix-import-job-kkrmk              0/1     Completed          0               14h     10.233.124.95    kubesphere2   <none>           <none>
redis-54b56679bd-5l7p8                   1/1     Running            2 (14h ago)     5d19h   10.233.124.68    kubesphere2   <none>           <none>
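
To make the per-node pattern in a listing like this easier to see, the rows can be filtered down to pods that are not fully ready. This is only a sketch: `unready_by_node` is a hypothetical helper name, and it assumes the `-o wide` layout shown above, where the NOMINATED NODE and READINESS GATES columns are single tokens so that NODE is always the third field from the end.

```shell
# Print "<node> <pod> <status>" for every pod whose READY count is short.
# Expects `kubectl get pod -n <namespace> -o wide --no-headers` output on stdin.
unready_by_node() {
  awk '{
    split($2, ready, "/")                 # READY column, e.g. "0/1"
    if ($3 != "Completed" && ready[1] != ready[2])
      print $(NF-2), $1, $3               # NODE is the third field from the end
  }' | sort
}

# Usage (needs cluster access):
#   kubectl get pod -n kubesphere-system -o wide --no-headers | unready_by_node
```

Counting from the end of the row avoids the RESTARTS column, which may or may not contain an extra "(14h ago)" part.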

[root@kubesphere1 ~]# kubectl logs -fn kubesphere-system ks-apiserver-6f65d5c8b-gtgtt
W0821 04:36:20.187645       1 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
W0821 04:36:20.189389       1 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
E0821 04:36:25.198973       1 cache.go:69] failed to create cache, error: dial tcp: lookup redis.kubesphere-system.svc: i/o timeout
E0821 04:36:25.199042       1 run.go:74] "command failed" err="failed to create cache, error: dial tcp: lookup redis.kubesphere-system.svc: i/o timeout"
[root@kubesphere1 ~]# kubectl logs -fn kubesphere-system ks-apiserver ks-controller-manager-56758c7878-plt7d
Error from server (NotFound): pods "ks-apiserver" not found
[root@kubesphere1 ~]# kubectl logs -fn kubesphere-system  ks-controller-manager-56758c7878-plt7d
W0821 04:34:28.181726       1 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0821 04:34:28.183376       1 server.go:197] setting up manager
I0821 04:34:28.227481       1 listener.go:44] "controller-runtime/metrics: Metrics server is starting to listen" addr=":8080"
F0821 04:35:05.244700       1 server.go:219] unable to register controllers to the manager: failed to connect to ldap service, please check ldap status, error: factory is not able to fill the pool: LDAP Result Code 200 "Network Error": dial tcp: lookup openldap.kubesphere-system.svc on 169.254.25.10:53: read udp 10.233.76.135:42640->169.254.25.10:53: i/o timeout
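
Both crash logs fail at a `lookup <name>` step, which points at name resolution rather than at redis or openldap themselves. A small filter can pull the affected names out of the logs. This is a sketch; `failed_lookups` is a hypothetical helper name, and it assumes the messages use the `lookup <host>` wording seen above.

```shell
# Print each distinct hostname that appears after "lookup " in log text on stdin.
failed_lookups() {
  grep -o 'lookup [^ :]*' | awk '{print $2}' | sort -u
}

# Usage (needs cluster access):
#   kubectl logs -n kubesphere-system ks-apiserver-6f65d5c8b-gtgtt | failed_lookups
#   kubectl logs -n kubesphere-system ks-controller-manager-56758c7878-plt7d | failed_lookups
```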

# Test: the ks-console pod on the same node (kubesphere3) cannot reach redis or openldap
[root@kubesphere1 ~]# kubectl exec -itn kubesphere-system ks-console-844747bfd6-72xns sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.

/opt/kubesphere/console $ ping  openldap.kubesphere-system.svc
^C
/opt/kubesphere/console $ ping redis.kubesphere-system.svc
^C
/opt/kubesphere/console $ cat /etc/resolv.conf
nameserver 169.254.25.10
search kubesphere-system.svc.cluster.local svc.cluster.local cluster.local test.com
options ndots:5
/opt/kubesphere/console $ ping 169.254.25.10
PING 169.254.25.10 (169.254.25.10): 56 data bytes
64 bytes from 169.254.25.10: seq=0 ttl=42 time=0.091 ms
^C
--- 169.254.25.10 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.091/0.091/0.091 ms
/opt/kubesphere/console $ exit
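
The successful ping to 169.254.25.10 only exercises ICMP, so it does not show whether a DNS query over UDP port 53 is answered. A direct query against the nameserver from the pod's resolv.conf is a closer test. The sketch below assumes the console image ships `nslookup` (busybox images usually do); `nameservers` is a hypothetical helper name.

```shell
# Print the nameserver addresses found in resolv.conf content on stdin.
nameservers() {
  awk '$1 == "nameserver" {print $2}'
}

# Usage (needs cluster access; assumes nslookup exists in the image):
#   kubectl exec -n kubesphere-system ks-console-844747bfd6-72xns -- cat /etc/resolv.conf | nameservers
#   kubectl exec -n kubesphere-system ks-console-844747bfd6-72xns -- nslookup redis.kubesphere-system.svc 169.254.25.10
```

If the query times out while ping succeeds, the node-local resolver is reachable but is not answering, or cannot reach its upstream.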

# redis, openldap and coredns all report a normal status
[root@kubesphere1 ~]# kubectl get pod -A | grep "redis\|openldap\|coredns"
argocd                         devops-argocd-redis-99c6d77c5-k4cs5                            1/1     Running            2 (15h ago)      6d13h
kube-system                    coredns-86688d9f48-cxhcl                                       1/1     Running            0                15h
kube-system                    coredns-86688d9f48-xb6jf                                       1/1     Running            0                15h
kubesphere-system              openldap-0                                                     1/1     Running            5 (15h ago)      6d13h
kubesphere-system              openldap-1                                                     1/1     Running            5 (15h ago)      6d13h
kubesphere-system              redis-54b56679bd-5l7p8                                         1/1     Running            2 (15h ago)      5d20h


# Test: a pod on another node can reach redis
[root@kubesphere1 ~]# kubectl exec -itn kubesphere-system ks-console-844747bfd6-6rgbg sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/opt/kubesphere/console $ ping redis.kubesphere-system.svc
PING redis.kubesphere-system.svc (10.233.10.14): 56 data bytes
64 bytes from 10.233.10.14: seq=0 ttl=42 time=0.077 ms
^C
--- redis.kubesphere-system.svc ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.077/0.077/0.077 ms
/opt/kubesphere/console $ exit

# calico status looks normal
[root@kubesphere1 ~]# kubectl get pod -A | grep calico
kube-system                    calico-kube-controllers-7f5795c4bc-c8bhp                       1/1     Running            4 (14h ago)       6d18h
kube-system                    calico-node-4p9ls                                              1/1     Running            0                 14h
kube-system                    calico-node-g7srh                                              1/1     Running            4 (14h ago)       6d18h
kube-system                    calico-node-sczkt                                              1/1     Running            0                 14h
[root@kubesphere1 ~]# kubectl get pod -A -o wide| grep calico
kube-system                    calico-kube-controllers-7f5795c4bc-c8bhp                       1/1     Running            4 (14h ago)       6d18h   10.233.107.128   kubesphere1
kube-system                    calico-node-4p9ls                                              1/1     Running            0                 14h     10.210.10.233    kubesphere3
kube-system                    calico-node-g7srh                                              1/1     Running            4 (14h ago)       6d18h   10.210.10.231    kubesphere1
kube-system                    calico-node-sczkt                                              1/1     Running            0                 14h     10.210.10.232    kubesphere2
# Recreating the calico pod on node kubesphere3 did not fix the problem
[root@kubesphere1 ~]# kubectl delete pod -n kube-system calico-node-4p9ls
pod "calico-node-4p9ls" deleted
[root@kubesphere1 ~]# kubectl exec -itn kubesphere-system ks-console-844747bfd6-72xns sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/opt/kubesphere/console $ ping  redis.kubesphere-system.svc
ping: bad address 'redis.kubesphere-system.svc'

lldhsds

The calico status checks are normal:

[root@kubesphere1 ~]# calicoctl node status
Calico process is running.

IPv4 BGP status
+---------------+-------------------+-------+----------+-------------+
| PEER ADDRESS  |     PEER TYPE     | STATE |  SINCE   |    INFO     |
+---------------+-------------------+-------+----------+-------------+
| 10.210.10.232 | node-to-node mesh | up    | 17:56:08 | Established |
| 10.210.10.233 | node-to-node mesh | up    | 08:41:55 | Established |
+---------------+-------------------+-------+----------+-------------+

IPv6 BGP status
No IPv6 peers found.

[root@kubesphere1 ~]# calicoctl get nodes
NAME
kubesphere1
kubesphere2
kubesphere3

lldhsds

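Judging from the logs, this is a DNS resolution problem. On checking, CoreDNS was running only on nodes kubesphere1 and kubesphere2. After scaling CoreDNS to 3 replicas, the problem is resolved and has not recurred so far.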

[root@kubesphere1 ~]# kubectl get pod -A -o wide | grep coredns
kube-system                    coredns-86688d9f48-cxhcl                                       1/1     Running            0                4d16h   10.233.107.146   kubesphere1   <none>           <none>
kube-system                    coredns-86688d9f48-xb6jf                                       1/1     Running            0                4d16h   10.233.124.84    kubesphere2   <none>           <none>
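
The check that CoreDNS is missing from one node can be scripted against the same listing. This is a sketch with a hypothetical helper name, `nodes_without_coredns`; it takes the node names as arguments and assumes the pod rows mention the node name as a whole word. Note that setting `--replicas=3` does not by itself guarantee one pod per node, since placement is left to the scheduler.

```shell
# Print each node (given as arguments) that hosts no coredns pod.
# Expects `kubectl get pod -n kube-system -o wide --no-headers` output on stdin.
nodes_without_coredns() {
  pods=$(grep coredns)                    # keep only the coredns rows
  for node in "$@"; do
    printf '%s\n' "$pods" | grep -qw "$node" || printf '%s\n' "$node"
  done
}

# Usage (needs cluster access):
#   kubectl get pod -n kube-system -o wide --no-headers \
#     | nodes_without_coredns kubesphere1 kubesphere2 kubesphere3
```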

[root@kubesphere1 ~]# kubectl get deploy -A -o wide | grep dns
kube-system                    daemonset.apps/nodelocaldns             3         3         3       3            3           <none>                   10d   node-cache                      10.210.10.210/kubesphereio/k8s-dns-node-cache:1.15.12                                      k8s-app=nodelocaldns
kube-system                    deployment.apps/coredns                                   2/2     2            2           10d   coredns                                                        10.210.10.210/kubesphereio/coredns:1.8.6                                                                                                               k8s-app=kube-dns

[root@kubesphere1 ~]# kubectl scale -n kube-system deployment.apps/coredns --replicas=3
deployment.apps/coredns scaled
[root@kubesphere1 ~]# kubectl get pod -A -o wide | grep coredns
kube-system                    pod/coredns-86688d9f48-cxhcl                                       1/1     Running            0                4d16h   10.233.107.146   kubesphere1   <none>           <none>
kube-system                    pod/coredns-86688d9f48-jljnj                                       1/1     Running            0                3s      10.233.76.168    kubesphere3   <none>           <none>
kube-system                    pod/coredns-86688d9f48-xb6jf                                       1/1     Running            0                4d16h   10.233.124.84    kubesphere2   <none>           <none>

[root@kubesphere1 ~]# kubectl get pod -n kube-system
NAME                                         READY   STATUS    RESTARTS         AGE
calico-kube-controllers-7f5795c4bc-c8bhp     1/1     Running   6 (3d20h ago)    10d
calico-node-55cbp                            1/1     Running   1 (4m21s ago)    4d1h
calico-node-g7srh                            1/1     Running   4 (4d16h ago)    10d
calico-node-sczkt                            1/1     Running   0                4d16h
coredns-86688d9f48-cxhcl                     1/1     Running   0                4d16h
coredns-86688d9f48-jljnj                     1/1     Running   0                100s
coredns-86688d9f48-xb6jf                     1/1     Running   0                4d16h
kube-apiserver-kubesphere1                   1/1     Running   6 (3d20h ago)    10d
kube-apiserver-kubesphere2                   1/1     Running   5 (4d16h ago)    10d
kube-apiserver-kubesphere3                   1/1     Running   6 (4m21s ago)    10d
kube-controller-manager-kubesphere1          1/1     Running   8 (3d20h ago)    10d
kube-controller-manager-kubesphere2          1/1     Running   9 (30m ago)      10d
kube-controller-manager-kubesphere3          1/1     Running   8 (4m21s ago)    10d
kube-proxy-5t7qc                             1/1     Running   4 (4d16h ago)    10d
kube-proxy-bd45l                             1/1     Running   5 (4m21s ago)    10d
kube-proxy-m6w2q                             1/1     Running   4 (4d16h ago)    10d
kube-scheduler-kubesphere1                   1/1     Running   10 (29m ago)     10d
kube-scheduler-kubesphere2                   1/1     Running   6 (31m ago)      10d
kube-scheduler-kubesphere3                   1/1     Running   12 (4m21s ago)   10d
metrics-server-869f9c4cf9-9blvv              1/1     Running   8 (3d20h ago)    9d
nodelocaldns-4w296                           1/1     Running   5 (4m11s ago)    10d
nodelocaldns-dw7zt                           1/1     Running   4 (4d16h ago)    10d
nodelocaldns-v65h9                           1/1     Running   4 (4d16h ago)    10d
openebs-localpv-provisioner-c9f5cf88-tzfj8   1/1     Running   24 (30m ago)     10d
snapshot-controller-0                        1/1     Running   5 (4m21s ago)    10d