guoh1988

error: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request

This error means one of the registered apiservices is broken. Run kubectl get apiservice and confirm that the metrics service is healthy:

root@master1:~# kubectl get apiservice | grep metrics
v1beta1.metrics.k8s.io                 kube-system/metrics-server   True        147d

If the third column is not True, the metrics service has a problem; Kubernetes API discovery requires every apiservice to be in the True (Available) state, otherwise you get the "unable to retrieve the complete list of server APIs" error above.
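
If it is not True, the Available condition on the APIService object usually says why discovery is failing. For example:

kubectl get apiservice v1beta1.metrics.k8s.io -o yaml
# or just the Available condition message:
kubectl get apiservice v1beta1.metrics.k8s.io \
  -o jsonpath='{.status.conditions[?(@.type=="Available")].message}'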

    Jeff Thanks for the reply. On my side kubectl get apiservice | grep metrics shows a failed state:

    v1beta1.metrics.k8s.io                 kube-system/metrics-server   False (FailedDiscoveryCheck)   22d

    As my logs above show, a new metrics-server pod is already up, but kube-apiserver keeps trying to connect to the pod on the lost node, 10.244.235.186. Once I kill kube-apiserver it connects correctly to 10.244.180.55.
    We hit a similar error with UDP in the past, but there clearing the connection cache with conntrack -D was enough to establish new connections. This time, even after conntrack -D, it still points back to the wrong pod IP 10.244.235.186, as if the old endpoint were remembered somewhere.
    https://github.com/kubernetes/kubernetes/issues/59368?from=singlemessage

    metrics-server-8b7689b66-xm6mf            1/1     Running    0          36s     10.244.180.55    master2   <none>           <none>
    metrics-server-8b7689b66-z9hk9            1/1     Unknown    0          3m58s   10.244.235.186  worker1   <none>           <none>
    conntrack -L | grep 10.101.186.48
    tcp      6 278 ESTABLISHED src=10.101.186.48 dst=10.101.186.48 sport=45842 dport=443 src=10.244.235.186 dst=192.168.210.71 sport=443 dport=19158 [ASSURED] mark=0 use=1
    tcp      6 298 ESTABLISHED src=10.101.186.48 dst=10.101.186.48 sport=45820 dport=443 src=10.244.235.186 dst=192.168.210.71 sport=443 dport=15276 [ASSURED] mark=0 use=2
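
    For reference, instead of flushing the whole table, the stale entries could also be deleted selectively (10.244.235.186 being the stale backend here), though as noted above clearing conntrack did not help in this case:

    # delete conntrack entries whose reply source is the stale pod IP
    conntrack -D -r 10.244.235.186
    # or match on the metrics-server service IP and port
    conntrack -D -d 10.101.186.48 -p tcp --dport 443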

    I tried the following approach and it works: changing the kernel setting net.ipv4.tcp_retries2 from 15 to 1. In the power-off scenario the connection is released after about 1 minute and traffic is then directed to 10.244.180.55.

    https://blog.csdn.net/gao1738/article/details/42839697
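
    For reference, the kernel change described above can be applied like this (it affects every TCP connection on the node, so treat it as a workaround rather than a tuning recommendation):

    # default is 15, which corresponds to roughly 15 minutes before a dead peer is given up on
    sysctl -w net.ipv4.tcp_retries2=1
    # persist across reboots
    echo 'net.ipv4.tcp_retries2 = 1' > /etc/sysctl.d/99-tcp-retries2.conf
    sysctl --system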

    • Jeff replied to this post

      guoh1988 If this is a production environment I wouldn't recommend that setting. Instead, check whether the ipvs rules on the master node are failing to get updated.

        Jeff I am already running ipvs mode; I'm not sure whether the ipvs rules you mention refer to kube-proxy's ipvs mode. The UDP bug above is different from this case: with UDP, clearing the connection cache with conntrack -D was enough to establish new connections, but for kube-apiserver connecting to metrics-server, even after clearing with conntrack -D the regenerated connections are still wrong, so that doesn't help. The only things that work are changing the kernel's net.ipv4.tcp_retries2 or restarting kube-apiserver.
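
        (For completeness: restarting kube-apiserver here means restarting the static pod; a sketch assuming the default kubeadm manifest layout:)

        # force the kube-apiserver static pod to restart on the control-plane node
        # (assumes /etc/kubernetes/manifests; adjust the path for your setup)
        mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
        sleep 10
        mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/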

        kubectl logs -f kube-proxy-7wkbr  -n kube-system
        I1123 01:40:45.112593       1 node.go:135] Successfully retrieved node IP: 192.168.210.71
        I1123 01:40:45.112639       1 server_others.go:177] Using ipvs Proxier.
        W1123 01:40:45.112929       1 proxier.go:415] IPVS scheduler not specified, use rr by default
        I1123 01:40:45.113153       1 server.go:529] Version: v1.16.10
        I1123 01:40:45.113560       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
        I1123 01:40:45.113585       1 conntrack.go:52] Setting nf_conntrack_max to 131072
        I1123 01:40:45.113628       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
        I1123 01:40:45.113650       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
        I1123 01:40:45.115833       1 config.go:131] Starting endpoints config controller
        I1123 01:40:45.115878       1 config.go:313] Starting service config controller
        I1123 01:40:45.115894       1 shared_informer.go:197] Waiting for caches to sync for endpoints config
        I1123 01:40:45.115897       1 shared_informer.go:197] Waiting for caches to sync for service config
        I1123 01:40:45.216030       1 shared_informer.go:204] Caches are synced for endpoints config 
        I1123 01:40:45.216033       1 shared_informer.go:204] Caches are synced for service config 
        • Jeff replied to this post

          guoh1988 Yes. When a pod has been restarted but the apiservice still does not recover, check the ipvs rules on the master node: ipvsadm -Ln to see whether the pod IP behind the metrics-server service IP is correct, and ipvsadm -lnc to see whether there are many connections stuck in a SYN/wait state.

          Thanks for your reply.
          In the normal case:

          kubectl get pods -n kube-system -o wide
          NAME                                      READY   STATUS    RESTARTS   AGE    IP               NODE      NOMINATED NODE   READINESS GATES
          metrics-server-8b7689b66-rvrrl            1/1     Running   0          8m7s   10.244.235.137   worker1   <none>           <none>
          
          ipvsadm -Ln
          TCP  10.101.186.48:443 rr
            -> 10.244.235.137:443           Masq    1      2          0   
          
          ipvsadm -lnc  | grep 10.101.186.48
          TCP 14:37  ESTABLISHED 10.101.186.48:56312 10.101.186.48:443  10.244.235.137:443
          TCP 14:55  ESTABLISHED 10.101.186.48:56328 10.101.186.48:443  10.244.235.137:443

          In the abnormal case:

          ipvsadm -Ln
          TCP  10.101.186.48:443 rr
            -> 10.244.180.39:443            Masq    1      0          0         
            -> 10.244.235.137:443           Masq    0      2          0 

          One connection switches to CLOSE_WAIT almost immediately, while the other stays in ESTABLISHED; judging from the timers above, this matches the 15-minute wait.

          ipvsadm -lnc  | grep 10.101.186.48
          TCP 14:58  ESTABLISHED 10.101.186.48:56312 10.101.186.48:443  10.244.235.137:443
          TCP 00:18  CLOSE_WAIT  10.101.186.48:56328 10.101.186.48:443  10.244.235.137:443

          And one entry just keeps waiting:

          ipvsadm -lnc  | grep 10.101.186.48
          TCP 14:44  ESTABLISHED 10.101.186.48:56312 10.101.186.48:443  10.244.235.137:443
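
          For what it's worth, 15 minutes also happens to be the default IPVS timeout for established TCP connection entries (900s), which can be checked with:

          # prints "Timeout (tcp tcpfin udp)"; the defaults are typically 900 120 300
          ipvsadm -L --timeout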
          • Jeff replied to this post

            guoh1988

            ipvsadm -Ln
            TCP  10.101.186.48:443 rr
              -> 10.244.180.39:443            Masq    1      0          0         
              -> 10.244.235.137:443           Masq    0      2          0

            These rules are managed by kube-proxy. Check your environment to see whether kube-proxy is failing to refresh them promptly.
            ipvsadm -Ln
            TCP  10.101.186.48:443 rr
              -> 10.244.180.39:443            Masq    1      0          0
              -> 10.244.235.137:443           Masq    0      2          0

            The weight is already 0 here, so it shouldn't be selected by the scheduler anymore. I just tried it again: when the problem occurs, killing kube-apiserver always makes it connect to 10.244.180.39:443.
            Below is my kube-proxy configuration:


            apiVersion: kubeproxy.config.k8s.io/v1alpha1
            bindAddress: 0.0.0.0
            clientConnection:
              acceptContentTypes: ""
              burst: 10
              contentType: application/vnd.kubernetes.protobuf
              kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
              qps: 5
            clusterCIDR: 10.244.0.0/16
            configSyncPeriod: 15m0s
            conntrack:
              maxPerCore: 32768
              min: 131072
              tcpCloseWaitTimeout: 1h0m0s
              tcpEstablishedTimeout: 24h0m0s
            enableProfiling: false
            healthzBindAddress: 0.0.0.0:10256
            hostnameOverride: ""
            iptables:
              masqueradeAll: false
              masqueradeBit: 14
              minSyncPeriod: 0s
              syncPeriod: 30s
            ipvs:
              excludeCIDRs: null
              minSyncPeriod: 0s
              scheduler: ""
              strictARP: true
              syncPeriod: 30s
            kind: KubeProxyConfiguration
            metricsBindAddress: 127.0.0.1:10249
            mode: ipvs
            nodePortAddresses: null
            oomScoreAdj: -999
            portRange: ""
            udpIdleTimeout: 250ms
            winkernel:
              enableDSR: false
              networkName: ""
              sourceVip: ""
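
            (For reference, in a kubeadm-based cluster this configuration lives in the kube-proxy ConfigMap, and the daemonset has to be restarted for changes to take effect, e.g.:)

            kubectl -n kube-system edit configmap kube-proxy
            kubectl -n kube-system rollout restart daemonset kube-proxy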

            By the way, do you have a k8s + KubeSphere cluster at hand? This is easy to reproduce: find the node the metrics-server pod is running on and cut its power directly. I'm using VMware here.

            • Jeff replied to this post

              guoh1988 Powering a node off directly can indeed trigger this. We've run into similar issues before: when a node goes down, its pods need to be evicted and rebuilt elsewhere, and there is a default 5-minute grace period before that happens. For metrics-server you can shorten that period.
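
              That grace period comes from the default NoExecute tolerations (tolerationSeconds: 300) added to every pod; a sketch of shortening it in the metrics-server Deployment's pod template (spec.template.spec), using 30s as an example value:

              tolerations:
              - key: node.kubernetes.io/unreachable
                operator: Exists
                effect: NoExecute
                tolerationSeconds: 30
              - key: node.kubernetes.io/not-ready
                operator: Exists
                effect: NoExecute
                tolerationSeconds: 30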

                Jeff I already shortened it; I set it to 30s. The 5 minutes is the time it takes to rebuild the pod from the lost node, and in my case the metrics-server pod has already been recreated, so the problem isn't that the pod doesn't exist. This wait takes 15 minutes.

                10 days later

                guoh1988 At this point it looks like a Kubernetes issue; you could search the k8s issues to see whether anything related has been reported.