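We have a host cluster and a managed member cluster m. Until now, cluster m could always be accessed directly through the host.
Today we suddenly found that the projects of cluster m can no longer be selected through the host: clicking cluster m returns a 401 and redirects to the login page.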

The ks-apiserver log on the managed cluster m shows the following errors:

E1218 20:40:10.012922       1 jwt_token.go:45] token not found in cache
E1218 20:40:10.012939       1 authentication.go:60] Unable to authenticate the request due to error: token not found in cache
E1218 20:40:35.047541       1 token.go:65] token not found in cache
E1218 20:40:35.047570       1 jwt_token.go:45] token not found in cache
E1218 20:40:35.047589       1 authentication.go:60] Unable to authenticate the request due to error: token not found in cache
E1218 20:40:45.012798       1 token.go:65] token not found in cache
E1218 20:40:45.012827       1 jwt_token.go:45] token not found in cache
E1218 20:40:45.012844       1 authentication.go:60] Unable to authenticate the request due to error: token not found in cache
E1218 20:40:46.044399       1 token.go:65] token not found in cache
E1218 20:40:46.044425       1 jwt_token.go:45] token not found in cache
E1218 20:40:46.044442       1 authentication.go:60] Unable to authenticate the request due to error: token not found in cache
E1218 20:40:55.019939       1 token.go:65] token not found in cache
E1218 20:40:55.019969       1 jwt_token.go:45] token not found in cache
E1218 20:40:55.019987       1 authentication.go:60] Unable to authenticate the request due to error: token not found in cache
E1218 20:41:05.013632       1 token.go:65] token not found in cache
E1218 20:41:05.013661       1 jwt_token.go:45] token not found in cache
E1218 20:41:05.013678       1 authentication.go:60] Unable to authenticate the request due to error: token not found in cache
E1218 20:41:10.045320       1 token.go:65] token not found in cache
E1218 20:41:10.045350       1 jwt_token.go:45] token not found in cache
E1218 20:41:10.045367       1 authentication.go:60] Unable to authenticate the request due to error: token not found in cache

    • Best reply, selected by qd19zzx

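    Update: I found the cause.
    On cluster m, accessTokenMaxAge in kubesphere-config needs to be set to 0; according to the source code, 0 means the token never expires. Someone had probably changed kubesphere-config to 10h by mistake, which is why logging in to m through the host failed.
    The problem is solved, but the logic is still not clear to me: isn't a 10h expiry long enough? Why does ks-apiserver keep reporting "token not found in cache" when cluster m is set to a 10h expiry?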

    apiVersion: v1
    data:
      kubesphere.yaml: |
        authentication:
          authenticateRateLimiterMaxTries: 10
          authenticateRateLimiterDuration: 10m0s
          loginHistoryRetentionPeriod: 24h
          maximumClockSkew: 10s
          multipleLogin: true
          kubectlImage: kubespheredev/kubectl:v1.0.0
          jwtSecret: "XC14lrpYMKbWC30ut1UYdd2KYV0kQpqs"
          oauthOptions:
            accessTokenMaxAge: 0
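To spot this kind of drift quickly, the value can be pulled out of a saved copy of the ConfigMap. The following is only a minimal sketch: the helper names `access_token_max_age` and `is_member_setting_ok` are hypothetical, the input is assumed to be the text printed by `kubectl -n kubesphere-system get cm kubesphere-config -o yaml`, and the "must be 0 on a member cluster" rule is taken from this thread rather than from any official documentation.

```python
import re
from typing import Optional


def access_token_max_age(config_text: str) -> Optional[str]:
    """Return the raw accessTokenMaxAge value from kubesphere.yaml text, or None if absent."""
    match = re.search(r"^\s*accessTokenMaxAge:\s*(\S+)\s*$", config_text, re.MULTILINE)
    if match is None:
        return None
    # Drop optional quotes around the value.
    return match.group(1).strip("\"'")


def is_member_setting_ok(config_text: str) -> bool:
    """Assumption from this thread: a member cluster should have accessTokenMaxAge set to 0."""
    # Literal comparison only; other spellings of zero are not handled here.
    return access_token_max_age(config_text) == "0"
```

For the configuration shown above `is_member_setting_ok` returns `True`; for a copy that was changed to `10h` it returns `False`.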

The ks-apiserver on the host cluster shows token expiry errors:

2020/12/18 19:50:02 http: proxy error: context canceled
2020/12/18 20:02:21 http: proxy error: context canceled
2020/12/18 20:12:29 http: proxy error: context canceled
2020/12/18 20:12:30 http: proxy error: context canceled
2020/12/18 20:12:30 http: proxy error: context canceled
2020/12/18 20:12:37 http: proxy error: context canceled
E1218 20:25:08.219883       1 ldap_provider.go:101] LDAP Result Code 49 "Invalid Credentials":
E1218 20:25:08.219999       1 authenticator.go:108] LDAP Result Code 49 "Invalid Credentials":
E1218 20:25:08.220015       1 handler.go:275] incorrect password
E1218 20:25:08.222138       1 login_recoder.go:75] LoginRecord.iam.kubesphere.io "A0041454-7tjl7" is invalid: [metadata.generateName: Invalid value: "A0041454-": a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'), metadata.name: Invalid value: "A0041454-7tjl7": a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')]
E1218 20:25:08.222163       1 handler.go:279] LoginRecord.iam.kubesphere.io "A0041454-7tjl7" is invalid: [metadata.generateName: Invalid value: "A0041454-": a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'), metadata.name: Invalid value: "A0041454-7tjl7": a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')]
I1218 20:25:08.222186       1 apiserver.go:539] 100.77.248.93 - "POST /oauth/token HTTP/1.1" 500 697 6ms
E1218 20:44:16.565604       1 ldap_provider.go:101] LDAP Result Code 49 "Invalid Credentials":
E1218 20:44:16.565727       1 authenticator.go:108] LDAP Result Code 49 "Invalid Credentials":
E1218 20:44:16.565743       1 handler.go:275] incorrect password
I1218 20:44:16.569795       1 apiserver.go:539] 100.77.248.93 - "POST /oauth/token HTTP/1.1" 401 32 31ms
2020/12/18 20:45:17 http: proxy error: context canceled
E1218 20:48:05.008286       1 jwt.go:51] token is expired by 5s
E1218 20:48:05.008315       1 token.go:57] token is expired by 5s
E1218 20:48:05.008326       1 jwt_token.go:45] token is expired by 5s
E1218 20:48:05.008338       1 authentication.go:60] Unable to authenticate the request due to error: token is expired by 5s
E1218 20:48:05.062430       1 jwt.go:51] token is expired by 5s
E1218 20:48:05.062457       1 token.go:57] token is expired by 5s
E1218 20:48:05.062472       1 jwt_token.go:45] token is expired by 5s
E1218 20:48:05.062484       1 authentication.go:60] Unable to authenticate the request due to error: token is expired by 5s

Run the following on both the host cluster and the member cluster:
kubectl -n kubesphere-system get cm kubesphere-config -o yaml | grep -v "apiVersion" | grep jwtSecret
and check whether the two secrets are identical.
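If the two ConfigMaps have been saved to files, the comparison can also be scripted instead of done by eye. This is a rough sketch under stated assumptions: `extract_jwt_secret` and `secrets_match` are hypothetical helpers, and each argument is assumed to hold the YAML text printed by the kubectl command above for one cluster.

```python
import re
from typing import Optional


def extract_jwt_secret(config_text: str) -> Optional[str]:
    """Return the jwtSecret value found in kubesphere.yaml text, or None if missing."""
    match = re.search(r"^\s*jwtSecret:\s*(.+?)\s*$", config_text, re.MULTILINE)
    if match is None:
        return None
    # The value is usually quoted; strip the surrounding quotes.
    return match.group(1).strip("\"'")


def secrets_match(host_text: str, member_text: str) -> bool:
    """True only if both configs contain a jwtSecret and the two values are equal."""
    host_secret = extract_jwt_secret(host_text)
    member_secret = extract_jwt_secret(member_text)
    return host_secret is not None and host_secret == member_secret
```

A result of `False` means either that the secrets differ or that one of the files has no jwtSecret line at all, so it is worth looking at both files in that case.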

    kubectl get loginrecords.iam.kubesphere.io
    On cluster m the result is empty; nothing has been synced over. The other member clusters all have data.

      qd19zzx

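      1. Tokens are issued by the host cluster, and the host cluster needs to cache the tokens it issues so that it can manage (revoke) them.
      2. The member cluster only needs to verify that the token's signature is correct.
      3. After the cluster role is set, ks-installer modifies the accessTokenMaxAge field on the member cluster.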
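
The three points above can be illustrated with a small toy model. This is not KubeSphere's code: the `Cluster` class, its `check_cache` flag and the HS256-style signing helpers are all made up for illustration. The one assumption carried over from this thread is that a member cluster with a non-zero accessTokenMaxAge also looks the token up in its own cache, which never contains tokens issued by the host; that would explain the repeated "token not found in cache" lines even though the signature itself is valid.

```python
import base64
import hashlib
import hmac
import json


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(payload: dict, secret: str) -> str:
    """Build a JWT-like token signed with HMAC-SHA256 and the shared secret."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps(payload).encode())
    sig = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64(sig)}"


def verify_signature(token: str, secret: str) -> dict:
    """Check the signature only and return the claims."""
    header, body, sig = token.split(".")
    expected = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _unb64(sig)):
        raise ValueError("invalid signature")
    return json.loads(_unb64(body))


class Cluster:
    """Toy cluster: check_cache=True stands in for a non-zero accessTokenMaxAge (assumption)."""

    def __init__(self, secret: str, check_cache: bool):
        self.secret = secret
        self.check_cache = check_cache
        self.cache = set()

    def issue(self, username: str) -> str:
        # Only the issuing cluster remembers the token, so it can revoke it later.
        token = sign({"username": username}, self.secret)
        self.cache.add(token)
        return token

    def revoke(self, token: str) -> None:
        self.cache.discard(token)

    def authenticate(self, token: str) -> dict:
        claims = verify_signature(token, self.secret)
        if self.check_cache and token not in self.cache:
            raise LookupError("token not found in cache")
        return claims
```

In this model a host created with `check_cache=True` and a member created with `check_cache=False` both accept a token issued by the host, while a member created with `check_cache=True` rejects the same token with "token not found in cache", regardless of how long the configured expiry is.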