Deploying an application to a KubeSphere cluster with helm install or kubectl apply fails with: failed install perform step: Internal error occurred: failed calling webhook "pilot.validation.istio.io": Post https://istio-galley.istio-system.svc:443/admitpilot?timeout=30s: x509: certificate has expired or is not yet valid
Solution:
Reissue the Istio certificate.
In the KubeSphere console, open the istio-system project, go to Configuration Center -> Secrets, delete istio-ca-secret, and then restart Citadel.
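For readers who prefer the command line over the console, the same two steps could be sketched with kubectl roughly as follows. This is only a sketch under assumptions: the function name reissue_citadel_ca is hypothetical, the Deployment name istio-citadel and a replica count of 1 are assumed from a default install, and the short pause between the two scale calls is a guess at how long the pod needs to go away.

```shell
# Sketch: delete the expired CA secret, then restart Citadel.
# Assumptions: Deployment "istio-citadel" with 1 replica in "istio-system".
reissue_citadel_ca() {
  # Remove the expired self-signed CA secret so Citadel has to recreate it.
  kubectl -n istio-system delete secret istio-ca-secret || return 1
  # Restart Citadel by scaling its Deployment down and back up.
  kubectl -n istio-system scale deployment istio-citadel --replicas=0 || return 1
  # Give the old pod a moment to terminate (duration is an assumption).
  sleep "${WAIT_SECONDS:-5}"
  kubectl -n istio-system scale deployment istio-citadel --replicas=1
}
```

Calling reissue_citadel_ca from a shell with cluster-admin access performs both steps; check the Citadel pod log afterwards to see whether it started cleanly.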
In practice, Citadel failed to restart and reported:
error Failed to create a self-signed Citadel (error: failed to create CA KeyCertBundle (failed to parse cert PEM: invalid PEM encoded certificate))
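The cause is that after istio-ca-secret is deleted, Istio recreates istio-ca-secret on its own, but the certificate data in it is invalid. Edit it by hand as follows: set ca-cert.pem to the contents of /etc/kubernetes/pki/ca.crt and ca-key.pem to the contents of /etc/kubernetes/pki/ca.key, and leave the other three keys, root-cert.pem, key.pem, and cert-chain.pem, empty. After the update, restarting Citadel succeeds.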
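The manual edit described above might be scripted as a single kubectl patch. This is a sketch rather than the exact procedure used here: the function name patch_istio_ca_secret is hypothetical, and it assumes it runs on a node where the two files under /etc/kubernetes/pki are readable. Values in a Secret's data field must be base64-encoded, so the files are encoded first.

```shell
# Sketch: fill istio-ca-secret with the cluster CA cert and key,
# leaving the other three keys empty.
# $1 = path to the CA certificate, $2 = path to the CA private key.
patch_istio_ca_secret() {
  [ -r "$1" ] && [ -r "$2" ] || { echo "cannot read $1 or $2" >&2; return 1; }
  # Secret data must be base64 on a single line, so strip line wraps.
  ca_cert_b64="$(base64 < "$1" | tr -d '\n')"
  ca_key_b64="$(base64 < "$2" | tr -d '\n')"
  patch="{\"data\":{\"ca-cert.pem\":\"$ca_cert_b64\",\"ca-key.pem\":\"$ca_key_b64\",\"root-cert.pem\":\"\",\"key.pem\":\"\",\"cert-chain.pem\":\"\"}}"
  kubectl -n istio-system patch secret istio-ca-secret --type merge -p "$patch"
}
```

A typical call would be patch_istio_ca_secret /etc/kubernetes/pki/ca.crt /etc/kubernetes/pki/ca.key, followed by a Citadel restart.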
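Even after Citadel restarts successfully, helm install still fails with the same error. At this point istio-galley also has to be restarted: scale all istio-galley replicas to 0 and then back to the original value. Once that restart succeeds, the problem is resolved.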
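The scale-down and scale-up of istio-galley could look something like the sketch below. It assumes the Deployment is named istio-galley (matching the service name in the webhook URL); the function name restart_galley and the pause between the two scale calls are illustrative choices, not something taken from the original steps.

```shell
# Sketch: restart Galley by scaling to 0 and back to its original replica count.
# Assumption: the Deployment is called "istio-galley" in "istio-system".
restart_galley() {
  # Remember the current replica count so it can be restored.
  replicas="$(kubectl -n istio-system get deployment istio-galley -o jsonpath='{.spec.replicas}')" || return 1
  kubectl -n istio-system scale deployment istio-galley --replicas=0 || return 1
  # Give the old pods a moment to terminate (duration is an assumption).
  sleep "${WAIT_SECONDS:-5}"
  kubectl -n istio-system scale deployment istio-galley --replicas="$replicas"
}
```

After restart_galley returns, the new Galley pods load the current certificate, and helm install can be retried.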
References:
https://github.com/istio/istio/issues/17718 describes a similar problem.
That issue says:
John has set Galley with short expiration time (3 hours). The Galley can reload the rotated TLS certs (which are local files mounted from secrets) in V1.1.12, but it seems the reloading doesn’t work after upgrading to 1.1.13 and 1.1.14.
So Galley’s existing X509 cert is expired and new workloads won’t be deployed due to the webhook authn failure.
My suspicion is that Galley webhooks’s reloading cert feature is broken starting on V1.1.13. To fully verify, John has restarted Galley (which temporarily solves this issue because the cert is reloaded) and we will check if Galley’s cert will get rotated in 3 hours.
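Note: in fairly early versions of Istio, the self-signed CA certificate is valid for only one year, so anyone who runs an old Istio version for more than a year will hit this problem. Once the certificate has expired, creating a new virtual service or pod fails because of the expired CA certificate. If Citadel restarts at that point, it reads the expired certificate and validates it, which produces the Citadel startup failure described above.
In the K8s cluster this CA certificate is stored in a secret named istio-ca-secret, and we can decode the certificate with openssl to check its validity period. The simpler way to handle this problem is to delete that Secret and restart Citadel. Citadel then goes through its logic for creating and validating a new self-signed CA certificate and refreshes the CA certificate.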
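One possible way to do the openssl check mentioned above is sketched here. The function name show_istio_ca_validity is hypothetical, and base64 -d assumes GNU coreutils or a recent macOS (older macOS versions use -D instead).

```shell
# Sketch: print the validity window (notBefore / notAfter) of the Istio CA cert.
show_istio_ca_validity() {
  # The dot inside the key name "ca-cert.pem" must be escaped in jsonpath.
  kubectl -n istio-system get secret istio-ca-secret -o jsonpath='{.data.ca-cert\.pem}' \
    | base64 -d \
    | openssl x509 -noout -dates
}
```

If the notAfter date printed by show_istio_ca_validity is in the past, the CA certificate has expired and needs to be reissued.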