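The production environment has been running for a while. Today I updated a few things and the pipeline ran to completion, but when deploying to k8s the deployment failed. The console reports the following: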
Status analysis (Conditions)
Available
Status: True

Reason: MinimumReplicasAvailable

Message: Deployment has minimum availability.

ReplicaFailure
Status: True

Reason: FailedCreate

Message: Internal error occurred: failed calling webhook “logsidecar-injector.logging.kubesphere.io”: Post https://logsidecar-injector-admission.kubesphere-logging-system.svc:443/?timeout=30s: service “logsidecar-injector-admission” not found

Progressing
Status: True

Reason: NewReplicaSetCreated

Message: Created new replica set “apos-web-prod-58d44874d”

Inspecting the deployment (describe output) shows the following:
Name: apos-timer-prod
Namespace: apos
CreationTimestamp: Fri, 21 Aug 2020 15:49:43 +0800
Labels: app=apos-timer-prod
component=apos
tier=backend
Annotations: deployment.kubernetes.io/revision: 2
Selector: app=apos-timer-prod,tier=backend
Replicas: 2 desired | 0 updated | 0 total | 0 available | 3 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 15
RollingUpdateStrategy: 1 max unavailable, 1 max surge
Pod Template:
Labels: app=apos-timer-prod
component=apos
tier=backend
Annotations: kubesphere.io/restartedAt: 2020-08-21T09:39:11.937Z
Containers:
apos-timer-prod:
Image: harbor.devops.kubesphere.local:30280/library/apos-timer-prod:SNAPSHOT-product-17
Port: 7502/TCP
Host Port: 0/TCP
Limits:
cpu: 2
memory: 2Gi
Requests:
cpu: 100m
memory: 200Mi
Readiness: tcp-socket :7502 delay=5s timeout=10s period=10s #success=1 #failure=3
Environment:
SPRING_PROFILES_ACTIVE: prod
Mounts:
/etc/localtime from host-time (rw)
Volumes:
host-time:
Type: HostPath (bare host directory volume)
Path: /etc/localtime
HostPathType:

Conditions:
Type Status Reason


Available False MinimumReplicasUnavailable
ReplicaFailure True FailedCreate
Progressing True NewReplicaSetCreated
OldReplicaSets: apos-timer-prod-7dc997b6d6 (0/1 replicas created)
NewReplicaSet: apos-timer-prod-6ffd774f54 (0/2 replicas created)
Events:
Type Reason Age From Message


Normal ScalingReplicaSet 31s deployment-controller Scaled up replica set apos-timer-prod-6ffd774f54 to 1
Normal ScalingReplicaSet 31s deployment-controller Scaled down replica set apos-timer-prod-7dc997b6d6 to 1
Normal ScalingReplicaSet 31s deployment-controller Scaled up replica set apos-timer-prod-6ffd774f54 to 2

I checked all the components and they all look healthy. How do I fix this? Could someone please take a look? Thanks.


    xingye311 Which version is this? Run kubectl -n kubesphere-logging-system get svc and take a look.

    service “logsidecar-injector-admission” not found

      hongming It is KS 2.1.1. Running the command gives the following:
      [root@master1 ]# kubectl -n kubesphere-logging-system get svc
      NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
      elasticsearch-logging-data ClusterIP 10.68.70.17 <none> 9200/TCP 31d
      elasticsearch-logging-discovery ClusterIP None <none> 9300/TCP 31d
      logsidecar-injector ClusterIP 10.68.212.23 <none> 443/TCP 31d

        xingye311 This service was renamed from logsidecar-injector in KS 2.1.1 to logsidecar-injector-admission in KS 3.0.0. Why would your 2.1.1 environment be calling logsidecar-injector-admission?

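          xulai I installed 3.0 last month, but it had not officially gone GA and I needed the cluster urgently, so I uninstalled 3.0 and installed 2.1.1 instead. Does that have any impact?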

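            xulai After that install it ran stably for a month, and pipeline deployments had no problems. I only ran into this today when I needed to deploy an update.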

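            xingye311 Then 3.0.0 was probably not uninstalled cleanly. Check your MutatingWebhookConfiguration objects; there should be a 3.0.0 one named logsidecar-injector-admission-xx. This workload has on-disk log collection enabled, right?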

              xulai There really is one. What should I do about it now?
              [root@master1 ]# kubectl get MutatingWebhookConfiguration
              NAME CREATED AT
              istio-sidecar-injector 2020-07-21T07:19:44Z
              logsidecar-injector 2020-07-20T11:05:38Z
              logsidecar-injector-admission-mutate 2020-07-10T01:48:08Z
              mutating-webhook-configuration 2020-07-10T02:26:01Z
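
For anyone hitting the same error, the manual check in this thread (list the MutatingWebhookConfiguration objects, then see whether the Service each one calls still exists) could be sketched roughly as below. This is only a sketch, not a KubeSphere tool: the function names `list_webhook_services` and `find_dangling_webhooks` are made up for illustration, and it assumes kubectl is already configured for the cluster. Webhooks that are configured with a URL instead of a Service are skipped.

```shell
# Hypothetical helpers (names invented for this sketch).

# Print one line per MutatingWebhookConfiguration:
#   <config-name> <namespace>/<service> [<namespace>/<service> ...]
list_webhook_services() {
  kubectl get mutatingwebhookconfiguration -o jsonpath='{range .items[*]}{.metadata.name}{range .webhooks[*]}{" "}{.clientConfig.service.namespace}{"/"}{.clientConfig.service.name}{end}{"\n"}{end}'
}

# Report every webhook configuration whose target Service cannot be found.
find_dangling_webhooks() {
  list_webhook_services | while read -r config refs; do
    for ref in $refs; do
      ns=${ref%%/*}
      svc=${ref#*/}
      # An empty namespace or name means the webhook uses a URL; skip it.
      [ -n "$ns" ] && [ -n "$svc" ] || continue
      if ! kubectl -n "$ns" get svc "$svc" >/dev/null 2>&1 </dev/null; then
        echo "$config -> $ns/$svc (service not found)"
      fi
    done
  done
}

# Usage: find_dangling_webhooks
```

Any configuration it reports is only a candidate for cleanup; confirm it really is a leftover from an earlier install before deleting anything.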

                xingye311 Delete this 3.0.0 object: kubectl delete MutatingWebhookConfiguration logsidecar-injector-admission-mutate

                  xulai Great, it works now. Thank you so much. I was this close to being made the scapegoat.
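
Since this is a production cluster, it may be worth saving a copy of the object before deleting it, so it can be restored with `kubectl apply -f` if the deletion turns out to be a mistake. A minimal sketch of that idea follows; the function name `backup_and_delete_webhook` and the backup file name are assumptions made for this example, not part of the fix given in the thread.

```shell
# Hypothetical helper: save a MutatingWebhookConfiguration to
# <name>.backup.yaml in the current directory, then delete it.
# The delete is skipped if the backup step fails.
backup_and_delete_webhook() {
  kubectl get mutatingwebhookconfiguration "$1" -o yaml > "$1.backup.yaml" || return 1
  kubectl delete mutatingwebhookconfiguration "$1"
}

# Usage: backup_and_delete_webhook logsidecar-injector-admission-mutate
```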