prometheus-k8s-0 fails to start and has been stuck in Pending.

In the events I saw the warning: didn't find available persistent volumes to bind. There's no volume attached — what do I need to do to attach one? I haven't yet figured out what a "volume" is, so I don't know where to start.

kubectl describe pod prometheus-k8s-0 -n kubesphere-monitoring-system
.
.
.
Warning  FailedScheduling  25m   default-scheduler  0/1 nodes are available: 1 node(s) didn't find available persistent volumes to bind.
kubectl get sc

NAME            PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-storage   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  26h
kubectl get pvc -A

NAMESPACE                      NAME                                 STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS    AGE
kubesphere-monitoring-system   prometheus-k8s-db-prometheus-k8s-0   Pending                                      local-storage   26h

kubectl describe pvc prometheus-k8s-db-prometheus-k8s-0 -n kubesphere-monitoring-system
.
.
.
Normal  WaitForPodScheduled  52s (x142 over 35m)   persistentvolume-controller  waiting for pod prometheus-k8s-0 to be scheduled

Below is the pod's describe output:

Name:           prometheus-k8s-0
Namespace:      kubesphere-monitoring-system
Priority:       0
Node:           <none>
Labels:         app=prometheus
                controller-revision-hash=prometheus-k8s-85b55f7b4b
                prometheus=k8s
                statefulset.kubernetes.io/pod-name=prometheus-k8s-0
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  StatefulSet/prometheus-k8s
Containers:
  prometheus:
    Image:      prom/prometheus:v2.26.0
    Port:       9090/TCP
    Host Port:  0/TCP
    Args:
      --web.console.templates=/etc/prometheus/consoles
      --web.console.libraries=/etc/prometheus/console_libraries
      --config.file=/etc/prometheus/config_out/prometheus.env.yaml
      --storage.tsdb.path=/prometheus
      --storage.tsdb.retention.time=7d
      --web.enable-lifecycle
      --storage.tsdb.no-lockfile
      --query.max-concurrency=1000
      --web.route-prefix=/
    Limits:
      cpu:     4
      memory:  16Gi
    Requests:
      cpu:        200m
      memory:     400Mi
    Liveness:     http-get http://:web/-/healthy delay=0s timeout=3s period=5s #success=1 #failure=6
    Readiness:    http-get http://:web/-/ready delay=0s timeout=3s period=5s #success=1 #failure=120
    Environment:  <none>
    Mounts:
      /etc/prometheus/certs from tls-assets (ro)
      /etc/prometheus/config_out from config-out (ro)
      /etc/prometheus/rules/prometheus-k8s-rulefiles-0 from prometheus-k8s-rulefiles-0 (rw)
      /prometheus from prometheus-k8s-db (rw,path="prometheus-db")
      /var/run/secrets/kubernetes.io/serviceaccount from prometheus-k8s-token-k2xnk (ro)
  prometheus-config-reloader:
    Image:      kubesphere/prometheus-config-reloader:v0.42.1
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/prometheus-config-reloader
    Args:
      --log-format=logfmt
      --reload-url=http://localhost:9090/-/reload
      --config-file=/etc/prometheus/config/prometheus.yaml.gz
      --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
    Limits:
      memory:  25Mi
    Requests:
      memory:  25Mi
    Environment:
      POD_NAME:  prometheus-k8s-0 (v1:metadata.name)
    Mounts:
      /etc/prometheus/config from config (rw)
      /etc/prometheus/config_out from config-out (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from prometheus-k8s-token-k2xnk (ro)
  rules-configmap-reloader:
    Image:      jimmidyson/configmap-reload:v0.3.0
    Port:       <none>
    Host Port:  <none>
    Args:
      --webhook-url=http://localhost:9090/-/reload
      --volume-dir=/etc/prometheus/rules/prometheus-k8s-rulefiles-0
    Limits:
      memory:  25Mi
    Requests:
      memory:     25Mi
    Environment:  <none>
    Mounts:
      /etc/prometheus/rules/prometheus-k8s-rulefiles-0 from prometheus-k8s-rulefiles-0 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from prometheus-k8s-token-k2xnk (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  prometheus-k8s-db:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  prometheus-k8s-db-prometheus-k8s-0
    ReadOnly:   false
  config:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  prometheus-k8s
    Optional:    false
  tls-assets:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  prometheus-k8s-tls-assets
    Optional:    false
  config-out:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  prometheus-k8s-rulefiles-0:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      prometheus-k8s-rulefiles-0
    Optional:  false
  prometheus-k8s-token-k2xnk:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  prometheus-k8s-token-k2xnk
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     dedicated=monitoring:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  26h   default-scheduler  0/1 nodes are available: 1 node(s) didn't find available persistent volumes to bind.
  Warning  FailedScheduling  51m   default-scheduler  0/1 nodes are available: 1 node(s) didn't find available persistent volumes to bind.
  Warning  FailedScheduling  25m   default-scheduler  0/1 nodes are available: 1 node(s) didn't find available persistent volumes to bind.

You need to understand how Volume, PVC, PV, and Provisioner relate to one another and how the workflow runs.

Setting the SC's provisioner to kubernetes.io/no-provisioner means that no provisioner will automatically create PVs for PVCs (see the official Kubernetes local volume example). In other words, you must create a local-type PV by hand before the PVC can bind to it, which in turn lets the Pod use the PVC.
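
For reference, a StorageClass of this kind would look like the following minimal sketch (field values taken from the kubectl get sc output above):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner   # nothing will create PVs automatically
volumeBindingMode: WaitForFirstConsumer     # delay binding until a Pod using the PVC is scheduled
reclaimPolicy: Delete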

    kevendeng Thanks for the pointers. The volume part still feels complicated and I haven't worked through it yet — could you just write out the commands to run?

      liamhao
      How did you install your KubeSphere cluster — KubeKey or ks-installer? If you didn't create that storage class yourself, are you using the default openEBS?

        kevendeng k8s was initialized with kubeadm, and KubeSphere was installed with ks-installer. That storage class was generated from a yaml file copied from somewhere else (because I didn't understand volumes). If you can give me a workable mounting procedure, I can delete the current pv, pvc, and local-storage and follow your steps.

        Here is the pv and pvc information — I just read the docs and created a pv by hand.

        Also, where in KubeSphere can I see PV resource information?

          liamhao
          Run kubectl get pv hsj-pv -o yaml and let's see how your pv was created.

          Is the node name specified in the pv correct? Does the local path exist?

          I see your environment now. I'd suggest installing openEBS following the openEBS docs or a tutorial and setting it as your default storage class; then you won't need to create PVs by hand (a command sketch follows below).

          PV is a fairly low-level cluster resource; the current version of KS doesn't expose this concept to users. Version 3.2 will add an entry point for managing PVs.
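
          A minimal sketch of that route (the manifest URL and the openebs-hostpath class name are from the openEBS docs of that era — double-check both against the current openEBS documentation):

          # Install the openEBS operator (verify the URL against the openEBS docs)
          kubectl apply -f https://openebs.github.io/charts/openebs-operator.yaml

          # Mark the local hostpath class as the cluster default
          kubectl patch storageclass openebs-hostpath \
            -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'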

            kevendeng Thanks. Following the example in the official docs, the pv I created looks like this:

            apiVersion: v1
            kind: PersistentVolume
            metadata:
              name: hsj-pv
              labels:
                type: local
            spec:
              storageClassName: local-storage
              capacity:
                storage: 30Gi
              accessModes:
                - ReadWriteOnce
              volumeMode: Filesystem
              persistentVolumeReclaimPolicy: Delete
              local:
                path: /mnt/disks/ssd1
              nodeAffinity:
                required:
                  nodeSelectorTerms:
                  - matchExpressions:
                    - key: kubernetes.io/hostname
                      operator: In
                      values:
                      - haosijia

            I created a /mnt/disks/ssd1 directory on the node machine, then changed the machine name on the last line of the config.
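
            For reference, the binding can be checked like this (assuming the manifest above is saved as hsj-pv.yaml — the file name is just an example):

            kubectl apply -f hsj-pv.yaml
            # With WaitForFirstConsumer the PV stays Available until the consuming Pod is scheduled
            kubectl get pv hsj-pv
            kubectl get pvc -n kubesphere-monitoring-system   # should go from Pending to Bound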

            Now the container gets created, but the next problem has come up…

            MountVolume.SetUp failed for volume "prometheus-k8s-token-k2xnk" : failed to sync secret cache: timed out waiting for the condition

            Any advice would be much appreciated!

            kevendeng Ha — after waiting a while, it resolved itself... Thanks for the guidance; I'll go take a look at what openEBS is shortly.

              liamhao
              openEBS is a Provisioner that automatically creates local PVs for PVCs.

              MountVolume.SetUp failed for volume "prometheus-k8s-token-k2xnk" : failed to sync secret cache: timed out waiting for the condition

              When the kubelet creates a Pod, volumes such as Secrets and ConfigMaps are fetched from the apiserver over HTTP. This error means the kubelet's connection to kube-apiserver timed out, so it may well resolve itself if you wait a moment.
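
              If it doesn't recover on its own, two quick checks from the affected node — assuming a kubeadm-style setup with the apiserver on port 6443 and the kubelet running under systemd:

              # Can the node reach kube-apiserver at all?
              curl -k https://<apiserver-address>:6443/healthz

              # Look for repeated cache-sync timeouts in the kubelet log
              journalctl -u kubelet --since "30 min ago" | grep -i "failed to sync"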