EeeokeeK零S
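Problem description:
Log data is missing for two 24-hour windows: March 13 08:00 to March 14 08:00, and March 16 08:00 to March 17 08:00. No logs at all were recorded for these two windows. The log data is still visible if I go into the application container directly and download the container logs.
Software versions:
- kubernetes v1.18.6
- kubesphere v3.0.0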
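Check the fluent-bit logs for errors, and describe the fluent-bit pod to see whether anything is abnormal.
Then exec into the ES pod and run
curl -XGET 'elasticsearch-logging-data.kubesphere-logging-system.svc:9200/_cat/indices?v&pretty'
to see whether the log indices for those two days were created.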
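The three checks suggested here can be sketched as small shell helpers. The namespace and the Elasticsearch service address are taken from this thread. The label selector `app.kubernetes.io/name=fluent-bit` is an assumption based on the pod labels shown in the `kubectl describe` output further down. The Elasticsearch pod name is not given in the thread, so it is left as an argument.

```shell
#!/bin/sh
# Namespace and service address as quoted in this thread.
NS=kubesphere-logging-system
ES_URL=elasticsearch-logging-data.kubesphere-logging-system.svc:9200

# Print the last 100 log lines of each fluent-bit pod.
# The label selector is an assumption based on the pod labels in this thread.
fluentbit_logs() {
    kubectl logs -n "$NS" -l app.kubernetes.io/name=fluent-bit --tail=100
}

# Describe all fluent-bit pods (restart counts, last termination reason).
fluentbit_describe() {
    kubectl describe pod -n "$NS" -l app.kubernetes.io/name=fluent-bit
}

# List the indices from inside an Elasticsearch pod.
# $1 is the Elasticsearch pod name, which is not given in this thread.
es_indices() {
    kubectl exec -n "$NS" "$1" -- curl -s -XGET "$ES_URL/_cat/indices?v"
}
```

Usage would look like `fluentbit_logs`, `fluentbit_describe`, and `es_indices <es-pod-name>`, run from a machine where `kubectl` can reach the cluster.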
wanjunlei
Thank you for your reply. I checked today, and the results are below.
curl -XGET 'elasticsearch-logging-data.kubesphere-logging-system.svc:9200/_cat/indices?v&pretty'
1. Many of the Elasticsearch indices show a health status of red. Is this related?
health status index                           uuid                   pri rep docs.count docs.deleted store.size pri.store.size
red    open   ks-logstash-log-2021.03.16      Y0FjUrFiRiWy9cAfWUHpgg 5   1
red    open   ks-logstash-log-2021.03.14      2XSxytTSSuKPdBr0FL6W4Q 5   1   5922651    0            3.3gb      1.6gb
green  open   ks-logstash-events-2021.03.14   yF7anY2tTjCfzkRBk8Iphw 5   1   29580      0            28.7mb     14.1mb
green  open   ks-logstash-log-2021.03.12      bVno4bDmTLuvAD1bPBOmaQ 5   1   18434271   0            6.7gb      3.3gb
green  open   ks-logstash-events-2021.03.11   WzbEsVJfRDiI3pojRPwOUg 5   1   30816      0            31.7mb     15.8mb
green  open   ks-logstash-auditing-2021.03.12 KLC2HckyQ0WZYyZjZ6GsOA 5   1   692        0            1.4mb      718kb
green  open   ks-logstash-auditing-2021.03.17 zawsVkaMQW6ViKV7Wqf7Hg 5   1   549        0            1.1mb      612.6kb
red    open   ks-logstash-log-2021.03.17      kgBpEq8OReOmi9YV3fvu6A 5   1
green  open   ks-logstash-auditing-2021.03.16 Uc3xeiJfRqiXxucTejRlfQ 5   1   549        0            1.1mb      602.7kb
green  open   ks-logstash-auditing-2021.03.14 XtEXtB5_Rr-uJVb_Inr8CQ 5   1   487        0            1.4mb      811.4kb
green  open   ks-logstash-auditing-2021.03.13 yoXhgBL1RZeER5iJgcgDaw 5   1   520        0            1.4mb      674.8kb
green  open   ks-logstash-events-2021.03.12   J8PwiPiBRHeXTHvfoBkp7Q 5   1   31213      0            32.6mb     16.3mb
red    open   ks-logstash-log-2021.03.15      GqcUifwkTnu8Lr5aawCf4w 5   1   3337233    0            2.4gb      1.2gb
green  open   ks-logstash-auditing-2021.03.11 2XT_F1u9TDOR1cu7g0fVkA 5   1   631        0            1.4mb      758.6kb
green  open   ks-logstash-events-2021.03.15   szTux1M6TUWRU12Yi3tkaA 5   1   30018      0            30.1mb     15mb
green  open   ks-logstash-events-2021.03.17   mGEiBV5hRIy7lPB2qqFgqA 5   1   31547      0            33mb       16.5mb
red    open   ks-logstash-log-2021.03.13      H_ALKzDWQbC5dQZ1iIo31g 5   1
green  open   ks-logstash-events-2021.03.18   CLzOOeKJQYWzWlOmIx5iHQ 5   1   1985       0            6.9mb      3.4mb
green  open   ks-logstash-log-2021.03.18      sEjFbVuCSuGK7lON-d5SbA 5   1   421109     0            242.5mb    121.2mb
green  open   ks-logstash-events-2021.03.13   EVXRZr_tRY-1R92VcskvHw 5   1   29626      0            29.1mb     14.5mb
green  open   ks-logstash-auditing-2021.03.15 80McbVbpQ5CDxwndVKACEQ 5   1   530        0            1mb        554.2kb
green  open   ks-logstash-auditing-2021.03.18 -dAHB1w_ROOIW5OM18XOPw 5   1   42         0            1mb        590.5kb
red    open   ks-logstash-log-2021.03.11      dBBTnEuuQs-rFmFKGAXlWQ 5   1   17988973   0            6.5gb      3.2gb
green  open   ks-logstash-events-2021.03.16   v3W_GV_nS--SPAQcV9Emhw 5   1   30963      0            32mb       16mb
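In this listing, the red indices with no document counts are ks-logstash-log-2021.03.13, ks-logstash-log-2021.03.16 and ks-logstash-log-2021.03.17. That lines up with the missing windows if the index dates are in UTC (likely, but an assumption here), since 08:00 in UTC+8 is midnight UTC. A red index has at least one primary shard that is not allocated. Below is a minimal sketch for picking out the red indices and asking Elasticsearch why a shard is unassigned. The helper names are made up for illustration; `_cluster/allocation/explain` is a standard Elasticsearch API, and `_cat/indices` also accepts a `health=red` filter.

```shell
#!/bin/sh
# Service address as quoted in this thread.
ES_URL=elasticsearch-logging-data.kubesphere-logging-system.svc:9200

# Read `_cat/indices?v` output on stdin and print the names of red indices.
red_indices() {
    awk '$1 == "red" { print $3 }'
}

# Ask Elasticsearch to explain the first unassigned shard it finds.
# Run this where the service name resolves, for example inside the ES pod.
explain_unassigned() {
    curl -s -XGET "$ES_URL/_cluster/allocation/explain?pretty"
}
```

For example, `curl -s "$ES_URL/_cat/indices?v" | red_indices` prints one red index name per line.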
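The text formatting is a bit messy, so I am adding a screenshot as well.
2. Judging from the logs, fluentbit-operator appears to have been OOM-killed. Is that the cause? If so, how should I adjust it?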
kubectl describe pod -n kubesphere-logging-system fluentbit-operator-855d4b977d-xwjxs
Name: fluentbit-operator-855d4b977d-xwjxs
Namespace: kubesphere-logging-system
Priority: 0
Node: k8s-node2/192.168.0.175
Start Time: Fri, 25 Sep 2020 02:30:14 +0800
Labels: app.kubernetes.io/component=operator
app.kubernetes.io/name=fluentbit-operator
pod-template-hash=855d4b977d
Annotations: cni.projectcalico.org/podIP: 10.233.76.223/32
cni.projectcalico.org/podIPs: 10.233.76.223/32
Status: Running
IP: 10.233.76.223
IPs:
IP: 10.233.76.223
Controlled By: ReplicaSet/fluentbit-operator-855d4b977d
Init Containers:
setenv:
Container ID: docker://a121b6a47de7a842a45be2a45792ecebcb5b3bf487a809116e8edfec9c487417
Image: docker:19.03
Image ID: docker-pullable://docker@sha256:57ddfc5b9f4f89f1598440cd1d6d97b87532b0bce1315e7880ae6843e3583529
Port: <none>
Host Port: <none>
Command:
/bin/sh
-c
set -ex; echo DOCKER_ROOT_DIR=$(docker info -f {{.DockerRootDir}}) > /fluentbit-operator/fluent-bit.env
State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 04 Jan 2021 07:41:53 +0800
Finished: Mon, 04 Jan 2021 07:42:00 +0800
Ready: True
Restart Count: 4
Environment: <none>
Mounts:
/fluentbit-operator from env (rw)
/var/run/docker.sock from dockersock (ro)
/var/run/secrets/kubernetes.io/serviceaccount from fluentbit-operator-token-kslrx (ro)
Containers:
fluentbit-operator:
Container ID: docker://8bcad2f7d504bb24aca63b0613f960756af768e67d24ea56f27361a8532f674a
Image: kubesphere/fluentbit-operator:v0.2.0
Image ID: docker-pullable://kubesphere/fluentbit-operator@sha256:914864f8d56931274554432d6e6674799c05284aa8c88ff72ae1a9a04a4dc873
Port: <none>
Host Port: <none>
State: Running
Started: Wed, 17 Mar 2021 11:23:43 +0800
Last State: Terminated
Reason: OOMKilled
Exit Code: 137
Started: Fri, 12 Mar 2021 16:17:26 +0800
Finished: Wed, 17 Mar 2021 11:23:42 +0800
Ready: True
Restart Count: 74
Limits:
cpu: 100m
memory: 30Mi
Requests:
cpu: 100m
memory: 20Mi
Environment: <none>
Mounts:
/fluentbit-operator from env (rw)
/var/run/secrets/kubernetes.io/serviceaccount from fluentbit-operator-token-kslrx (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
env:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
dockersock:
Type: HostPath (bare host directory volume)
Path: /var/run/docker.sock
HostPathType:
fluentbit-operator-token-kslrx:
Type: Secret (a volume populated by a Secret)
SecretName: fluentbit-operator-token-kslrx
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
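The output above shows the operator container was last terminated with reason OOMKilled under a 30Mi memory limit. One way to raise that limit is `kubectl set resources`, sketched below. The Deployment name `fluentbit-operator` is inferred from the ReplicaSet name and is an assumption, and the default of 100Mi is only an illustrative value, not a recommendation from this thread. If the Deployment is managed by an installer, a manual change may not persist.

```shell
#!/bin/sh
# Raise the memory limit of the fluentbit-operator container.
# $1 is the new limit (illustrative default: 100Mi).
# The Deployment name is an assumption inferred from the ReplicaSet name.
raise_operator_memory() {
    limit=${1:-100Mi}
    kubectl set resources deployment fluentbit-operator \
        -n kubesphere-logging-system \
        -c fluentbit-operator \
        --limits=memory="$limit"
}
```

Calling `raise_operator_memory 200Mi` triggers a rollout of the operator pod with the new limit.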
kubectl describe pod -n kubesphere-logging-system fluent-bit-9dbnr
Name: fluent-bit-9dbnr
Namespace: kubesphere-logging-system
Priority: 0
Node: k8s-master1/192.168.0.171
Start Time: Tue, 08 Sep 2020 10:01:13 +0800
Labels: app.kubernetes.io/name=fluent-bit
controller-revision-hash=d8f95598f
pod-template-generation=2
Annotations: cni.projectcalico.org/podIP: 10.233.68.104/32
cni.projectcalico.org/podIPs: 10.233.68.104/32
Status: Running
IP: 10.233.68.104
IPs:
IP: 10.233.68.104
Controlled By: DaemonSet/fluent-bit
Containers:
fluent-bit:
Container ID: docker://1356ca4635d396ee9251a5992991e0b8337e9d3969607e9e899678f3eba5d5e9
Image: kubesphere/fluent-bit:v1.4.6
Image ID: docker-pullable://kubesphere/fluent-bit@sha256:1007b7cb7090435bf5b5d04f07cf6982d841597218fb67e291b2606c4e25b3e2
Port: 2020/TCP
Host Port: 0/TCP
State: Running
Started: Thu, 18 Mar 2021 03:00:01 +0800
Last State: Terminated
Reason: Error
Exit Code: 2
Started: Wed, 17 Mar 2021 03:00:02 +0800
Finished: Thu, 18 Mar 2021 03:00:01 +0800
Ready: True
Restart Count: 72
Environment: <none>
Mounts:
/fluent-bit/config from config (ro)
/fluent-bit/tail from positions (rw)
/var/lib/docker/containers from varlibcontainers (ro)
/var/log/ from varlogs (ro)
/var/run/secrets/kubernetes.io/serviceaccount from fluent-bit-token-wc9x4 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
varlibcontainers:
Type: HostPath (bare host directory volume)
Path: /var/lib/docker/containers
HostPathType:
config:
Type: Secret (a volume populated by a Secret)
SecretName: fluent-bit-config
Optional: false
varlogs:
Type: HostPath (bare host directory volume)
Path: /var/log
HostPathType:
positions:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
fluent-bit-token-wc9x4:
Type: Secret (a volume populated by a Secret)
SecretName: fluent-bit-token-wc9x4
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations:
node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/pid-pressure:NoSchedule
node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unschedulable:NoSchedule
Events: <none>
The ES indices are in an abnormal state, which is why logs cannot be written. Check whether the ES storage is full.
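A rough way to check disk usage per Elasticsearch data node is the `_cat/allocation` API, which reports a `disk.percent` column. The sketch below flags nodes at or above a threshold. The helper names are hypothetical, and the default threshold of 85 is chosen because 85% is Elasticsearch's default low disk watermark; adjust it if the cluster overrides that setting.

```shell
#!/bin/sh
# Service address as quoted in this thread.
ES_URL=elasticsearch-logging-data.kubesphere-logging-system.svc:9200

# Fetch per-node shard and disk usage.
# Run this where the service name resolves, for example inside the ES pod.
es_allocation() {
    curl -s -XGET "$ES_URL/_cat/allocation?v"
}

# Read `_cat/allocation?v` output on stdin and print "node percent" for
# every node whose disk.percent is at or above $1 (default 85).
# Rows with fewer columns than the header (such as UNASSIGNED) are skipped.
nodes_over_disk() {
    awk -v limit="${1:-85}" '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "disk.percent") p = i
                if ($i == "node") n = i
            }
            cols = NF
            next
        }
        NF == cols && $p + 0 >= limit { print $n, $p }
    '
}
```

For example, `es_allocation | nodes_over_disk 90` lists the nodes at or above 90% disk usage.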