3.0: After enabling alert notifications, how do I add notification channels other than email?

xulai

tzghost
Check whether Alertmanager has any alerts for notification-manager-operator-6958786cd6-qmck2:
curl alertmanager-main.kubesphere-monitoring-system.svc:9093/api/v2/alerts?filter=pod=notification-manager-operator-6958786cd6-qmck2
Alternatively, expose port 9093 of the kubesphere-monitoring-system/alertmanager-main service and open the corresponding page to check.
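For anyone who prefers to script the same check, here is a minimal Python sketch of the query above. The function names `alerts_url` and `fetch_alerts` are made up for illustration; the host, port, path and the `pod=<name>` matcher are the ones from the curl command. It assumes the service is reachable from wherever the script runs.

```python
import json
from urllib.parse import urlencode, urlunsplit
from urllib.request import urlopen


def alerts_url(host, pod, port=9093):
    """Build the Alertmanager v2 alerts URL, filtered by the pod label."""
    # Same matcher as in the curl command, percent-encoded for the query string.
    query = urlencode({"filter": f"pod={pod}"})
    return urlunsplit(("http", f"{host}:{port}", "/api/v2/alerts", query, ""))


def fetch_alerts(url, timeout=5):
    """GET the URL and decode the JSON list of alerts (needs network access)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Usage would look like `fetch_alerts(alerts_url("alertmanager-main.kubesphere-monitoring-system.svc", "notification-manager-operator-6958786cd6-qmck2"))` from inside the cluster, or with the node IP and NodePort if the service has been exposed.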

tzghost

xulai

http://192.168.0.55:31289/api/v2/alerts?filter=pod=notification-manager-operator-6958786cd6-qmck2
[{"annotations":{"message":"Back-off restarting failed container","summary":"Container back-off","summaryCn":"容器回退"},"endsAt":"2020-10-15T11:07:05.068Z","fingerprint":"594b1ab1a9d84ff6","receivers":[{"name":"event"}],"startsAt":"2020-10-15T11:02:05.068Z","status":{"inhibitedBy":[],"silencedBy":[],"state":"active"},"updatedAt":"2020-10-15T11:02:05.068Z","labels":{"alertname":"ContainerBackoff","alerttype":"event","container":"notification-manager-operator","namespace":"kubesphere-monitoring-system","pod":"notification-manager-operator-6958786cd6-qmck2","severity":"warning"}}]
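A response like the one above is easier to read once it is reduced to a few fields. The following is only a sketch: `summarize` is a made-up helper name, and `SAMPLE` is the response from this thread with some fields trimmed for brevity.

```python
import json

# Trimmed copy of the response posted above (some fields removed).
SAMPLE = """
[{"annotations": {"message": "Back-off restarting failed container",
                  "summary": "Container back-off"},
  "startsAt": "2020-10-15T11:02:05.068Z",
  "endsAt": "2020-10-15T11:07:05.068Z",
  "status": {"inhibitedBy": [], "silencedBy": [], "state": "active"},
  "labels": {"alertname": "ContainerBackoff",
             "alerttype": "event",
             "namespace": "kubesphere-monitoring-system",
             "pod": "notification-manager-operator-6958786cd6-qmck2",
             "severity": "warning"}}]
"""


def summarize(alerts_json):
    """Reduce each alert to its name, pod, state and time window."""
    rows = []
    for alert in json.loads(alerts_json):
        labels = alert.get("labels", {})
        rows.append({
            "alertname": labels.get("alertname"),
            "pod": labels.get("pod"),
            "state": alert["status"]["state"],
            "startsAt": alert["startsAt"],
            "endsAt": alert["endsAt"],
        })
    return rows


for row in summarize(SAMPLE):
    print(row)
```

The `alerttype` label in the response is `event`, which is consistent with the later suggestion to look at ks-events-ruler.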

benjaminhuo

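notification-manager-operator is a Deployment, so of course the pod is recreated after you delete it.
This component is also required for receiving WeChat and Slack alerts from Alertmanager, so it must not be deleted.
What were you trying to achieve by deleting it?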

tzghost

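benjaminhuo The pod notification-manager-operator-6958786cd6-qmck2 kept firing a ContainerBackoff alert, so I deleted it to have it recreated. In principle, once it was recreated the original pod no longer existed and the alert should have resolved, but the alert is still firing, which is odd.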

xulai

tzghost Check whether the logs of the kubesphere-logging-system/ks-events-ruler workload show anything abnormal.

tzghost

xulai Yes, there are errors.

xulai

tzghost Do the logs of all pods under this workload look like this?

tzghost

xulai Yes.

xulai

tzghost Check the IPs of all pods of the alertmanager workload in kubesphere-monitoring-system.

tzghost

xulai The IPs have changed.

xulai

tzghost In the Alertmanager UI, open the Status tab and confirm the IPs of all peers.

tzghost

xulai They match the ones above.
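The comparison done by hand in the last few posts (current alertmanager pod IPs versus the peers Alertmanager reports) can be sketched in Python. This assumes the `cluster.peers[].address` shape returned by Alertmanager's `/api/v2/status` endpoint and IPv4 addresses. The helper name `compare_peers` and all IPs below are invented for illustration.

```python
def compare_peers(pod_ips, status):
    """Compare pod IPs with the peer addresses in an /api/v2/status payload."""
    # Peer addresses look like "ip:port"; keep only the IP part (IPv4 assumed).
    peer_ips = {peer["address"].rsplit(":", 1)[0]
                for peer in status["cluster"]["peers"]}
    pods = set(pod_ips)
    return {
        "missing_from_peers": sorted(pods - peer_ips),  # pods not in the cluster
        "unknown_peers": sorted(peer_ips - pods),       # peers with no matching pod
    }


# Hypothetical status payload, reduced to the fields used here.
example_status = {
    "cluster": {
        "status": "ready",
        "peers": [
            {"name": "peer-a", "address": "10.233.64.10:9094"},
            {"name": "peer-b", "address": "10.233.65.11:9094"},
        ],
    }
}

print(compare_peers(["10.233.64.10", "10.233.65.11"], example_status))
```

Two empty lists mean the peers and the pods agree, which is what was observed here; in that case Alertmanager's own view is current and the stale state has to be looked for elsewhere.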

xulai

tzghost Restart the kubesphere-logging-system/ks-events-ruler workload to bring it back to normal. I have verified that this is a bug; I will record it on my side and fix it next.

tzghost

xulai OK, thanks for the reply.

xulai

tzghost You can now run kubectl -n kubesphere-logging-system edit ruler ks-events-ruler and update the image version in it to v0.2.0 (this version fixes the bug above).
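If the version bump has to be scripted rather than done in the editor, the only string manipulation involved is swapping the tag of an image reference. A small sketch of that step follows. `with_tag` is a made-up helper and the image names are placeholders; the thread does not show the actual image name or which field of the ruler resource holds it.

```python
def with_tag(image, tag):
    """Return the image reference with its tag replaced (or added if absent)."""
    # Only the last path component can carry a tag; a colon earlier in the
    # reference belongs to a registry port such as "registry.local:5000".
    prefix, sep, last = image.rpartition("/")
    # Drop any digest ("@sha256:...") and any existing tag (":v0.1.0").
    base = last.split("@", 1)[0].split(":", 1)[0]
    return f"{prefix}{sep}{base}:{tag}"


# Placeholder image names, for illustration only.
print(with_tag("example/kube-events-ruler:v0.1.0", "v0.2.0"))
print(with_tag("registry.local:5000/example/kube-events-ruler", "v0.2.0"))
```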

tzghost

xulai Fixed, thanks.
