notification-manager: no messages received after integrating WeChat alerts
wanjunlei
kubectl edit notificationmanagers.notification.kubesphere.io -n kubesphere-monitoring-system notification-manager
Then add:
spec:
  args:
  - --log.level=debug
This sets the log level to debug. If a notification such as an email is sent, a log entry will be printed.
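If no notification arrives, there are only a few possible causes:
1. Alertmanager did not send it. Check whether Alertmanager has any alerts and look at its log. On startup it prints the configured webhooks, so check whether the webhook was configured successfully, or whether sending failed.
2. Notification Manager failed to send it. For this, check the log of the notification-manager-deployment pod.
3. The WeChat Work API has a call quota. Once the quota is exceeded, the API call does not return an error, but the notification is not delivered either.
4. The WeChat Work configuration is incorrect. You will need to check this yourself.
Please go through these first.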
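Causes 3 and 4 can be narrowed down by calling the WeChat Work API directly, without going through Notification Manager. The following is a minimal sketch, assuming the standard `gettoken` and `message/send` endpoints under the `https://qyapi.weixin.qq.com/cgi-bin/` base URL used in this thread. The function names and all credential values are placeholders, not part of Notification Manager.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the wechatApiUrl field used in this thread.
API_BASE = "https://qyapi.weixin.qq.com/cgi-bin/"


def token_url(corp_id, corp_secret, base=API_BASE):
    """Build the URL that exchanges corp ID + app secret for an access token."""
    query = urllib.parse.urlencode({"corpid": corp_id, "corpsecret": corp_secret})
    return f"{base}gettoken?{query}"


def text_message(to_user, agent_id, content):
    """Build the JSON body of a plain text message for one application."""
    return {
        "touser": to_user,
        "msgtype": "text",
        "agentid": int(agent_id),
        "text": {"content": content},
    }


def send_test_message(corp_id, corp_secret, agent_id, to_user):
    """Fetch a token, send one test message, and return the API's JSON reply."""
    with urllib.request.urlopen(token_url(corp_id, corp_secret), timeout=10) as resp:
        token = json.load(resp)
    if token.get("errcode", 0) != 0:
        return token  # credentials were rejected; errmsg says why
    url = f"{API_BASE}message/send?access_token={token['access_token']}"
    body = json.dumps(text_message(to_user, agent_id, "test from script")).encode("utf-8")
    req = urllib.request.Request(url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

# Example (placeholder values, run manually):
# print(send_test_message("<corp-id>", "<app-secret>", "<agent-id>", "<user-id>"))
```

If the reply carries a nonzero `errcode`, the problem is more likely in the WeChat Work credentials or recipient than in Notification Manager.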
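wanjunlei, last night I went back and rewatched your video on configuring WeChat alerts. Here is what I found:
1. I set the log level to debug, but creating a ConfigMap (cm) as a test still produced no WeChat message.
2. The Alertmanager log is as follows (everything in it looks normal):
3. The notification-manager-deployment log is as follows:
4. The WeChat Work application is one I created recently, and it has only been called a few times.
5. For the WeChat Work configuration, I followed the global configuration in your video: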
apiVersion: notification.kubesphere.io/v1alpha1
kind: WechatConfig
metadata:
  name: default-wechat-config
  namespace: kubesphere-monitoring-system
  labels:
    app: notification-manager
    type: default
spec:
  wechatApiUrl: https://qyapi.weixin.qq.com/cgi-bin/
  wechatApiSecret:
    key: wechat
    name: default-wechat-secret
  wechatApiCorpId: ww69a126a1fe458507
  wechatApiAgentId: "1000144"
---
apiVersion: notification.kubesphere.io/v1alpha1
kind: WechatReceiver
metadata:
  name: global-wechat-receiver
  namespace: kubesphere-monitoring-system
  labels:
    app: notification-manager
    type: global
spec:
  # wechatConfigSelector does not need to be configured for a global receiver
  # optional
  # One of toUser, toParty, toTag should be specified.
  toUser: ww69a126a1fe458507
  #toParty: global
  #toTag: global
---
apiVersion: v1
data:
  wechat: M2RQMmxREnpTQ2VOSmp5VTJWLWFlTTh6bUltWER6OFhaOHJFSG8yNV9ycw==
kind: Secret
metadata:
  labels:
    app: notification-manager
  name: default-wechat-secret
  namespace: kubesphere-monitoring-system
type: Opaque
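Values under `data` in a Secret are base64-encoded, and a stray newline or other invisible character in the encoded value is easy to miss. The following is a small sketch for checking a value locally before applying the manifest. `inspect_secret_value` is a hypothetical helper, and nothing here contacts the cluster or WeChat Work.

```python
import base64


def inspect_secret_value(encoded):
    """Decode a base64 Secret value and list characters that look accidental.

    Returns (decoded_text, suspicious_chars). Whitespace and non-printable
    characters are reported because they usually come from copy/paste or
    from encoding the value together with a trailing newline.
    """
    raw = base64.b64decode(encoded, validate=True)
    text = raw.decode("utf-8", errors="replace")
    suspicious = [ch for ch in text if ch.isspace() or not ch.isprintable()]
    return text, suspicious


# Example with a made-up value that was encoded with a trailing newline:
example = base64.b64encode(b"my-app-secret\n").decode("ascii")
decoded, odd = inspect_secret_value(example)
print(repr(decoded), odd)
```

An empty list of suspicious characters only shows that the value is clean text; whether it is the right secret still has to be checked against the WeChat Work admin console.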
wanjunlei
Set this in the alertmanager-main Secret:
"global":
  "resolve_timeout": "5m"
"inhibit_rules":
- "equal":
  - "namespace"
  - "alertname"
  "source_match":
    "severity": "critical"
  "target_match_re":
    "severity": "warning|info"
- "equal":
  - "namespace"
  - "alertname"
  "source_match":
    "severity": "warning"
  "target_match_re":
    "severity": "info"
"receivers":
- "name": "Default"
- "name": "Watchdog"
- "name": "Critical"
- "name": "prometheus"
  "webhook_configs":
  - "url": "http://notification-manager-svc.kubesphere-monitoring-system.svc:19093/api/v2/alerts"
- "name": "event"
  "webhook_configs":
  - "url": "http://notification-manager-svc.kubesphere-monitoring-system.svc:19093/api/v2/alerts"
- "name": "auditing"
  "webhook_configs":
  - "send_resolved": false
    "url": "http://notification-manager-svc.kubesphere-monitoring-system.svc:19093/api/v2/alerts"
- "name": "notification-manager"
  "webhook_configs":
  - "url": "http://notification-manager-svc.kubesphere-monitoring-system.svc:19093/api/v2/alerts"
"route":
  "group_by":
  - "namespace"
  - "alertname"
  "group_interval": "5m"
  "group_wait": "30s"
  "receiver": "Default"
  "repeat_interval": "12h"
  "routes":
  - "match":
      "alertname": "Watchdog"
    "receiver": "Watchdog"
  - "match":
      "severity": "critical"
    "receiver": "Critical"
  - "match":
      "alerttype": ""
    "receiver": "prometheus"
  - "group_interval": "30s"
    "match":
      "alerttype": "event"
    "receiver": "event"
  - "group_interval": "30s"
    "match":
      "alerttype": "auditing"
    "receiver": "auditing"
  - "group_interval": "30s"
    "match":
      "alerttype": "notification-manager"
    "receiver": "notification-manager"
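I have now gone through all of this and everything looks normal, but WeChat still receives nothing. With email configured through the UI, alert messages do arrive. Is there a complete tutorial?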
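To see which receiver a given alert would land on under the route tree posted above, the matching can be approximated in a few lines. This is a simplified sketch, not Alertmanager's implementation. It assumes the first matching sibling route wins (no `continue`), treats an empty `match` value as "label absent or empty", and ignores grouping and inhibition. `pick_receiver` and `reaches_notification_manager` are hypothetical names.

```python
# Routes in the order they appear in the posted configuration.
ROUTES = [
    ({"alertname": "Watchdog"}, "Watchdog"),
    ({"severity": "critical"}, "Critical"),
    ({"alerttype": ""}, "prometheus"),
    ({"alerttype": "event"}, "event"),
    ({"alerttype": "auditing"}, "auditing"),
    ({"alerttype": "notification-manager"}, "notification-manager"),
]
DEFAULT_RECEIVER = "Default"

# Receivers that have a webhook_configs entry pointing at Notification Manager.
WEBHOOK_RECEIVERS = {"prometheus", "event", "auditing", "notification-manager"}


def pick_receiver(labels):
    """Return the receiver of the first route whose matchers all hold."""
    for match, receiver in ROUTES:
        if all(labels.get(name, "") == value for name, value in match.items()):
            return receiver
    return DEFAULT_RECEIVER


def reaches_notification_manager(labels):
    """True if the chosen receiver forwards to the Notification Manager webhook."""
    return pick_receiver(labels) in WEBHOOK_RECEIVERS


for sample in (
    {"alertname": "ConfigMapCreated", "alerttype": "auditing", "severity": "warning"},
    {"alertname": "ConfigMapCreated", "alerttype": "auditing", "severity": "critical"},
):
    print(sample["severity"], pick_receiver(sample), reaches_notification_manager(sample))
```

Under these assumptions, an auditing alert labeled `severity: critical` would be routed to `Critical`, which has no `webhook_configs` in the configuration above, so it would not be forwarded to Notification Manager. The sample label values are illustrative only.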
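wanjunlei
Has Notification Manager received any alerts sent by Alertmanager (am)?
wanjunlei
xiaoyao, creating a ConfigMap does not generate an alert by itself. You need to enable auditing, then run kubectl edit awh kube-auditing-webhook
and change alertingPriority to INFO. After that, creating a ConfigMap through the UI will generate an alert.
wanjunlei
Call http://notification-manager-svc.kubesphere-monitoring-system.svc:19093/receivers. It returns your receivers, so you can see whether the receiver was actually set up successfully.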
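The service address above only resolves inside the cluster, so the request has to come from a pod or go through a port-forward. The following is a minimal sketch, assuming the service has been forwarded to `localhost:19093` (for example with `kubectl port-forward`). The response format is not shown in this thread, so the code parses JSON if it can and otherwise returns the raw text. The function names are hypothetical.

```python
import json
import urllib.request


def receivers_url(base_url):
    """Append the /receivers path to a Notification Manager base URL."""
    return base_url.rstrip("/") + "/receivers"


def fetch_receivers(base_url="http://localhost:19093", timeout=5):
    """GET /receivers and return parsed JSON, or the raw body if not JSON."""
    with urllib.request.urlopen(receivers_url(base_url), timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    try:
        return json.loads(body)
    except ValueError:
        return body

# Example (run manually once the port-forward is active):
# print(json.dumps(fetch_receivers(), indent=2, ensure_ascii=False))
```

If the WeChat receiver does not show up in the output, the WechatReceiver or WechatConfig resource was probably not picked up, for instance because of a label mismatch with the selectors in the NotificationManager resource.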
This is my Notification Manager (nm) configuration:
apiVersion: notification.kubesphere.io/v1alpha1
kind: NotificationManager
metadata:
  name: notification-manager
  namespace: kubesphere-monitoring-system
spec:
  replicas: 1
  resources:
    limits:
      cpu: 500m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 20Mi
  image: kubesphere/notification-manager:v0.6.0
  imagePullPolicy: IfNotPresent
  serviceAccountName: notification-manager-sa
  portName: webhook
  defaultConfigSelector:
    matchLabels:
      type: default
  receivers:
    tenantKey: user
    globalReceiverSelector:
      matchLabels:
        type: global
    tenantReceiverSelector:
      matchLabels:
        type: tenant
    options:
      global:
        templateFile:
        - /etc/notification-manager/template
      email:
        notificationTimeout: 5
        deliveryType: bulk
      slack:
        notificationTimeout: 5
      wechat:
        notificationTimeout: 5
      webhook:
        notificationTimeout: 5
      dingtalk:
        notificationTimeout: 5
  volumeMounts:
  - mountPath: /etc/notification-manager/
    name: noification-manager-template
  volumes:
  - configMap:
      defaultMode: 420
      name: noification-manager-template
    name: noification-manager-template
After exposing the Alertmanager service, I can see the alerts for my ConfigMap create/delete operations on its web page, which shows that Alertmanager is receiving them. Notification Manager, however, does not seem to receive any alerts. I configured the receivers following the official site, as follows:
"receivers":
- "name": "Default"
- "name": "Watchdog"
- "name": "Critical"
- "name": "prometheus"
  "webhook_configs":
  - "url": "http://notification-manager-svc.kubesphere-monitoring-system.svc:19093/api/v2/alerts"
- "name": "event"
  "webhook_configs":
  - "url": "http://notification-manager-svc.kubesphere-monitoring-system.svc:19093/api/v2/alerts"
- "name": "auditing"
  "webhook_configs":
  - "send_resolved": false
    "url": "http://notification-manager-svc.kubesphere-monitoring-system.svc:19093/api/v2/alerts"
# The entries above are the defaults; the one below is mine
- "name": "notification-manager"
  "webhook_configs":
  - "url": "http://notification-manager-svc.kubesphere-monitoring-system.svc:19093/api/v2/alerts"
I also configured a route, as follows:
- "group_interval": "30s"
  "match":
    "alerttype": "notification-manager"
  "receiver": "notification-manager"
Is the configuration above, for Alertmanager to send alerts to Notification Manager, correct? Many thanks!
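One way to separate an Alertmanager problem from a Notification Manager problem is to post a hand-built alert straight to the webhook URL from the receivers above and then watch the Notification Manager log. The following is a sketch, assuming the endpoint accepts the standard Alertmanager webhook payload (which is what `webhook_configs` sends). The label values, the `manual-test` group key, and the function names are placeholders.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Webhook URL copied from the receivers configuration in this thread.
NM_URL = "http://notification-manager-svc.kubesphere-monitoring-system.svc:19093/api/v2/alerts"


def build_payload(labels, annotations=None, receiver="notification-manager"):
    """Build one firing alert in the Alertmanager webhook (version 4) format."""
    annotations = annotations or {}
    alert = {
        "status": "firing",
        "labels": labels,
        "annotations": annotations,
        "startsAt": datetime.now(timezone.utc).isoformat(),
        "endsAt": "0001-01-01T00:00:00Z",  # zero time: alert has not ended
        "generatorURL": "",
    }
    return {
        "version": "4",
        "groupKey": "manual-test",
        "status": "firing",
        "receiver": receiver,
        "groupLabels": {},
        "commonLabels": labels,
        "commonAnnotations": annotations,
        "externalURL": "",
        "alerts": [alert],
    }


def post_test_alert(payload, url=NM_URL, timeout=5):
    """POST the payload and return (HTTP status, response body)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")

# Example (run from inside the cluster, placeholder labels):
# payload = build_payload({"alertname": "ManualTest", "alerttype": "auditing", "namespace": "default"})
# print(post_test_alert(payload))
```

If this produces a log entry in Notification Manager and a WeChat message, the remaining problem is on the Alertmanager side; if it does not, the receiver or WeChat Work configuration is the more likely culprit.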
wanjunlei
Please post the Notification Manager log.
# kubectl logs notification-manager-deployment-5675dc9d7-8ts8f -n kubesphere-monitoring-system --tail 500
level=info ts=2021-03-11T14:55:40.804736133+08:00 caller=main.go:100 msg="Starting notification manager..." addr=:19093 timeout=5s
level=info ts=2021-03-11T14:55:43.523676264+08:00 caller=config.go:340 msg="Setting up informers successfully"
level=info ts=2021-03-11T14:56:01.080585465+08:00 caller=config.go:679 msg="resource change" op=add type=wechat name=default-wechat-config namespace=kubesphere-monitoring-system
level=info ts=2021-03-11T14:56:01.094736472+08:00 caller=config.go:679 msg="resource change" op=add type=wechat name=global-wechat-receiver namespace=kubesphere-monitoring-system
# kubectl logs notification-manager-operator-dd4c6fd6f-pqtrc -n kubesphere-monitoring-system -c notification-manager-operator --tail 500
2021-03-11T14:55:21.665+0800 INFO controller-runtime.metrics metrics server is starting to listen {"addr": "127.0.0.1:8080"}
2021-03-11T14:55:21.666+0800 INFO setup starting manager
I0311 14:55:21.666557 1 leaderelection.go:242] attempting to acquire leader lease kubesphere-monitoring-system/7b8d27e6.kubesphere.io...
2021-03-11T14:55:21.670+0800 INFO controller-runtime.manager starting metrics server {"path": "/metrics"}
I0311 14:55:38.534004 1 leaderelection.go:252] successfully acquired lease kubesphere-monitoring-system/7b8d27e6.kubesphere.io
2021-03-11T14:55:38.534+0800 DEBUG controller-runtime.manager.events Normal {"object": {"kind":"ConfigMap","namespace":"kubesphere-monitoring-system","name":"7b8d27e6.kubesphere.io","uid":"cee2a831-d2e0-45de-87af-b4813c744676","apiVersion":"v1","resourceVersion":"904756"}, "reason": "LeaderElection", "message": "notification-manager-operator-dd4c6fd6f-pqtrc_e30b19a1-2d0c-4038-a604-3c23f7ea9158 became leader"}
2021-03-11T14:55:38.536+0800 INFO controller-runtime.controller Starting EventSource {"controller": "notificationmanager", "source": "kind source: /, Kind="}
2021-03-11T14:55:38.837+0800 INFO controller-runtime.controller Starting EventSource {"controller": "notificationmanager", "source": "kind source: /, Kind="}
2021-03-11T14:55:38.837+0800 INFO controller-runtime.controller Starting Controller {"controller": "notificationmanager"}
2021-03-11T14:55:38.837+0800 INFO controller-runtime.controller Starting workers {"controller": "notificationmanager", "worker count": 1}
2021-03-11T14:55:38.964+0800 DEBUG controller-runtime.controller Successfully Reconciled {"controller": "notificationmanager", "request": "kubesphere-monitoring-system/notification-manager"}
2021-03-11T14:55:39.159+0800 ERROR controllers.NotificationManager Failed to CreateOrUpdate deployment {"NotificationManager Operator": "kubesphere-monitoring-system/notification-manager", "result": "unchanged", "error": "Operation cannot be fulfilled on deployments.apps \"notification-manager-deployment\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/go-logr/zapr.(*zapLogger).Error
/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
github.com/kubesphere/notification-manager/pkg/controllers.(*NotificationManagerReconciler).Reconcile
/workspace/pkg/controllers/notificationmanager_controller.go:94
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.5.2/pkg/internal/controller/controller.go:256
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.5.2/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.5.2/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
/go/pkg/mod/k8s.io/apimachinery@v0.17.2/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
/go/pkg/mod/k8s.io/apimachinery@v0.17.2/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
/go/pkg/mod/k8s.io/apimachinery@v0.17.2/pkg/util/wait/wait.go:88
2021-03-11T14:55:39.160+0800 ERROR controller-runtime.controller Reconciler error {"controller": "notificationmanager", "request": "kubesphere-monitoring-system/notification-manager", "error": "Operation cannot be fulfilled on deployments.apps \"notification-manager-deployment\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/go-logr/zapr.(*zapLogger).Error
/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.5.2/pkg/internal/controller/controller.go:258
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.5.2/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.5.2/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
/go/pkg/mod/k8s.io/apimachinery@v0.17.2/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
/go/pkg/mod/k8s.io/apimachinery@v0.17.2/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
/go/pkg/mod/k8s.io/apimachinery@v0.17.2/pkg/util/wait/wait.go:88
2021-03-11T14:55:40.188+0800 DEBUG controller-runtime.controller Successfully Reconciled {"controller": "notificationmanager", "request": "kubesphere-monitoring-system/notification-manager"}
2021-03-11T14:55:41.420+0800 DEBUG controller-runtime.controller Successfully Reconciled {"controller": "notificationmanager", "request": "kubesphere-monitoring-system/notification-manager"}
wanjunlei
Notification Manager did not receive any messages, so the problem is most likely on the Alertmanager side. Restart Alertmanager and post its complete log.