Power Tricks with Kube-Eventer
This article is reprinted from the WeChat official account 「運維開發故事」 (Ops Dev Stories), written by 沒有文案的夏老師. To reprint it, please contact the 運維開發故事 official account.
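Offline event alerting
kube-eventer is a Kubernetes offline event collector open-sourced by Alibaba. The project, including the webhook sink documentation used here, is at:
https://github.com/AliyunContainerService/kube-eventer/blob/master/docs/en/webhook-sink.md
Kubernetes has two types of events. A Warning event means the state transition that produced it happened between unexpected states. A Normal event means the desired state and the current state match.
Take NPD (node-problem-detector) events as an example. These events describe transient problems that affect a node but are still meaningful for system diagnosis. NPD relies on the Kubernetes reporting mechanism: it inspects system logs (for example the journal on CentOS) and reports error information to the Kubernetes node. Such logs (kernel logs, for instance) are very noisy, so NPD extracts the valuable parts and can turn them into offline events. That way the events on a node reach us and can be handled promptly.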
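A standard Kubernetes event has the following important attributes, which make it easier to diagnose problems and alert on them:
- Namespace: the namespace of the object that produced the event.
- Kind: the type of the object the event is bound to, for example Node, Pod, Namespace, Component, and so on.
- Timestamp: the time the event occurred.
- Reason: why the event was produced.
- Message: a detailed description of the event.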
The currently supported sinks are roughly as follows:
Sink Name | Description |
---|---|
dingtalk | sink to dingtalk bot |
sls | sink to alibaba cloud sls service |
elasticsearch | sink to elasticsearch |
honeycomb | sink to honeycomb |
influxdb | sink to influxdb |
kafka | sink to kafka |
mysql | sink to mysql database |
wechat | sink to wechat |
Today's focus is on tricks with the webhook sink. First, the supported parameters:
- level - Level of event (optional. default: Warning. Options: Warning and Normal)
- namespaces - Namespaces to filter (optional. default: all namespaces,use commas to separate multi namespaces, namespace filter doesn't support regexp)
- kinds - Kinds to filter (optional. default: all kinds,use commas to separate multi kinds. Options: Node,Pod and so on.)
- reason - Reason to filter (optional. default: empty, Regexp pattern support). You can use multi reason fields in query.
- method - Method to send request (optional. default: GET)
- header - Header in request (optional. default: empty). You can use multi header field in query.
- custom_body_configmap - The configmap name of request body template. You can use Template to customize request body. (optional.)
- custom_body_configmap_namespace - The configmap namespace of request body template.
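These parameters travel as ordinary query parameters on the sink URL, which is why reason and header can be repeated. The following is a minimal sketch of that idea using Go's standard net/url package. It is not kube-eventer's own parsing code; the names parseSink and sinkOptions and the example.invalid host are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// sinkOptions is a hypothetical holder for one --sink value.
type sinkOptions struct {
	Kind   string     // sink type before the first colon, e.g. "webhook"
	Params url.Values // decoded query parameters; repeated keys keep every value
}

// parseSink splits "webhook:https://host/path?k=v" into the sink type and
// the query parameters of the URL that follows it.
func parseSink(flag string) (sinkOptions, error) {
	kind, rawURL, ok := strings.Cut(flag, ":")
	if !ok {
		return sinkOptions{}, fmt.Errorf("missing sink type in %q", flag)
	}
	u, err := url.Parse(rawURL)
	if err != nil {
		return sinkOptions{}, err
	}
	return sinkOptions{Kind: kind, Params: u.Query()}, nil
}

func main() {
	opts, err := parseSink("webhook:https://example.invalid/hook?level=Warning" +
		"&reason=BackOff&reason=Failed&header=Content-Type=application/json&method=POST")
	if err != nil {
		panic(err)
	}
	fmt.Println("kind:   ", opts.Kind)
	fmt.Println("level:  ", opts.Params.Get("level"))
	fmt.Println("reasons:", opts.Params["reason"]) // both reason values survive
	fmt.Println("header: ", opts.Params.Get("header"))
}
```

Only the first "=" separates a query key from its value, so header=Content-Type=application/json yields the value Content-Type=application/json.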
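If every project namespace maps one-to-one to an owner, a ConfigMap can be tied to a sink for each namespace. Rollouts are when events are most likely to occur, and events make it quick to spot problems such as a wrong image tag or a broken image configuration.
Start with the ConfigMap: the value of custom_body_configmap selects which body template is used. The template can be tweaked a little to make the message clearer.
Adding Cluster:name shows which cluster the event came from.
Adding "mentioned_list":["wangqing","@all"] @-mentions the responsible owner.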
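With a one-to-one mapping between namespaces and owners, the sink arguments can be generated rather than written by hand. The sketch below assumes the bot URL already carries its ?key=... query and mirrors the parameter layout used in this article. The route type, the sinkFlag helper, and the namespace and ConfigMap names are all hypothetical.

```go
package main

import "fmt"

// route ties one namespace to the ConfigMap holding its owner's body template.
type route struct {
	Namespace          string // namespace whose events are forwarded
	ConfigMap          string // ConfigMap with the request body template
	ConfigMapNamespace string // namespace where that ConfigMap lives
}

// sinkFlag builds one --sink argument. webhookURL must already contain a
// query string (such as ?key=...), because further parameters are appended with "&".
func sinkFlag(webhookURL string, r route) string {
	return fmt.Sprintf(
		"--sink=webhook:%s&level=Warning&namespaces=%s"+
			"&header=Content-Type=application/json"+
			"&custom_body_configmap=%s&custom_body_configmap_namespace=%s&method=POST",
		webhookURL, r.Namespace, r.ConfigMap, r.ConfigMapNamespace)
}

func main() {
	routes := []route{
		{Namespace: "team-a", ConfigMap: "body-team-a", ConfigMapNamespace: "ops"},
		{Namespace: "team-b", ConfigMap: "body-team-b", ConfigMapNamespace: "ops"},
	}
	for _, r := range routes {
		// One sink per namespace, each pointing at its own body template.
		fmt.Println(sinkFlag("https://example.invalid/send?key=PLACEHOLDER", r))
	}
}
```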

    ---
    apiVersion: v1
    data:
      content: >-
        {"msgtype": "text","text": {"content": "Cluster:name\nEventType:{{ .Type }}\nEventNamespace:{{ .InvolvedObject.Namespace }}\nEventKind:{{ .InvolvedObject.Kind }}\nEventObject:{{ .InvolvedObject.Name }}\nEventReason:{{ .Reason }}\nEventTime:{{ .LastTimestamp }}\nEventMessage:{{ .Message }}","mentioned_list":["wangqing","@all"]}}
    kind: ConfigMap
    metadata:
      name: custom-webhook-body
      namespace: namespace

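To see what this template produces, it can be rendered locally. The sketch below assumes the body is rendered with Go's text/template, which the {{ .Field }} syntax suggests. The event and objectRef types are hypothetical stand-ins with only the fields the template reads, and LastTimestamp is simplified to a string.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"text/template"
)

// objectRef and event are hypothetical stand-ins for the Kubernetes Event type.
type objectRef struct {
	Kind, Namespace, Name string
}

type event struct {
	Type, Reason, Message, LastTimestamp string
	InvolvedObject                       objectRef
}

// bodyTemplate is the ConfigMap's content value from the article.
const bodyTemplate = `{"msgtype": "text","text": {"content": "Cluster:name\nEventType:{{ .Type }}\nEventNamespace:{{ .InvolvedObject.Namespace }}\nEventKind:{{ .InvolvedObject.Kind }}\nEventObject:{{ .InvolvedObject.Name }}\nEventReason:{{ .Reason }}\nEventTime:{{ .LastTimestamp }}\nEventMessage:{{ .Message }}","mentioned_list":["wangqing","@all"]}}`

// renderBody fills the template with one event.
func renderBody(tmpl string, ev event) (string, error) {
	t, err := template.New("body").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, ev); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// isValidJSON reports whether the rendered body parses as JSON.
func isValidJSON(s string) bool { return json.Valid([]byte(s)) }

func main() {
	ev := event{
		Type: "Warning", Reason: "BackOff",
		Message:        "Back-off restarting failed container",
		LastTimestamp:  "2021-01-01 12:00:00 +0800 CST",
		InvolvedObject: objectRef{Kind: "Pod", Namespace: "demo", Name: "web-0"},
	}
	body, err := renderBody(bodyTemplate, ev)
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
	fmt.Println("valid JSON:", isValidJSON(body))

	// text/template does no JSON escaping, so a double quote in a field
	// ends the JSON string early and breaks the body.
	ev.Message = `container "app" failed`
	quoted, err := renderBody(bodyTemplate, ev)
	if err != nil {
		panic(err)
	}
	fmt.Println("valid JSON with quoted message:", isValidJSON(quoted))
}
```

Whether a quote in a real event message causes trouble depends on how the sink pre-processes the event before rendering, which this sketch does not model.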
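Tricks for the command section
sink is an array, so many entries can be added.
The example below shows notifications sent to WeChat Work (企業微信) through the webhook sink. Note that reason supports regular expressions. Together with the ConfigMap, this completes event alerting for the Kubernetes cluster.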

    command:
      - "/kube-eventer"
      - "--source=kubernetes:https://kubernetes.default"
      ## e.g. webhook sinks for a WeChat Work bot
      ## note: [^Unhealthy] is a regex character class, not a negation of the word "Unhealthy"
      - --sink=webhook:https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxx&level=Warning&reason=[^Unhealthy]&namespaces=xxxx&header=Content-Type=application/json&custom_body_configmap=custom-webhook-body0&custom_body_configmap_namespace=xxxx&method=POST
      - --sink=webhook:https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxx&level=Warning&reason=BackOff&namespaces=xxxx&header=Content-Type=application/json&custom_body_configmap=custom-webhook-body1&custom_body_configmap_namespace=xxxx&method=POST
      - --sink=webhook:https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxx&level=Warning&reason=Failed&namespaces=xxxx&header=Content-Type=application/json&custom_body_configmap=custom-webhook-body2&custom_body_configmap_namespace=xxxxx&method=POST

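Because reason is a regular expression, it is worth checking what each pattern actually matches. The sketch below tests the three patterns above with Go's regexp package, under the assumption that the sink applies them as unanchored matches. That assumption is not confirmed here, and matchesReason is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesReason reports whether pattern matches anywhere in reason.
// The match is unanchored, which is an assumption about the sink's behavior.
func matchesReason(pattern, reason string) bool {
	return regexp.MustCompile(pattern).MatchString(reason)
}

func main() {
	patterns := []string{`[^Unhealthy]`, `BackOff`, `Failed`, `^Failed$`}
	reasons := []string{"Unhealthy", "BackOff", "Failed", "FailedScheduling"}

	for _, p := range patterns {
		for _, r := range reasons {
			fmt.Printf("%-14s vs %-17s -> %v\n", p, r, matchesReason(p, r))
		}
	}
}
```

Under that assumption, [^Unhealthy] matches any reason containing at least one character outside the set U, n, h, e, a, l, t, y. It skips "Unhealthy" only because every letter of that word is in the set. Likewise, Failed also matches FailedScheduling unless the pattern is anchored as ^Failed$.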
Example:
Create a bot in a WeChat Work group chat. Its webhook URL looks like https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxx. The complete manifests follow.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        name: kube-eventer
      name: kube-eventer
      namespace: namespace
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: kube-eventer
      template:
        metadata:
          labels:
            app: kube-eventer
          annotations:
            scheduler.alpha.kubernetes.io/critical-pod: ''
        spec:
          dnsPolicy: ClusterFirstWithHostNet
          serviceAccount: kube-eventer
          containers:
            - image: registry.aliyuncs.com/acs/kube-eventer-amd64:v1.2.0-484d9cd-aliyun
              name: kube-eventer
              command:
                - "/kube-eventer"
                - "--source=kubernetes:https://kubernetes.default"
                ## e.g. webhook sink for a WeChat Work bot
                - --sink=webhook:https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxx&level=Warning&reason=[^Unhealthy]&namespaces=xxxx&header=Content-Type=application/json&custom_body_configmap=custom-webhook-body0&custom_body_configmap_namespace=xxxx&method=POST
                #- --sink=webhook:https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxx&level=Warning&reason=BackOff&namespaces=xxxx&header=Content-Type=application/json&custom_body_configmap=custom-webhook-body1&custom_body_configmap_namespace=xxxx&method=POST
                #- --sink=webhook:https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxx&level=Warning&reason=Failed&namespaces=xxxx&header=Content-Type=application/json&custom_body_configmap=custom-webhook-body2&custom_body_configmap_namespace=xxxxx&method=POST
              env:
                # If TZ is assigned, set the TZ value as the time zone
                - name: TZ
                  value: "Asia/Shanghai"
              volumeMounts:
                - name: localtime
                  mountPath: /etc/localtime
                  readOnly: true
                - name: zoneinfo
                  mountPath: /usr/share/zoneinfo
                  readOnly: true
              resources:
                requests:
                  cpu: 200m
                  memory: 100Mi
                limits:
                  cpu: 500m
                  memory: 250Mi
          volumes:
            - name: localtime
              hostPath:
                path: /etc/localtime
            - name: zoneinfo
              hostPath:
                path: /usr/share/zoneinfo
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: kube-eventer
    rules:
      - apiGroups:
          - ""
        resources:
          - events
          - configmaps
        verbs:
          - get
          - list
          - watch
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: kube-eventer
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: kube-eventer
    subjects:
      - kind: ServiceAccount
        name: kube-eventer
        namespace: namespace
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: kube-eventer
      namespace: namespace
    ---
    apiVersion: v1
    data:
      content: >-
        {"msgtype": "text","text": {"content": "Cluster:name\nEventType:{{ .Type }}\nEventNamespace:{{ .InvolvedObject.Namespace }}\nEventKind:{{ .InvolvedObject.Kind }}\nEventObject:{{ .InvolvedObject.Name }}\nEventReason:{{ .Reason }}\nEventTime:{{ .LastTimestamp }}\nEventMessage:{{ .Message }}","mentioned_list":["wangqing","@all"]}}
    kind: ConfigMap
    metadata:
      # name and namespace must match the sink's custom_body_configmap and
      # custom_body_configmap_namespace values (custom-webhook-body0 / xxxx above)
      name: custom-webhook-body
      namespace: namespace

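To check the message format independently of kube-eventer, the same JSON shape used in the ConfigMap template (msgtype, text.content, text.mentioned_list) can be posted by hand. The sketch below sends it to a local httptest server standing in for the bot endpoint, so it runs offline. The type and function names are hypothetical, and pointing sendText at a real bot URL is left to the reader.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// textMessage mirrors the JSON body used in the article's ConfigMap template.
type textMessage struct {
	MsgType string   `json:"msgtype"`
	Text    textBody `json:"text"`
}

type textBody struct {
	Content       string   `json:"content"`
	MentionedList []string `json:"mentioned_list"`
}

// buildPayload marshals a text message; encoding/json escapes quotes and newlines.
func buildPayload(content string, mentioned []string) ([]byte, error) {
	msg := textMessage{MsgType: "text", Text: textBody{Content: content, MentionedList: mentioned}}
	return json.Marshal(msg)
}

// sendText POSTs the payload as application/json and returns the HTTP status code.
func sendText(webhookURL, content string, mentioned []string) (int, error) {
	payload, err := buildPayload(content, mentioned)
	if err != nil {
		return 0, err
	}
	resp, err := http.Post(webhookURL, "application/json", bytes.NewReader(payload))
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	_, _ = io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
	return resp.StatusCode, nil
}

func main() {
	// Local stand-in for the bot endpoint: prints what it receives.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, _ := io.ReadAll(r.Body)
		fmt.Printf("%s %s %s\n", r.Method, r.Header.Get("Content-Type"), body)
	}))
	defer srv.Close()

	status, err := sendText(srv.URL, "Cluster:name\nEventReason:BackOff", []string{"wangqing"})
	if err != nil {
		panic(err)
	}
	fmt.Println("status:", status)
}
```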
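This gives a simple assignment of who is alerted and who handles the alert. With event alerting in place, service problems and cluster problems can be found and fixed promptly.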