Analyzing Prometheus Alerting Problems

Author: 華仔, 2021-03-31 08:02:34



Today I'd like to talk about the alerting problems I ran into while using Prometheus.

Problem analysis

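While operating Prometheus recently, I found that sometimes it should have sent an alert but did not; sometimes it sent an alert when it should not have; and sometimes alerts arrived with a noticeable delay. To find the specific causes, I went through some material, including the official documentation. I hope this helps you when using Prometheus in the future.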

First, let's look at some important default settings for Prometheus and Alertmanager, taken from the official documentation:

# prometheus
global:
  # How frequently to scrape targets by default. (Interval between scrapes of monitoring data from targets.)
  [ scrape_interval: <duration> | default = 1m ]
  # How long until a scrape request times out. (Timeout for scraping data from a target.)
  [ scrape_timeout: <duration> | default = 10s ]
  # How frequently to evaluate rules. (Interval between evaluations of the alerting rules.)
  [ evaluation_interval: <duration> | default = 1m ]
# alertmanager
# How long to initially wait to send a notification for a group
# of alerts. Allows to wait for an inhibiting alert to arrive or collect
# more initial alerts for the same group. (Usually ~0s to few minutes.)
[ group_wait: <duration> | default = 30s ] # wait time before the first notification for a group

# How long to wait before sending a notification about new alerts that
# are added to a group of alerts for which an initial notification has
# already been sent. (Usually ~5m or more.)
[ group_interval: <duration> | default = 5m ] # interval before notifying about new alerts in the same group

# How long to wait before sending a notification again if it has already
# been sent successfully for an alert. (Usually ~3h or more).
[ repeat_interval: <duration> | default = 4h ] # interval before re-sending the same alert

With this configuration in mind, let's walk through the whole alerting flow and use it to locate the problems.

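Based on this flow and the configuration above: after Prometheus scrapes the data, it evaluates the alerting rules. When a rule's expression is true, the alert enters the pending state; once the expression has stayed true for the duration configured in for, the alert enters the firing state. The alert is then pushed to Alertmanager, which sends the notification after group_wait has elapsed.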
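The state transitions described here can be sketched as a small simulation. This is a simplified model rather than Prometheus's actual implementation: the class `AlertState` and the function `run` are hypothetical names, and the 60 s evaluation interval and 120 s `for` duration are illustrative values. Prometheus itself calls the final state firing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AlertState:
    """Tracks one alert through inactive -> pending -> firing (simplified)."""

    for_seconds: float
    # Time of the first evaluation in the current unbroken run of "true" results.
    active_since: Optional[float] = None

    def evaluate(self, now: float, expr_true: bool) -> str:
        if not expr_true:
            # A single false evaluation resets the alert.
            self.active_since = None
            return "inactive"
        if self.active_since is None:
            self.active_since = now
        if now - self.active_since >= self.for_seconds:
            return "firing"
        return "pending"


def run(results: List[bool], evaluation_interval: float, for_seconds: float) -> List[str]:
    """Feed one expression result per evaluation tick and collect the states."""
    alert = AlertState(for_seconds)
    return [alert.evaluate(i * evaluation_interval, r) for i, r in enumerate(results)]


if __name__ == "__main__":
    # Assumed values: evaluation_interval = 60s, for = 120s.
    print(run([False, True, True, True, False], 60, 120))
    # ['inactive', 'pending', 'pending', 'firing', 'inactive']
```

In this model the alert only starts firing on the third consecutive true evaluation, which is why the `for` duration adds directly to the time before Alertmanager even sees the alert.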

Delayed or overly frequent alerts

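Looking at the whole flow, once an alert reaches Alertmanager, the larger group_wait is, the longer it takes for the notification to arrive, which delays the alert. Conversely, if group_wait is too small, notifications arrive frequently, because alerts in the same group have less time to be collected into a single notification. The value therefore needs to be chosen for the specific scenario.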
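To get a feel for how these settings add up, here is a rough back-of-the-envelope estimate. It is only a sketch under stated assumptions: `worst_case_delay` is a hypothetical helper, evaluation ticks are taken to be perfectly regular, network and processing time are ignored, and the 120 s `for` duration is an illustrative value that does not come from the article.

```python
import math


def worst_case_delay(scrape_interval: int, evaluation_interval: int,
                     for_duration: int, group_wait: int) -> int:
    """Approximate upper bound (seconds) from 'condition becomes true'
    to 'first notification sent' for a new alert group."""
    # The condition may start just after a scrape, so wait up to one full interval.
    wait_for_scrape = scrape_interval
    # The scraped sample may arrive just after a rule evaluation.
    wait_for_first_eval = evaluation_interval
    # `for` is only checked on evaluation ticks, so round it up to whole ticks.
    wait_for_firing = math.ceil(for_duration / evaluation_interval) * evaluation_interval
    return wait_for_scrape + wait_for_first_eval + wait_for_firing + group_wait


if __name__ == "__main__":
    # Defaults from the configuration excerpt: 1m scrape, 1m evaluation.
    for group_wait in (0, 30, 300):
        total = worst_case_delay(60, 60, 120, group_wait)
        print(f"group_wait={group_wait:>3}s -> up to {total}s before the first notification")
```

With the default intervals, group_wait is only one of several contributors to the delay, so it is worth checking scrape_interval, evaluation_interval and the rule's for duration before blaming Alertmanager alone.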

Alerts that fire when they should not

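Prometheus pulls data from each target once per scrape_interval and then evaluates the rules. In the meantime, the target's data may already have returned to normal. In other words, during the for window the underlying data had recovered, but the alert evaluation skipped over those points: the expression was true at every evaluation, so the for duration was reached, the alert fired, and a notification was sent. In Grafana, however, the data looks normal and no alert seems warranted. This is because Grafana, when it uses Prometheus as its data source, issues range queries, whereas the data seen by alert evaluation is much sparser.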
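The effect of sparse evaluation can be reproduced with a toy model. Everything in it is an assumption made for illustration: the sample data from `spiky_samples`, the threshold of 90, the 15 s sample spacing, and the helper names. It also simplifies how an instant query picks the latest sample.

```python
from typing import List, Optional, Tuple

Sample = Tuple[int, float]  # (timestamp in seconds, value)


def spiky_samples() -> List[Sample]:
    """Hypothetical series: one sample every 15s, high only on whole minutes."""
    return [(t, 95.0 if t % 60 == 0 else 40.0) for t in range(0, 181, 15)]


def latest_sample(samples: List[Sample], t: int) -> Optional[float]:
    """Value of the most recent sample at or before t (instant-query style)."""
    past = [s for s in samples if s[0] <= t]
    return max(past, key=lambda s: s[0])[1] if past else None


def first_firing_time(samples: List[Sample], threshold: float,
                      evaluation_interval: int, for_seconds: int,
                      end: int, start: int = 0) -> Optional[int]:
    """Evaluate 'value > threshold' at sparse instants; return when it would fire."""
    active_since: Optional[int] = None
    for t in range(start, end + 1, evaluation_interval):
        value = latest_sample(samples, t)
        if value is None or value <= threshold:
            active_since = None  # the run of true evaluations is broken
            continue
        if active_since is None:
            active_since = t
        if t - active_since >= for_seconds:
            return t
    return None


if __name__ == "__main__":
    samples = spiky_samples()
    above = sum(1 for _, v in samples if v > 90)
    print(f"{above} of {len(samples)} samples are above the threshold")
    print("fires at:", first_firing_time(samples, 90, 60, 120, 180))
    print("fires at (evaluations shifted by 15s):",
          first_firing_time(samples, 90, 60, 120, 180, start=15))
```

Here only 4 of the 13 samples exceed the threshold, yet the rule fires at t = 120 because every evaluation instant happens to land on a high sample. Shifting the evaluations by 15 s makes the same data never fire, which shows how much the outcome depends on which points the evaluation sees.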

Editor: 姜華. Source: 運維開發故事
