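A Detailed Look at the Wait-and-Retry Mechanism in Kubernetes
Kubernetes retries in many places, for example when restarting a Pod or when a CSI PVC mount fails. After an error, the retry usually waits for an interval that grows exponentially. This article walks through how that wait-and-retry mechanism works.
The Pod CrashLoopBackOff state
Anyone who uses Kubernetes regularly will be familiar with CrashLoopBackOff, a fairly common abnormal Pod state. It typically appears when a process in the Pod fails to start or exits unexpectedly (non-zero exit code), the Pod's restart policy is OnFailure or Always, and kubelet has already restarted the Pod.
The state means the Pod is stuck in a loop of failing and being restarted, with kubelet waiting an exponentially growing interval before each restart. This restart delay is implemented with backoff. The relevant code is: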
// If a container is still in backoff, the function will return a brief backoff error and
// a detailed error message.
func (m *kubeGenericRuntimeManager) doBackOff(pod *v1.Pod, container *v1.Container, podStatus *kubecontainer.PodStatus, backOff *flowcontrol.Backoff) (bool, string, error) {
	var cStatus *kubecontainer.Status
	for _, c := range podStatus.ContainerStatuses {
		if c.Name == container.Name && c.State == kubecontainer.ContainerStateExited {
			cStatus = c
			break
		}
	}
	if cStatus == nil {
		return false, "", nil
	}
	klog.V(3).InfoS("Checking backoff for container in pod", "containerName", container.Name, "pod", klog.KObj(pod))
	// Use the finished time of the latest exited container as the start point to calculate whether to do back-off.
	ts := cStatus.FinishedAt
	// backOff requires a unique key to identify the container.
	key := getStableKey(pod, container)
	if backOff.IsInBackOffSince(key, ts) {
		if containerRef, err := kubecontainer.GenerateContainerRef(pod, container); err == nil {
			m.recorder.Eventf(containerRef, v1.EventTypeWarning, events.BackOffStartContainer,
				fmt.Sprintf("Back-off restarting failed container %s in pod %s", container.Name, format.Pod(pod)))
		}
		err := fmt.Errorf("back-off %s restarting failed container=%s pod=%s", backOff.Get(key), container.Name, format.Pod(pod))
		klog.V(3).InfoS("Back-off restarting failed container", "err", err.Error())
		return true, err.Error(), kubecontainer.ErrCrashLoopBackOff
	}
	backOff.Next(key, ts)
	return false, "", nil
}
Using backoff
Using backoff is straightforward. Only the .IsInBackOffSince and .Next methods are needed:
import (
	"fmt"
	"time"

	"k8s.io/client-go/util/flowcontrol"
)

func startBackoff() {
	// Initial wait of 5s, capped at 60s.
	backOff := flowcontrol.NewBackOff(5*time.Second, 60*time.Second)
	backOffID := "test"
	lastDo := time.Now()
	t := time.NewTicker(1 * time.Second)
	defer t.Stop()
	for range t.C {
		if backOff.IsInBackOffSince(backOffID, lastDo) { // still inside the wait window, skip this tick
			continue
		}
		fmt.Printf("doing work after %s\n", time.Since(lastDo))
		backOff.Next(backOffID, time.Now()) // record that the work ran and grow the wait
		lastDo = time.Now()
	}
}
The code above prints:
doing work after 1.001035775s
doing work after 5.999162394s
doing work after 10.9999193s
doing work after 21.000754631s
doing work after 40.999154124s
...
You can also restart the timing for a specific id:
backOff.Reset(backOffID)
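Or clean up stale ids (GC removes entries that have not been updated for more than twice the maximum duration, not every id):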
backOff.GC()
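How backoff is implemented
The backoff implementation is only about a hundred lines of code. Its main struct records, for each id, when the task last ran and how long to wait.
When the current run is recorded, the wait is set to twice the previous wait (plus jitter) and capped at the maximum duration, which is what makes the wait grow exponentially: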
func (p *Backoff) Next(id string, eventTime time.Time) {
	p.Lock()
	defer p.Unlock()
	entry, ok := p.perItemBackoff[id]
	if !ok || hasExpired(eventTime, entry.lastUpdate, p.maxDuration) {
		entry = p.initEntryUnsafe(id)
		entry.backoff += p.jitter(entry.backoff)
	} else {
		delay := entry.backoff * 2       // exponential
		delay += p.jitter(entry.backoff) // add some jitter to the delay
		entry.backoff = min(delay, p.maxDuration)
	}
	entry.lastUpdate = p.Clock.Now()
}
Deciding whether the task should run now only requires checking whether the wait has elapsed:
func (p *Backoff) IsInBackOffSince(id string, eventTime time.Time) bool {
	p.RLock()
	defer p.RUnlock()
	entry, ok := p.perItemBackoff[id]
	if !ok {
		return false
	}
	if hasExpired(eventTime, entry.lastUpdate, p.maxDuration) {
		return false
	}
	return p.Clock.Since(eventTime) < entry.backoff
}