Resolving a Ceph Cluster That Has Run Out of Disk Space

2015-05-13 09:57:47

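While an OpenStack + Ceph cluster was in use, virtual machines copied in a large amount of new data, which rapidly consumed the cluster's disks. With no free space left, the virtual machines became inoperable and no operation on the Ceph cluster could be carried out. This article presents two solutions for reference.

Fault description

While an OpenStack + Ceph cluster was in use, virtual machines copied in a large amount of new data, which rapidly consumed the cluster's disks. With no free space left, the virtual machines became inoperable and no operation on the Ceph cluster could be carried out.

Symptoms

Restarting the virtual machine through OpenStack had no effect
Deleting the volume directly with the rbd command failed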

[root@controller ~]# rbd -p volumes rm volume-c55fd052-212d-4107-a2ac-cf53bfc049be 
2015-04-29 05:31:31.719478 7f5fb82f7760  0 client.4781741.objecter  FULL, paused modify 0xe9a9e0 tid 6

Checking the Ceph health status (the output has the form printed by ceph -s, followed by ceph health detail):

cluster 059f27e8-a23f-4587-9033-3e3679d03b31 
 health HEALTH_ERR 20 pgs backfill_toofull; 20 pgs degraded; 20 pgs stuck unclean; recovery 7482/129081 objects degraded (5.796%); 2 full osd(s); 1 near full osd(s) 
 monmap e6: 4 mons at {node-5e40.cloud.com=10.10.20.40:6789/0,node-6670.cloud.com=10.10.20.31:6789/0,node-66c4.cloud.com=10.10.20.36:6789/0,node-fb27.cloud.com=10.10.20.41:6789/0}, election epoch 886, quorum 0,1,2,3 node-6670.cloud.com,node-66c4.cloud.com,node-5e40.cloud.com,node-fb27.cloud.com 
 osdmap e2743: 3 osds: 3 up, 3 in 
        flags full 
  pgmap v6564199: 320 pgs, 4 pools, 262 GB data, 43027 objects 
        786 GB used, 47785 MB / 833 GB avail 
        7482/129081 objects degraded (5.796%) 
             300 active+clean 
              20 active+degraded+remapped+backfill_toofull

HEALTH_ERR 20 pgs backfill_toofull; 20 pgs degraded; 20 pgs stuck unclean; recovery 7482/129081 objects degraded (5.796%); 2 full osd(s); 1 near full osd(s) 
pg 3.8 is stuck unclean for 7067109.597691, current state active+degraded+remapped+backfill_toofull, last acting [2,0] 
pg 3.7d is stuck unclean for 1852078.505139, current state active+degraded+remapped+backfill_toofull, last acting [2,0] 
pg 3.21 is stuck unclean for 7072842.637848, current state active+degraded+remapped+backfill_toofull, last acting [0,2] 
pg 3.22 is stuck unclean for 7070880.213397, current state active+degraded+remapped+backfill_toofull, last acting [0,2] 
pg 3.a is stuck unclean for 7067057.863562, current state active+degraded+remapped+backfill_toofull, last acting [2,0] 
pg 3.7f is stuck unclean for 7067122.493746, current state active+degraded+remapped+backfill_toofull, last acting [0,2] 
pg 3.5 is stuck unclean for 7067088.369629, current state active+degraded+remapped+backfill_toofull, last acting [2,0] 
pg 3.1e is stuck unclean for 7073386.246281, current state active+degraded+remapped+backfill_toofull, last acting [0,2] 
pg 3.19 is stuck unclean for 7068035.310269, current state active+degraded+remapped+backfill_toofull, last acting [0,2] 
pg 3.5d is stuck unclean for 1852078.505949, current state active+degraded+remapped+backfill_toofull, last acting [2,0] 
pg 3.1a is stuck unclean for 7067088.429544, current state active+degraded+remapped+backfill_toofull, last acting [2,0] 
pg 3.1b is stuck unclean for 7072773.771385, current state active+degraded+remapped+backfill_toofull, last acting [0,2] 
pg 3.3 is stuck unclean for 7067057.864514, current state active+degraded+remapped+backfill_toofull, last acting [2,0] 
pg 3.15 is stuck unclean for 7067088.825483, current state active+degraded+remapped+backfill_toofull, last acting [2,0] 
pg 3.11 is stuck unclean for 7067057.862408, current state active+degraded+remapped+backfill_toofull, last acting [2,0] 
pg 3.6d is stuck unclean for 7067083.634454, current state active+degraded+remapped+backfill_toofull, last acting [2,0] 
pg 3.6e is stuck unclean for 7067098.452576, current state active+degraded+remapped+backfill_toofull, last acting [2,0] 
pg 3.c is stuck unclean for 5658116.678331, current state active+degraded+remapped+backfill_toofull, last acting [2,0] 
pg 3.e is stuck unclean for 7067078.646953, current state active+degraded+remapped+backfill_toofull, last acting [2,0] 
pg 3.20 is stuck unclean for 7067140.530849, current state active+degraded+remapped+backfill_toofull, last acting [0,2] 
pg 3.7d is active+degraded+remapped+backfill_toofull, acting [2,0] 
pg 3.7f is active+degraded+remapped+backfill_toofull, acting [0,2] 
pg 3.6d is active+degraded+remapped+backfill_toofull, acting [2,0] 
pg 3.6e is active+degraded+remapped+backfill_toofull, acting [2,0] 
pg 3.5d is active+degraded+remapped+backfill_toofull, acting [2,0] 
pg 3.20 is active+degraded+remapped+backfill_toofull, acting [0,2] 
pg 3.21 is active+degraded+remapped+backfill_toofull, acting [0,2] 
pg 3.22 is active+degraded+remapped+backfill_toofull, acting [0,2] 
pg 3.1e is active+degraded+remapped+backfill_toofull, acting [0,2] 
pg 3.19 is active+degraded+remapped+backfill_toofull, acting [0,2] 
pg 3.1a is active+degraded+remapped+backfill_toofull, acting [2,0] 
pg 3.1b is active+degraded+remapped+backfill_toofull, acting [0,2] 
pg 3.15 is active+degraded+remapped+backfill_toofull, acting [2,0] 
pg 3.11 is active+degraded+remapped+backfill_toofull, acting [2,0] 
pg 3.c is active+degraded+remapped+backfill_toofull, acting [2,0] 
pg 3.e is active+degraded+remapped+backfill_toofull, acting [2,0] 
pg 3.8 is active+degraded+remapped+backfill_toofull, acting [2,0] 
pg 3.a is active+degraded+remapped+backfill_toofull, acting [2,0] 
pg 3.5 is active+degraded+remapped+backfill_toofull, acting [2,0] 
pg 3.3 is active+degraded+remapped+backfill_toofull, acting [2,0] 
recovery 7482/129081 objects degraded (5.796%) 
osd.0 is full at 95% 
osd.2 is full at 95% 
osd.1 is near full at 93%
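The last three lines of the health output carry the key facts: which OSDs are full and which are close to it. As a minimal sketch, assuming the health detail text has been captured into a string, the following Python pulls those lines out and also computes the cluster-wide utilization from the "786 GB used ... / 833 GB avail" figures. The function names and the regular expression are illustrative only and are not part of any Ceph tooling.

```python
import re

# Sample copied from the health detail output in this article.
HEALTH_DETAIL = """\
osd.0 is full at 95%
osd.2 is full at 95%
osd.1 is near full at 93%
"""

# Matches lines such as "osd.1 is near full at 93%".
OSD_LINE = re.compile(r"^osd\.(\d+) is (near full|full) at (\d+)%$")


def parse_osd_fullness(text):
    """Return {osd_id: (state, percent)} for each full / near-full OSD line."""
    result = {}
    for line in text.splitlines():
        match = OSD_LINE.match(line.strip())
        if match:
            osd_id, state, percent = match.groups()
            result[int(osd_id)] = (state, int(percent))
    return result


def used_percent(used_gb, total_gb):
    """Cluster-wide raw utilization as a percentage."""
    return 100.0 * used_gb / total_gb


if __name__ == "__main__":
    print(parse_osd_fullness(HEALTH_DETAIL))
    # Figures taken from the pgmap lines: 786 GB used out of 833 GB.
    print(round(used_percent(786, 833), 1))
```

The cluster-wide figure comes out just above 94%, which is consistent with two of the three OSDs having reached the 95% mark.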


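Solution 1 (verified)

Add OSD nodes. This is also the approach recommended in the official documentation. After the new nodes are added, Ceph starts rebalancing the data and the space used on each OSD begins to fall.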

2015-04-29 06:51:58.623262 osd.1 [WRN] OSD near full (91%) 
2015-04-29 06:52:01.500813 osd.2 [WRN] OSD near full (92%)

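Solution 2 (theoretical, not verified)

If no new disks are available, the only option is a different approach. In the current state Ceph does not allow any read or write operations, so commands that modify data, such as the rbd removal shown earlier, do not work. The solution is to relax Ceph's definition of "full": the log above shows that the full ratio is 95%, so we need to raise the full ratio and then delete data as quickly as possible to bring utilization back down.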
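To get a feel for how much data would have to be deleted once writes are possible again, here is a rough worked estimate in Python. It uses the cluster-wide figures from the status output (786 GB used of 833 GB) and a target of 85%, the default near-full ratio. It is only a sketch: the full state is decided per OSD, so the real amount depends on how data is spread across OSDs, and the figures are raw space that includes replicas. The function name is hypothetical.

```python
def gb_to_free(used_gb, total_gb, target_ratio):
    """Raw GB that must be released so that used/total falls to target_ratio."""
    target_used = total_gb * target_ratio
    # Nothing needs to be freed if usage is already at or below the target.
    return max(0.0, used_gb - target_used)


if __name__ == "__main__":
    # 786 GB used, 833 GB total, aiming for 85% utilization.
    print(round(gb_to_free(786, 833, 0.85), 2))  # prints 77.95
```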

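Setting the value directly with a command was attempted but failed: the Ceph cluster did not start resynchronizing data, so the service itself probably still needs to be restarted.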

ceph mon tell \* injectargs '--mon-osd-full-ratio 0.98'

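The alternative is to modify the configuration file and then restart the monitor service. For fear of causing problems, this method was not tried. It was later confirmed on the mailing list that it should not affect the data, provided that no virtual machine writes any more data to Ceph during recovery.

By default the full ratio is 95% and the near-full ratio is 85%, so this setting needs to be adjusted to suit the actual situation.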

[global] 
    mon osd full ratio = .98 
    mon osd nearfull ratio = .80
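Before restarting a monitor with edited ratios, it can help to sanity-check the values. The sketch below reads the two options from the [global] section with Python's standard configparser and checks that the near-full ratio is below the full ratio and that the full ratio stays below 1.0. It assumes the file is simple enough for configparser to parse; a real ceph.conf may use syntax that configparser rejects. The function names and the two checks are illustrative choices, not rules taken from Ceph itself.

```python
import configparser

# Same settings as the fragment above, written without indentation.
CEPH_CONF = """\
[global]
mon osd full ratio = .98
mon osd nearfull ratio = .80
"""


def read_ratios(conf_text):
    """Return (nearfull, full) as floats from the [global] section."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    section = parser["global"]
    full = section.getfloat("mon osd full ratio")
    nearfull = section.getfloat("mon osd nearfull ratio")
    return nearfull, full


def check_ratios(nearfull, full):
    """Return a list of human-readable problems; empty means the values look sane."""
    problems = []
    if not 0.0 < nearfull < full:
        problems.append("nearfull ratio must be positive and below the full ratio")
    if not full < 1.0:
        problems.append("full ratio must stay below 1.0 to leave some headroom")
    return problems


if __name__ == "__main__":
    nearfull, full = read_ratios(CEPH_CONF)
    print(nearfull, full, check_ratios(nearfull, full))
```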

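Analysis and summary

Cause

According to the official Ceph documentation, once an OSD reaches the full ratio of 95%, the cluster stops accepting read and write requests from Ceph clients. That is why the virtual machines could not start when they were rebooted.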
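The rule described here can be sketched as a small classifier. The thresholds default to the 85% and 95% values cited in this article, and the utilization figures in the usage example are the ones reported by the health output. Whether Ceph compares with "greater than" or "greater than or equal" is an assumption in this sketch, and the function names are hypothetical.

```python
def osd_state(used_ratio, nearfull_ratio=0.85, full_ratio=0.95):
    """Classify one OSD by its utilization, given as a fraction from 0.0 to 1.0."""
    if used_ratio >= full_ratio:
        return "full"
    if used_ratio >= nearfull_ratio:
        return "near full"
    return "ok"


def writes_blocked(osd_ratios, full_ratio=0.95):
    """True if at least one OSD has reached the full ratio."""
    return any(ratio >= full_ratio for ratio in osd_ratios.values())


if __name__ == "__main__":
    # Utilization per OSD as reported in the health detail output.
    utilization = {"osd.0": 0.95, "osd.1": 0.93, "osd.2": 0.95}
    for name, ratio in sorted(utilization.items()):
        print(name, osd_state(ratio))
    print("client I/O blocked:", writes_blocked(utilization))
```

A single OSD crossing the full ratio is enough to trip the check, which matches the situation in this incident, where osd.1 was still only near full.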

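Resolution

Judging from the official recommendations, adding new OSDs is the preferred approach. Temporarily raising the ratio is also a workaround, but it is not recommended: data has to be deleted by hand to resolve the problem, and if another node then fails the cluster will fill up again. The way to solve the problem is therefore to expand capacity.

Reflections

Two points from this incident are worth reflecting on:

Monitoring: DNS was misconfigured when the servers were set up, so monitoring e-mails could not be sent and the Ceph WARN notifications were never received
The cloud platform itself: because of how Ceph works, storage allocated through the OpenStack platform is oversubscribed most of the time. From the user's point of view there was nothing wrong with copying a large amount of data, but because the cloud platform had no corresponding early-warning mechanism, the problem occurred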
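One simple early-warning signal for the second point is the oversubscription ratio: the total size of the provisioned volumes divided by the usable capacity. The sketch below is a hedged illustration only. The 833 GB raw capacity comes from the status output in this article, while the replica count of 3 and the volume sizes are made-up example values, and the function name is hypothetical.

```python
def oversubscription_ratio(provisioned_gb, raw_capacity_gb, replicas):
    """Provisioned volume space divided by usable (raw / replicas) capacity."""
    usable_gb = raw_capacity_gb / replicas
    return sum(provisioned_gb) / usable_gb


if __name__ == "__main__":
    # Hypothetical volume sizes in GB; replicas=3 is an assumption.
    volumes = [100, 200, 500]
    ratio = oversubscription_ratio(volumes, raw_capacity_gb=833, replicas=3)
    print(round(ratio, 2))
```

A ratio above 1.0 means the volumes could not all be filled at once, so a platform in that state needs an alert that fires well before the OSDs reach the near-full ratio.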

References

http://ceph.com/docs/master/rados/configuration/mon-config-ref/#storage-capacity

Original blog post: http://blog.csdn.net/xiaoquqi/article/details/45539847

Editor: Ophira. Source: RaySun's blog

