Deploying DRBD + Heartbeat + MySQL on CentOS 5.6 x86
Original article [51CTO exclusive feature]. The error message looks like this:
- Operation refused.
- Command 'drbdmeta 0 v08 /dev/sda2 internal create-md' terminated with exit code 40
- drbdadm create-md r0: exited with code 40
At this point we need to use dd to overwrite the block information at the start of the device, as follows:
- dd if=/dev/zero of=/dev/sda2 bs=1M count=128
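Be very careful about which partition you point dd at here. Get it wrong and it is easy to wreck the system; I hit exactly this problem in my first experiment. The partition used for DRBD should also not be written into /etc/fstab, that is, it should not be set up to mount when the system is installed; otherwise the system can easily drop to an Emergency prompt on reboot. For these reasons, I recommend using a separate disk as the DRBD device.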
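Because a mistyped device name is destructive here, one option is to guard the dd call. The sketch below is a hypothetical helper (safe_to_wipe is an illustrative name, not part of DRBD or CentOS) that refuses a device if it appears in the mounts table or in fstab. It only compares literal device paths, so fstab entries written as LABEL= or UUID= are not detected.

```shell
#!/bin/sh
# Hypothetical guard: succeed only if DEVICE is neither mounted nor in fstab.
# $1 = device path, $2 = mounts table (default /proc/mounts),
# $3 = fstab file (default /etc/fstab).
safe_to_wipe() {
    dev=$1
    mounts=${2:-/proc/mounts}
    fstab=${3:-/etc/fstab}
    # The first field of each line in both files is the device.
    if awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"; then
        echo "refusing: $dev is currently mounted" >&2
        return 1
    fi
    if awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$fstab"; then
        echo "refusing: $dev is listed in $fstab" >&2
        return 1
    fi
    return 0
}

# Usage (not executed here):
#   safe_to_wipe /dev/sda2 && dd if=/dev/zero of=/dev/sda2 bs=1M count=128
```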
The two machines are set up as follows:
centos1.cn7788.com 192.168.11.32
centos2.cn7788.com 192.168.11.33
The Heartbeat VIP is 192.168.11.30
The hosts file on both machines contains the following:
192.168.11.32 centos1.cn7788.com centos1
192.168.11.33 centos2.cn7788.com centos2
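For the experiment, the gigabit switch network serves as the heartbeat link for now; once the setup works, consider adding a dedicated twisted-pair cable as the heartbeat link. The hostnames and NTP time sync on both machines should be configured before starting, and iptables and SELinux are turned off; the details are skipped here.
I. Deploying and installing DRBD
Install the DRBD packages on both machines with the following commands: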
- yum -y install drbd83 kmod-drbd83
- modprobe drbd
- lsmod | grep drbd
Output similar to the following means the drbd kernel module is loaded and DRBD was installed successfully:
- drbd 300440 4
The drbd.conf configuration file on both machines is shown below (the configuration is identical on both):
- cat /etc/drbd.conf
- global {
- # minor-count dialog-refresh disable-ip-verification
- usage-count no; # do not take part in DRBD's usage statistics
- }
- common {
- syncer { rate 30M; } # sync rate; depends on the available bandwidth
- }
- resource r0 { # define a resource named "r0"
- protocol C; # DRBD protocol C (synchronous replication: a write is confirmed only after the peer has received it and written it to disk)
- handlers { # handler scripts shipped with DRBD
- pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
- pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
- local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
- # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
- # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
- # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
- # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
- # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
- }
- startup {
- # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
- wfc-timeout 120;
- degr-wfc-timeout 120;
- }
- disk {
- # on-io-error fencing use-bmbv no-disk-barrier no-disk-flushes
- # no-disk-drain no-md-flushes max-bio-bvecs
- on-io-error detach;
- }
- net {
- # sndbuf-size rcvbuf-size timeout connect-int ping-int ping-timeout max-buffers
- # max-epoch-size ko-count allow-two-primaries cram-hmac-alg shared-secret
- # after-sb-0pri after-sb-1pri after-sb-2pri data-integrity-alg no-tcp-cork
- max-buffers 2048;
- cram-hmac-alg "sha1";
- shared-secret "123456";
- # authentication algorithm and shared secret used between the DRBD peers
- #allow-two-primaries;
- }
- syncer {
- rate 100M; # set at resource level, so it takes precedence over the 30M in common
- # rate after al-extents use-rle cpu-mask verify-alg csums-alg
- }
- on centos1.cn7788.com { # one section per node, named after that node's hostname
- device /dev/drbd0; # the DRBD device /dev/drbd0, backed by the physical device /dev/sdb
- disk /dev/sdb;
- address 192.168.11.32:7788; # listen address and port
- meta-disk internal;
- }
- on centos2.cn7788.com { # one section per node, named after that node's hostname
- device /dev/drbd0; # the DRBD device /dev/drbd0, backed by the physical device /dev/sdb
- disk /dev/sdb;
- address 192.168.11.33:7788; # listen address and port
- meta-disk internal; # internal: the DRBD metadata lives on the same backing device as the data
- }
- }
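DRBD picks the on section whose name equals the local output of uname -n, so a hostname mismatch is a common reason for drbdadm to fail. As a rough aid, the hypothetical helper below (drbd_nodes is an illustrative name) lists the host and address of each on section in a drbd.conf-style file. It is a simple line-based scan under the assumption that on and address each start their own line, not a real parser.

```shell
#!/bin/sh
# Hypothetical helper: print "hostname address:port" for every
# `on <host> { ... }` section found in the file given as $1.
drbd_nodes() {
    awk '
        $1 == "on" { host = $2 }              # remember the section name
        $1 == "address" && host != "" {
            addr = $2
            sub(/;.*/, "", addr)              # drop the trailing semicolon
            print host, addr
            host = ""
        }
    ' "$1"
}

# Usage (not executed here): confirm this machine has a matching section.
#   drbd_nodes /etc/drbd.conf | grep -q "^$(uname -n) " || echo "no section for this host"
```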
1. Create the DRBD metadata by running the following command (this step is needed on both machines):
- [root@centos1 ~]# drbdadm create-md r0
- md_offset 8589930496
- al_offset 8589897728
- bm_offset 8589635584
- Found some data
- ==> This might destroy existing data! <==
- Do you want to proceed?
- [need to type 'yes' to confirm] yes
- Writing meta data...
- initializing activity log
- NOT initialized bitmap
- New drbd meta data block successfully created.
- [root@centos2 ~]# drbdadm create-md r0
- md_offset 8589930496
- al_offset 8589897728
- bm_offset 8589635584
- Found some data
- ==> This might destroy existing data! <==
- Do you want to proceed?
- [need to type 'yes' to confirm] yes
- Writing meta data...
- initializing activity log
- NOT initialized bitmap
- New drbd meta data block successfully created.
2. Start the DRBD device by running the following command on each of the two machines:
- service drbd start
3. On centos1, check the DRBD status with the following command:
- [root@centos1 ~]# service drbd status
- drbd driver loaded OK; device status:
- version: 8.3.13 (api:88/proto:86-96)
- GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by mockbuild@builder10.centos.org, 2012-05-07 11:56:36
- m:res cs ro ds p mounted fstype
- 0:r0 Connected Secondary/Secondary Inconsistent/Inconsistent C
4. Make centos1 the DRBD Primary with the following commands:
- [root@centos1 ~]# drbdsetup /dev/drbd0 primary -o
- [root@centos1 ~]# drbdadm primary r0
Then check the status again:
- [root@centos1 ~]# service drbd status
- drbd driver loaded OK; device status:
- version: 8.3.13 (api:88/proto:86-96)
- GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by mockbuild@builder10.centos.org, 2012-05-07 11:56:36
- m:res cs ro ds p mounted fstype
- ... sync'ed: 3.9% (7876/8188)M
- 0:r0 SyncSource Primary/Secondary UpToDate/Inconsistent C
The Primary/Secondary relationship is now established and the data is syncing, with 3.9% done so far. Wait a while, then check the DRBD status on the Primary again:
- [root@centos1 ~]# service drbd status
- drbd driver loaded OK; device status:
- version: 8.3.13 (api:88/proto:86-96)
- GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by mockbuild@builder10.centos.org, 2012-05-07 11:56:36
- m:res cs ro ds p mounted fstype
- 0:r0 Connected Primary/Secondary UpToDate/UpToDate C
UpToDate/UpToDate means the data sync has completed.
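When scripting the setup, the same check can be automated. The following is a minimal sketch, assuming the status output has the column layout shown above (res, cs, ro, ds); drbd_synced is a hypothetical function name, and it reads the status text from standard input so it can be tried on saved output.

```shell
#!/bin/sh
# Hypothetical helper: succeed if resource $1 is Connected and both sides
# report UpToDate. Expects `service drbd status` output on stdin.
drbd_synced() {
    awk -v res="$1" '
        # Match a line such as: 0:r0 Connected Primary/Secondary UpToDate/UpToDate C
        $1 ~ ("^[0-9]+:" res "$") {
            if ($2 == "Connected" && $4 == "UpToDate/UpToDate") ok = 1
        }
        END { exit !ok }
    '
}

# Usage (not executed here): poll until the initial sync has finished.
#   until service drbd status | drbd_synced r0; do sleep 10; done
```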
5. Create the /drbd directory on both machines; it will serve as the mount point for MySQL:
- mkdir /drbd
6. Format the DRBD device on the Primary and mount it.
- mkfs.ext3 /dev/drbd0
- mount /dev/drbd0 /drbd/
Note: no operation of any kind is allowed on the DRBD device on the Secondary node, not even a read-only mount. All reads and writes happen on the Primary node; only when the Primary goes down and the Secondary takes over as Primary can it read and write the device.
7. Enable the DRBD service at boot on both machines:
- chkconfig drbd on
II. Installing and deploying Heartbeat
1. Install heartbeat with yum on both machines; run the following command twice:
- yum -y install heartbeat
If you run it only once, you will be surprised to find that heartbeat was not actually installed the first time.
2. The Heartbeat configuration file (/etc/ha.d/ha.cf) on each of the two nodes is as follows:
Configuration file on centos1.cn7788.com:
- logfile /var/log/ha-log
- # name and location of the Heartbeat log
- logfacility local0
- keepalive 2
- # send a heartbeat every 2 seconds
- deadtime 15
- # declare the peer dead after 15 seconds without a heartbeat
- ucast eth0 192.168.11.33
- # use unicast; the IP address is the peer's IP
- auto_failback off
- # after failing over to the Secondary, do not switch back when the Primary recovers
- node centos1.cn7788.com centos2.cn7788.com
Configuration file on centos2.cn7788.com:
- logfile /var/log/ha-log
- logfacility local0
- keepalive 2
- deadtime 15
- ucast eth0 192.168.11.32
- auto_failback off
- node centos1.cn7788.com centos2.cn7788.com
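The two files differ only in the ucast line, which always points at the other node. To keep them from drifting apart, the settings can be generated from one template; the sketch below uses a hypothetical function name (make_ha_cf) and simply prints the configuration above with the peer IP filled in.

```shell
#!/bin/sh
# Hypothetical generator: print the ha.cf shown above for a node whose
# peer has the IP address given as $1.
make_ha_cf() {
    peer_ip=$1
    cat <<EOF
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 15
ucast eth0 $peer_ip
auto_failback off
node centos1.cn7788.com centos2.cn7788.com
EOF
}

# Usage (not executed here):
#   on centos1: make_ha_cf 192.168.11.33 > /etc/ha.d/ha.cf
#   on centos2: make_ha_cf 192.168.11.32 > /etc/ha.d/ha.cf
```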
3. Edit authkeys, the file the two nodes use to authenticate each other:
- cat /etc/ha.d/authkeys
- auth 1
- 1 crc
This file must have mode 600, otherwise the heartbeat service reports an error on startup:
- chmod 600 /etc/ha.d/authkeys
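The crc method only detects corrupted packets and provides no real authentication, which is acceptable on a lab network. Heartbeat's authkeys also accepts sha1 with a shared key. As a sketch, the hypothetical helper below (make_authkeys is an illustrative name) prints such a file with a random 32-character hex key; the resulting file has to be identical on both nodes, so generate it once and copy it over.

```shell
#!/bin/sh
# Hypothetical generator: print an authkeys file that uses sha1 with a
# random key (16 bytes from /dev/urandom, hex-encoded).
make_authkeys() {
    key=$(dd if=/dev/urandom bs=16 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
    printf 'auth 1\n1 sha1 %s\n' "$key"
}

# Usage (not executed here): the umask keeps the file at mode 600.
#   (umask 077; make_authkeys > /etc/ha.d/authkeys)
```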
4. Edit the cluster resource file /etc/ha.d/haresources
- centos1.cn7788.com IPaddr::192.168.11.30/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/drbd::ext3 mysqld
This file is identical on both machines, and it should not be changed casually.
mysqld is the script that starts, restarts, and stops the MySQL server. It ships with MySQL, and we come back to it in the MySQL installation below.
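The first field of the haresources line is the preferred node, and it needs to be one of the names on the node line of ha.cf. A quick consistency check can catch a mistyped hostname before Heartbeat is started; the sketch below assumes both files use the layouts shown above, and check_haresources is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical check: every owner (first field) in the haresources file $1
# must appear in a `node` directive of the ha.cf file $2.
check_haresources() {
    res=$1
    hacf=$2
    # Collect all names listed after `node` in ha.cf, one per line.
    nodes=$(awk '$1 == "node" { for (i = 2; i <= NF; i++) print $i }' "$hacf")
    # First field of each non-blank, non-comment line is the preferred node.
    for owner in $(awk '$1 !~ /^#/ && NF > 0 { print $1 }' "$res"); do
        printf '%s\n' "$nodes" | grep -Fxq "$owner" || {
            echo "haresources names unknown node: $owner" >&2
            return 1
        }
    done
    return 0
}

# Usage (not executed here):
#   check_haresources /etc/ha.d/haresources /etc/ha.d/ha.cf
```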
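III. Compiling and installing mysql5.1.47 from source and deploying haresources
1. Download the mysql5.1.47 source package from the official MySQL website and install it on both machines, with the following steps:
Install gcc and the other base libraries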
- yum install gcc gcc-c++ zlib-devel libtool ncurses-devel libxml2-devel
Create the mysql user and group
- groupadd mysql
- useradd -g mysql mysql
Compile and install mysql5.1.47 from source
- tar zxvf mysql-5.1.47.tar.gz
- cd mysql-5.1.47
- ./configure --prefix=/usr/local/mysql --with-charset=utf8 --with-extra-charsets=all --enable-thread-safe-client --enable-assembler --with-readline --with-big-tables --with-plugins=all --with-mysqld-ldflags=-all-static --with-client-ldflags=-all-static
- make
- make install
2. Set up the configuration files and permissions for mysql so that it can start cleanly.
- cd /usr/local/mysql
- cp /usr/local/mysql/share/mysql/my-medium.cnf /etc/my.cnf
- cp /usr/local/mysql/share/mysql/mysql.server /etc/init.d/mysqld
- cp /usr/local/mysql/share/mysql/mysql.server /etc/ha.d/resource.d/mysqld
- chmod +x /etc/init.d/mysqld
- chmod +x /etc/ha.d/resource.d/mysqld
- chown -R mysql:mysql /usr/local/mysql
3. On both machines, set the data directory MySQL uses at runtime under the [mysqld] section of /etc/my.cnf
- datadir=/drbd/data
4. On the Primary, run the following command to populate the database directory; the Secondary does not need this step.
- /usr/local/mysql/bin/mysql_install_db --user=mysql --datadir=/drbd/data
Note: this is a key step in the whole setup, and I got it wrong several times at first. MySQL must only be run after the DRBD device is up, that is, after /dev/drbd0 has been correctly mounted on /drbd. Starting MySQL without the mount in place makes the whole experiment fail, so take care here. After this step there is no need to start MySQL by hand, since the Heartbeat resource script starts it; if MySQL is already running, stop it manually.
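One way to avoid initializing the data directory on the wrong filesystem is to test for the mount first. The following is a minimal sketch with a hypothetical helper name (is_mounted); it looks for the mount point in the second column of the mounts table, where the path is recorded without a trailing slash.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 is a mount point according
# to the mounts table $2 (default /proc/mounts).
is_mounted() {
    awk -v m="$1" '$2 == m { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Usage (not executed here): only initialize the data directory on DRBD.
#   is_mounted /drbd && /usr/local/mysql/bin/mysql_install_db --user=mysql --datadir=/drbd/data
```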
IV. Start DRBD and Heartbeat on both machines and enable both at boot
- service drbd start
- chkconfig drbd on
- service heartbeat start
- chkconfig heartbeat on
Looking at the Primary, we can see that it has started MySQL and Heartbeat correctly:
- [root@centos1 data]# ip addr
- 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
- link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
- inet 127.0.0.1/8 scope host lo
- 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
- link/ether 00:0c:29:48:2e:9f brd ff:ff:ff:ff:ff:ff
- inet 192.168.11.32/24 brd 192.168.11.255 scope global eth0
- inet 192.168.11.30/24 brd 192.168.11.255 scope global secondary eth0:0
Port 3306 is in use, which tells us the mysql service is running normally.
- [root@centos1 data]# lsof -i:3306
- COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
- mysqld 4341 mysql 18u IPv4 9807 TCP *:mysql (LISTEN)
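V. The remaining work is fairly easy to test. We mainly simulate the Primary rebooting or crashing and check whether the Secondary takes over automatically and starts MySQL. After rebooting the Primary, watch the Heartbeat log on the Secondary, which shows the following: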
- IPaddr[3050]: 2012/09/04_09:51:24 INFO: Resource is stopped
- ResourceManager[3023]: 2012/09/04_09:51:24 info: Running /etc/ha.d/resource.d/IPaddr 192.168.11.30/24/eth0 start
- IPaddr[3149]: 2012/09/04_09:51:25 INFO: Using calculated netmask for 192.168.11.30: 255.255.255.0
- IPaddr[3149]: 2012/09/04_09:51:26 INFO: eval ifconfig eth0:0 192.168.11.30 netmask 255.255.255.0 broadcast 192.168.11.255
- IPaddr[3119]: 2012/09/04_09:51:26 INFO: Success
- heartbeat[2842]: 2012/09/04_09:51:26 WARN: Late heartbeat: Node centos1.rogrand.com: interval 3510 ms
- ResourceManager[3023]: 2012/09/04_09:51:26 info: Running /etc/ha.d/resource.d/drbddisk r0 start
- Filesystem[3300]: 2012/09/04_09:51:27 INFO: Resource is stopped
- ResourceManager[3023]: 2012/09/04_09:51:27 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /drbd ext3 start
- Filesystem[3382]: 2012/09/04_09:51:28 INFO: Running start for /dev/drbd0 on /drbd
- Filesystem[3370]: 2012/09/04_09:51:28 INFO: Success
- ResourceManager[3023]: 2012/09/04_09:51:29 info: Running /etc/ha.d/resource.d/mysqld start
- mach_down[2997]: 2012/09/04_09:51:31 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
- mach_down[2997]: 2012/09/04_09:51:31 info: mach_down takeover complete for node centos1.rogrand.com.
- heartbeat[2842]: 2012/09/04_09:51:31 info: mach_down takeover complete.
- heartbeat[2842]: 2012/09/04_09:51:32 WARN: node centos1.rogrand.com: is dead
- heartbeat[2842]: 2012/09/04_09:51:32 info: Dead node centos1.rogrand.com gave up resources.
- heartbeat[2842]: 2012/09/04_09:51:35 info: Link centos1.rogrand.com:eth0 dead.
After a short wait, the Secondary automatically takes over the VIP and starts the MySQL service, as shown below:
- [root@centos2 data]# ip addr
- 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
- link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
- inet 127.0.0.1/8 scope host lo
- 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
- link/ether 00:0c:29:34:ee:af brd ff:ff:ff:ff:ff:ff
- inet 192.168.11.33/24 brd 192.168.11.255 scope global eth0
- inet 192.168.11.30/24 brd 192.168.11.255 scope global secondary eth0:0
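For repeated failover tests it can help to check for the VIP from a script instead of reading the output by eye. The sketch below assumes the ip addr output format shown above; has_vip is a hypothetical function name, and it reads the output from standard input.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the address $1 appears on an `inet` line
# of `ip addr` output supplied on stdin.
has_vip() {
    awk -v vip="$1" '
        $1 == "inet" {
            split($2, parts, "/")             # strip the /prefix length
            if (parts[1] == vip) found = 1
        }
        END { exit !found }
    '
}

# Usage (not executed here):
#   ip addr | has_vip 192.168.11.30 && echo "this node holds the VIP"
```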
The following points need attention across the whole process:
I. The disk used for DRBD on the Secondary host may differ in size from the one on the Primary host, but it should not be smaller, to avoid data loss;
II. I recommend gigabit NICs and switches. In testing, the sync rate was between 100M and 200M; following the official recommendation, rate is set to 30% of the lower figure, that is 100M*30%. Adjust this value to your own network environment;
III. DRBD is very demanding on the network. A dedicated twisted-pair cable between the two hosts is recommended as the heartbeat link, and two or more heartbeat links if conditions allow; if this part is done well, split brain is basically not a problem. The whole experiment can start out on a single shared network, and adding the heartbeat link later also works.
IV. Heartbeat needs to be installed twice, that is, yum -y install heartbeat must be run twice;
V. Do not use the top-level directory of a mounted partition as MySQL's datadir; otherwise show databases lists a database named #mysql50#lost+found. This is why I set the MySQL data directory to /drbd/data.
VI. Even if split brain does occur, DRBD does not lose data; it just has to be resolved by hand. Because DRBD is reliable, MySQL also recommends it as one of the high-availability options for MySQL.
VII. This DRBD setup for MySQL cannot fail over in milliseconds. MyISAM tables need a long repair time after a crash and may also end up corrupted, so I recommend converting every table other than the system tables to the InnoDB engine.
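The arithmetic behind note II can be written down as a one-liner. This is only a sketch: sync_rate is a hypothetical name, the input is assumed to be the usable replication bandwidth as a whole number of megabytes per second, and the 30% factor is the rule of thumb quoted above.

```shell
#!/bin/sh
# Hypothetical helper: print 30% of the bandwidth $1 (integer, MB/s) in the
# form used by the syncer rate setting, for example "30M".
sync_rate() {
    echo "$(( $1 * 30 / 100 ))M"
}

# Usage: the value measured in the article.
sync_rate 100
```

The printed value goes into the rate setting of the syncer section in drbd.conf.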