Wednesday, August 24, 2011

Linux DRBD + Heartbeat High Availability

This guide describes a DRBD setup with a primary and a secondary node, including a virtual IP (floating IP).
The floating IP used here is 192.168.1.200.

Node1 (Primary node)

eth0 192.168.1.201

eth1 10.0.0.201 (For drbd)

Node2 (Secondary node)

eth0 192.168.1.202

eth1 10.0.0.202 (For drbd)

Add the node details to the /etc/hosts file on both nodes:

10.0.0.201 node1

10.0.0.202 node2
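Since DRBD resolves the "on node1" and "on node2" sections by host name, it can be worth confirming that both entries really are in place on each node. The check below is only a sketch: the function name has_host_entry is made up for this example, and the file is passed as an argument so it can be tried on a copy first.

```shell
# has_host_entry FILE IP NAME
# Succeeds if FILE has a non-comment line that maps IP to NAME.
# (Hypothetical helper, not part of DRBD or heartbeat.)
has_host_entry() {
    awk -v ip="$2" -v name="$3" '
        /^[ \t]*#/ { next }                   # skip comment lines
        $1 == ip {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break          # stop at a trailing comment
                if ($i == name) found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example: report any missing entry on this node.
# has_host_entry /etc/hosts 10.0.0.201 node1 || echo "missing: node1"
# has_host_entry /etc/hosts 10.0.0.202 node2 || echo "missing: node2"
```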


Install the packages on both nodes:

yum install drbd83 kmod-drbd83 heartbeat heartbeat-devel heartbeat-ldirectord

Configure the /etc/drbd.conf file. "data" is the resource name; you can use whatever name you prefer. Put the same file on both nodes:

global { usage-count no; }

resource data {

protocol C;

startup { wfc-timeout 0; degr-wfc-timeout 120; }

disk { on-io-error detach; } # or panic, ...

net { cram-hmac-alg "sha1"; shared-secret "Cent0Sru!3z"; } # don't forget to choose a secret for auth !

syncer { rate 10M; }

on node1 {

device /dev/drbd0;

disk /dev/sda3;

address 10.0.0.201:7788;

meta-disk internal;

}

on node2 {

device /dev/drbd0;

disk /dev/sda3;

address 10.0.0.202:7788;

meta-disk internal;

}

}

Initialize the meta-data area on disk before starting drbd. Run this command on both nodes:

drbdadm create-md data

The output will look like this:

md_offset 419483648

al_offset 419450880

bm_offset 419434496

Found ext3 filesystem

409601 kB data area apparently used

409604 kB left usable by current configuration

Even though it looks like this would place the new meta data into

unused space, you still need to confirm, as this is only a guess.

Do you want to proceed?

[need to type 'yes' to confirm] yes

Writing meta data...

initializing activity log

NOT initialized bitmap

New drbd meta data block successfully created.

The meta data is now created.

Now start the drbd service on both nodes:

service drbd start

Problems and solutions

If your /dev/sda... (your device file) is already mounted, DRBD cannot claim it and the service fails to start with the error shown below. Make sure you unmount the device and comment out its entry in /etc/fstab. Otherwise the device will be checked for errors when the machine reboots, and sometimes the machine will not boot.

Starting DRBD resources: [

data

Found valid meta data in the expected location, 419483648 bytes into /dev/sda3.

d(data) 0: Failure: (114) Lower device is already claimed. This usually means it is mounted.

[data] cmd /sbin/drbdsetup 0 disk /dev/sda3 /dev/sda3 internal --set-defaults --create-device --on-io-error=detach failed - continuing!

s(data) n(data) ].

umount /dev/sda3

Also comment out the entry for this disk in /etc/fstab.
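The two checks above (is the device mounted, and is it still listed in fstab) can be sketched as small helpers. The function names is_mounted and comment_out_fstab are made up for this example, and the sketch assumes fstab refers to the disk by its device path rather than by LABEL= or UUID=. The second helper only prints the edited file, so nothing is changed until you review the result and copy it into place yourself.

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds if DEVICE is listed as a mounted source in MOUNTS_FILE
# (default /proc/mounts). Hypothetical helper for illustration.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit (found ? 0 : 1) }' \
        "${2:-/proc/mounts}"
}

# comment_out_fstab DEVICE FSTAB_FILE
# Prints FSTAB_FILE with every active line for DEVICE prefixed by "#".
# The original file is left untouched.
comment_out_fstab() {
    awk -v dev="$1" '$1 == dev { print "#" $0; next } { print }' "$2"
}

# Example:
# is_mounted /dev/sda3 && umount /dev/sda3
# comment_out_fstab /dev/sda3 /etc/fstab > /tmp/fstab.new
```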

Now try to start the drbd service again. If it still gives you errors, reboot both nodes. After the reboot completes, check the DRBD status:

[root@node1 ~]# cat /proc/drbd

version: 8.3.8 (api:88/proto:86-94)

GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16

0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----

ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:409604
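In this output, cs is the connection state, ro shows the roles (local/peer) and ds shows the disk states (local/peer). If you want to read one of these fields from a script, something like the following sketch could work. The name drbd_field is made up for this example, and it assumes a single resource, as in this setup.

```shell
# drbd_field KEY [STATUS_FILE]
# Prints the value of KEY (cs, ro or ds) from the first line of
# /proc/drbd-style text that contains it.
# Hypothetical helper, not a DRBD tool.
drbd_field() {
    sed -n "s/.*[[:space:]]$1:\([^[:space:]]*\).*/\1/p" "${2:-/proc/drbd}" | head -n 1
}

# Example:
# drbd_field cs    # connection state
# drbd_field ro    # roles, local/peer
# drbd_field ds    # disk states, local/peer
```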

Run this on the node you choose as primary, in this case node1:

drbdadm -- --overwrite-data-of-peer primary data

Watch the device sync:

watch cat /proc/drbd
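If you would rather have a script wait for the sync than watch it by hand, a polling loop is one option. This is a sketch only: wait_for_sync is a made-up name, the defaults (3600 checks, 1 second apart) are arbitrary, and it assumes a single resource so that one ds:UpToDate/UpToDate match means both sides are in sync.

```shell
# wait_for_sync [STATUS_FILE] [MAX_TRIES] [DELAY_SECONDS]
# Polls STATUS_FILE (default /proc/drbd) until it reports
# ds:UpToDate/UpToDate, or gives up after MAX_TRIES checks.
# The body runs in a subshell, so its variables stay private.
wait_for_sync() (
    file=${1:-/proc/drbd}
    tries=${2:-3600}
    delay=${3:-1}
    while [ "$tries" -gt 0 ]; do
        if grep -q 'ds:UpToDate/UpToDate' "$file"; then
            exit 0
        fi
        tries=$((tries - 1))
        # Skip the final sleep once no attempts are left.
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    exit 1
)

# Example:
# wait_for_sync && echo "sync complete"
```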

Once the sync is complete, create a filesystem on /dev/drbd0 and mount it:

mkfs.ext3 /dev/drbd0

mke2fs 1.39 (29-May-2006)

warning: 3 blocks unused.

Filesystem label=

OS type: Linux

Block size=1024 (log=0)

Fragment size=1024 (log=0)

102800 inodes, 409601 blocks

20480 blocks (5.00%) reserved for the super user

First data block=1

Maximum filesystem blocks=67633152

50 block groups

8192 blocks per group, 8192 fragments per group

2056 inodes per group

Superblock backups stored on blocks:

8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409

Writing inode tables: done

Creating journal (8192 blocks): done

Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 25 mounts or

180 days, whichever comes first. Use tune2fs -c or -i to override.

mkdir /u

mount /dev/drbd0 /u

You can create some test files to check that it works:

for i in {1..5};do dd if=/dev/zero of=/u/file$i bs=1M count=10;done

Before testing on node2, unmount /u and set the resource to secondary on node1:

umount /u

drbdadm secondary data

Now you can see that node1 is also Secondary:

watch -n 1 cat /proc/drbd

Every 1.0s: cat /proc/drbd Tue Aug 23 11:21:49 2011

version: 8.3.8 (api:88/proto:86-94)

GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16

0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r----

ns:482866 nr:0 dw:73262 dr:424875 al:60 bm:30 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

Now on node2, set the resource to primary, create the /u folder and mount /dev/drbd0:

drbdadm primary data

mkdir /u

mount /dev/drbd0 /u

Delete the files that were created previously and add new files.

Unmount /u on node2 and set the resource to secondary.

Go to node1, set it as primary and mount the device. You should see all the files that were created from node2.
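The manual switch-over above always follows the same order: unmount and demote on the old primary, then promote and mount on the new one. A small wrapper can make that order explicit. This is a sketch only: run, demote and promote are made-up names, and DRY_RUN is a variable invented for this example that makes the helpers print the commands instead of executing them.

```shell
# run CMD...: execute the command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# demote RESOURCE MOUNTPOINT: release the filesystem, then step down.
demote() {
    run umount "$2" && run drbdadm secondary "$1"
}

# promote RESOURCE DEVICE MOUNTPOINT: take over, then mount.
promote() {
    run drbdadm primary "$1" && run mount "$2" "$3"
}

# Example (on the old primary, then on the new primary):
# demote data /u
# promote data /dev/drbd0 /u
```

Setting DRY_RUN=1 first lets you read the commands before anything is executed.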

DRBD is now working.

Add it to startup on both nodes:

chkconfig drbd on

Configure heartbeat.

On node1, go to /etc/ha.d. The files created here are needed on node2 as well, with identical authkeys and haresources.

touch ha.cf

Add the following:

logfile /var/log/ha-log

logfacility local0

keepalive 2

deadtime 30

initdead 120

bcast eth0

udpport 694

auto_failback on

node node1

node node2

crm no
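In this file, keepalive is the interval between heartbeats in seconds, deadtime is how long a node may stay silent before it is declared dead, and initdead is the longer grace period used right after startup. To see how many heartbeats can be missed before a failover, you can divide deadtime by keepalive. The sketch below does that; ha_value and missed_beats are made-up names, and it assumes both values are plain whole seconds, as above.

```shell
# ha_value FILE KEY
# Prints the first value of directive KEY in an ha.cf-style file.
# Hypothetical helper, not part of heartbeat.
ha_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# missed_beats FILE
# Prints how many keepalive intervals fit into deadtime.
missed_beats() {
    echo $(( $(ha_value "$1" deadtime) / $(ha_value "$1" keepalive) ))
}

# Example:
# missed_beats /etc/ha.d/ha.cf
```

With the values above (deadtime 30, keepalive 2), a node is declared dead after 15 missed heartbeats.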

touch authkeys

Add the following:

auth 1

1 sha1 MySecret

Set the permissions on authkeys:

chmod 600 authkeys
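"MySecret" is only a placeholder. One way to get a stronger key is to hash some random bytes, as in the sketch below. The name write_authkeys is made up for this example, and it assumes sha1sum and /dev/urandom are available, which is normally the case on Linux.

```shell
# write_authkeys PATH
# Writes a heartbeat authkeys file with a random sha1 key.
# The body runs in a subshell, so the umask change stays local.
write_authkeys() (
    umask 077    # create the file readable by its owner only
    secret=$(dd if=/dev/urandom bs=512 count=4 2>/dev/null | sha1sum | cut -d' ' -f1)
    printf 'auth 1\n1 sha1 %s\n' "$secret" > "$1"
    chmod 600 "$1"
)

# Example:
# write_authkeys /etc/ha.d/authkeys
```

Generate the file once and copy it to the other node, since both nodes must use the same key.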

touch haresources

In haresources, add the primary node, the floating IP and the services that heartbeat should manage (here httpd and smb). 192.168.1.200 is the floating IP.

node1 IPaddr::192.168.1.200 drbddisk::data Filesystem::/dev/drbd0::/u::ext3 httpd smb
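The haresources line is dense, so it can help to split it up. The first word is the preferred node, and each following word is a resource script whose arguments are separated by "::". Heartbeat starts the resources from left to right and stops them in the reverse order, which is why the DRBD resource comes before the filesystem and the services come last. The sketch below only prints that breakdown; explain_haresources is a made-up name and not a heartbeat tool.

```shell
# explain_haresources
# Reads haresources lines on stdin and prints the preferred node
# plus each resource with its "::"-separated arguments.
explain_haresources() {
    awk '
        /^[ \t]*(#|$)/ { next }               # skip comments and blank lines
        {
            print "preferred node: " $1
            for (i = 2; i <= NF; i++) {
                n = split($i, part, "::")
                line = "resource: " part[1]
                for (j = 2; j <= n; j++)
                    line = line " arg" (j - 1) "=" part[j]
                print line
            }
        }
    '
}

# Example:
# explain_haresources < /etc/ha.d/haresources
```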

Disable the httpd and smb services at boot on both nodes, because heartbeat will start them and handle the failover.

chkconfig httpd off

chkconfig smb off

Add the heartbeat service to startup on both nodes.

chkconfig heartbeat on

Any issue with split brain?

In a split-brain situation each node takes one of two roles:

1. Split brain victim (its changes are discarded)

2. Split brain survivor (its data is kept)

Select the node that will be the victim and type:

drbdadm secondary data      # this node now acts as the victim

The easy way to resolve the split brain is to reconnect the victim while discarding its local data. The "--" passes the option directly to drbdsetup:

drbdadm -- --discard-my-data connect data

Now go to the other node (the split brain survivor) and type:

drbdadm connect data

Now check /proc/drbd on both nodes; the status should return to normal.
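Because running the victim commands on the wrong node throws away the data you wanted to keep, it can help to have the two command sets spelled out per role. The sketch below only prints them and runs nothing; recovery_commands is a made-up name for this example.

```shell
# recovery_commands ROLE RESOURCE
# Prints the commands for one side of a manual split-brain recovery.
# ROLE is "victim" (its changes are discarded) or "survivor".
recovery_commands() {
    case "$1" in
        victim)
            echo "drbdadm secondary $2"
            echo "drbdadm -- --discard-my-data connect $2"
            ;;
        survivor)
            echo "drbdadm connect $2"
            ;;
        *)
            echo "usage: recovery_commands victim|survivor RESOURCE" >&2
            return 1
            ;;
    esac
}

# Example:
# recovery_commands victim data      # on the node whose data is dropped
# recovery_commands survivor data    # on the node whose data is kept
```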



Resources

http://wiki.centos.org/HowTos/Ha-Drbd

http://www.drbd.org/users-guide/ch-rhcs.html

http://supportex.net/2011/07/drbd-split-brain-solution-primaryprimary-setup/

http://almamunbd.wordpress.com/2009/05/28/how-to-configure-mysql-high-availability-with-drbd-and-heartbeat/

