Preface
This post is mainly a set of notes on using corosync + pacemaker to build and round out high availability for DRBD, and on top of that to run a highly available MFS file system. The MFS and DRBD setups themselves were covered in my previous two posts:
MFS setup: http://bluesrer.blog.51cto.com/10918251/1977549
DRBD setup: http://bluesrer.blog.51cto.com/10918251/1977153
One more point worth spelling out: mount the DRBD disk on the directory where MFS is going to be installed, then install MFS onto that disk. That way, whichever of the two DRBD servers currently has the disk mounted can start the mfsmaster service directly. On both DRBD servers you also need to write a startup unit file and place it under /etc/systemd/system, so that systemd manages the service:
[root@node4 mfs]# cat /etc/systemd/system/mfsmaster.service
[Unit]
Description=mfs
After=network.target

[Service]
Type=forking
ExecStart=/usr/local/mfs/sbin/mfsmaster start
ExecStop=/usr/local/mfs/sbin/mfsmaster stop
PrivateTmp=true

[Install]
WantedBy=multi-user.target
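After placing the unit file, systemd must re-read its unit files before the new service can be used. A quick sanity check on the node that currently holds the DRBD mount (stop the service again afterwards, since pacemaker will take it over later):

[root@node4 mfs]# systemctl daemon-reload
## start and stop once to confirm the unit actually works
[root@node4 mfs]# systemctl start mfsmaster
[root@node4 mfs]# systemctl stop mfsmaster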
Main Text
I. Building the corosync + pacemaker cluster
1. Environment
a. OS and virtualization software: CentOS 7.3, VMware Workstation
b. Four VMs with hostnames and IPs: node1 (10.0.0.31), node2 (10.0.0.32), node3 (10.0.0.5), node4 (10.0.0.6)
c. Required packages: crmsh-2.3.2.tar, pacemaker, pcs, psmisc, policycoreutils-python, corosync
d. Prerequisites: time synchronization, mutual hostname resolution, no quorum device, firewall rules adjusted, SELinux disabled (a sketch of this prep follows below)
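A minimal sketch of the prerequisite setup, to be run on every node. The chronyd/firewalld commands are one common way to satisfy these requirements on CentOS 7; adjust to your own environment:

## name resolution for all four nodes
cat >> /etc/hosts <<EOF
10.0.0.31 node1
10.0.0.32 node2
10.0.0.5  node3
10.0.0.6  node4
EOF
## time sync, firewall, and SELinux
systemctl enable --now chronyd
systemctl disable --now firewalld
setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config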
2. Install pcs, pacemaker, corosync, and crmsh on node3 and node4
[root@node3 ~]# yum install -y pacemaker pcs psmisc policycoreutils-python corosync
[root@node4 ~]# yum install -y pacemaker pcs psmisc policycoreutils-python corosync
## install crmsh
[root@node3 src]# tar -xf crmsh-2.3.2.tar
[root@node3 src]# cd crmsh-2.3.2/
[root@node3 crmsh-2.3.2]# python setup.py install
## repeat the same crmsh installation steps on node4
3. Start pcsd, authenticate the hosts, and register the cluster nodes
[root@node3 crmsh-2.3.2]# systemctl start pcsd.service
[root@node3 crmsh-2.3.2]# systemctl enable pcsd.service
[root@node3 corosync]# echo 123456 | passwd --stdin hacluster
## repeat the same steps on node4
## authenticate the cluster nodes
[root@node3 crmsh-2.3.2]# pcs cluster auth node4 node3
Username: hacluster
Password:
node3: Authorized
node4: Authorized
## set up the cluster nodes and give the cluster a name
[root@node3 crmsh-2.3.2]# pcs cluster setup --name mycluster node3 node4 --force
Destroying cluster on nodes: node3, node4...
node4: Stopping Cluster (pacemaker)...
node3: Stopping Cluster (pacemaker)...
node4: Successfully destroyed cluster
node3: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'node3', 'node4'
node4: successful distribution of the file 'pacemaker_remote authkey'
node3: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
node3: Succeeded
node4: Succeeded
Synchronizing pcsd certificates on nodes node3, node4...
node3: Success
node4: Success
Restarting pcsd on the nodes in order to reload the certificates...
node4: Success
node3: Success
4. Check the auto-generated corosync.conf on one of the nodes. Note that pcs has set two_node: 1 under the votequorum provider, which allows this two-node cluster to keep quorum when one node goes down.
[root@node3 crmsh-2.3.2]# cat /etc/corosync/corosync.conf
totem {
    version: 2
    secauth: off
    cluster_name: mycluster
    transport: udpu
}

nodelist {
    node {
        ring0_addr: node3
        nodeid: 1
    }

    node {
        ring0_addr: node4
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
}
5. Start the cluster
[root@node3 crmsh-2.3.2]# pcs cluster start --all
node4: Starting Cluster...
node3: Starting Cluster...
## starting the cluster also brings up pacemaker and corosync
[root@node3 crmsh-2.3.2]# ps -ef | grep corosync
root     27199     1  1 23:07 ?     00:00:00 corosync
root     27238 25265  0 23:07 pts/1 00:00:00 grep --color=auto corosync
[root@node3 crmsh-2.3.2]# ps -ef | grep pacemaker
root     27206     1  0 23:07 ?     00:00:00 /usr/sbin/pacemakerd -f
haclust+ 27207 27206  1 23:07 ?     00:00:00 /usr/libexec/pacemaker/cib
root     27208 27206  0 23:07 ?     00:00:00 /usr/libexec/pacemaker/stonithd
root     27209 27206  0 23:07 ?     00:00:00 /usr/libexec/pacemaker/lrmd
haclust+ 27210 27206  0 23:07 ?     00:00:00 /usr/libexec/pacemaker/attrd
haclust+ 27211 27206  0 23:07 ?     00:00:00 /usr/libexec/pacemaker/pengine
haclust+ 27212 27206  0 23:07 ?     00:00:00 /usr/libexec/pacemaker/crmd
root     27249 25265  0 23:08 pts/1 00:00:00 grep --color=auto pacemaker
6. Check the cluster status and information
## a status of "no faults" means the ring is healthy
[root@node3 crmsh-2.3.2]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
        id      = 10.0.0.5
        status  = ring 0 active with no faults
[root@node4 crmsh-2.3.2]# corosync-cfgtool -s
Printing ring status.
Local node ID 2
RING ID 0
        id      = 10.0.0.6
        status  = ring 0 active with no faults
## pcs status shows the cluster overview, including the pcsd, pacemaker, and corosync daemon states
[root@node4 crmsh-2.3.2]# pcs status
Cluster name: mycluster
WARNING: no stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: node4 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Mon Oct 30 23:13:30 2017
Last change: Mon Oct 30 23:07:49 2017 by hacluster via crmd on node4

2 nodes configured
0 resources configured

Online: [ node3 node4 ]

No resources

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
7. Check the cluster for configuration errors
[root@node3 crmsh-2.3.2]# crm_verify -L -V
   error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
   error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
   error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
## the errors say no STONITH resources have been defined; we don't need STONITH for now, so disable the option
[root@node3 crmsh-2.3.2]# pcs property set stonith-enabled=false
[root@node3 crmsh-2.3.2]# crm_verify -L -V
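To confirm the property change took effect, pcs can list the current cluster properties (output trimmed here to the relevant line):

[root@node3 crmsh-2.3.2]# pcs property list
Cluster Properties:
 ...
 stonith-enabled: false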
8. Manage the cluster with crmsh and hand each service over to it
## on CentOS 7, a service that the cluster is to take over needs `systemctl enable <service>`, otherwise the resource takeover has no service to manage
## the services being handed over must be stopped first; for mfsmaster, write the service unit shown earlier and place it under /etc/systemd/system/
[root@node3 system]# systemctl stop drbd
[root@node3 system]# systemctl stop mfsmaster
## define the DRBD resource and its master/slave set; "verify" checks the configuration for errors
crm(live)configure# primitive mfs_drbd ocf:linbit:drbd params drbd_resource=mfs op monitor role=Master interval=10 timeout=20 op monitor role=Slave interval=20 timeout=20 op start timeout=240 op stop timeout=100
crm(live)configure# verify
crm(live)configure# ms ms_mfs_drbd mfs_drbd meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
crm(live)configure# verify
## define the mount (filesystem) resource; fstype must match the filesystem /dev/drbd1 was formatted with
crm(live)configure# primitive mystore ocf:heartbeat:Filesystem params device=/dev/drbd1 directory=/usr/local/mfs fstype=xfs op start timeout=60 op stop timeout=60
crm(live)configure# verify
## the mount must sit where DRBD is Master (note the :Master role qualifier) and may only start after the promotion
crm(live)configure# colocation ms_mfs_drbd_with_mystore inf: mystore ms_mfs_drbd:Master
crm(live)configure# order ms_mfs_drbd_before_mystore Mandatory: ms_mfs_drbd:promote mystore:start
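Nothing has been committed yet at this point; the pending configuration can be reviewed from the same prompt before moving on (the commit comes after the mfs resource is defined below):

crm(live)configure# show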
## configure the mfs resource (the systemd-managed mfsmaster unit)
crm(live)configure# primitive mfs systemd:mfsmaster op monitor timeout=100 interval=30 op start timeout=30 interval=0 op stop timeout=30 interval=0
crm(live)configure# colocation mfs_with_mystore inf: mfs mystore
crm(live)configure# order mystor_befor_mfs Mandatory: mystore mfs
crm(live)configure# verify
WARNING: mfs: specified timeout 30 for start is smaller than the advised 100
WARNING: mfs: specified timeout 30 for stop is smaller than the advised 100
crm(live)configure# commit
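After the commit it is worth confirming that the whole stack landed on the DRBD Primary. A quick check from either node; drbdadm ships with DRBD, and the resource name mfs matches the drbd_resource parameter set above:

[root@node3 ~]# crm status
[root@node3 ~]# drbdadm role mfs
Primary/Secondary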
## configure the VIP
crm(live)configure# primitive vip ocf:heartbeat:IPaddr params ip=10.0.0.200
crm(live)configure# colocation vip_with_msf inf: vip mfs
crm(live)configure# verify
crm(live)configure# commit
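With everything committed, a simple failover drill is to put the active node into standby and watch the DRBD master, the mount, mfsmaster, and the VIP all migrate to the other node. This assumes node3 is currently active; bring it back online afterwards:

[root@node3 ~]# crm node standby node3
[root@node3 ~]# crm status
## once the resources are running on node4, bring node3 back
[root@node3 ~]# crm node online node3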
Reposted from: https://blog.51cto.com/bluesrer/1979475