Preparation: set up two CentOS 7 hosts, 192.168.51.101 (ceph1) and 192.168.51.102 (ceph2).
Software versions: corosync-2.4.5-7, pacemaker-1.1.23, crmsh-3.0.0.
# 192.168.51.101: configure /etc/hosts
192.168.51.101 ceph1
192.168.51.102 ceph2
# 192.168.51.102: configure /etc/hosts
192.168.51.101 ceph1
192.168.51.102 ceph2
# 192.168.51.101
[root@ceph1 ~]# ssh-keygen
[root@ceph1 ~]# ssh-copy-id -i /root/.ssh/id_rsa root@ceph2
# 192.168.51.102
[root@ceph2 ~]# ssh-keygen
[root@ceph2 ~]# ssh-copy-id -i /root/.ssh/id_rsa root@ceph1
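A quick sanity check that key-based login works in both directions (hostnames resolve via the /etc/hosts entries above):
[root@ceph1 ~]# ssh ceph2 hostname    # should print ceph2 without a password prompt
[root@ceph2 ~]# ssh ceph1 hostname    # should print ceph1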
# 192.168.51.101: sync the system clock from the hardware clock
[root@ceph1 ~]# hwclock -s
# 192.168.51.102: sync the system clock from the hardware clock
[root@ceph2 ~]# hwclock -s
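Note that hwclock -s is only a one-shot sync from the hardware clock, and cluster membership is sensitive to clock drift. For continuous timekeeping, a sketch using chrony (available in the CentOS 7 base repository) on both nodes:
[root@ceph1 ~]# yum install -y chrony
[root@ceph1 ~]# systemctl enable chronyd && systemctl start chronyd
[root@ceph1 ~]# chronyc tracking    # confirm the system clock is being disciplined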
1. Install corosync and pacemaker
# 192.168.51.101
[root@ceph1 ~]# yum install corosync pacemaker -y
# 192.168.51.102
[root@ceph2 ~]# yum install corosync pacemaker -y
2. Configure corosync and pacemaker
# 192.168.51.101
[root@ceph1 ~]# cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
The contents of /etc/corosync/corosync.conf are as follows:
# Please read the corosync.conf.5 manual page
totem {
    version: 2

    # crypto_cipher and crypto_hash: Used for mutual node authentication.
    # If you choose to enable this, then do remember to create a shared
    # secret with "corosync-keygen".
    # enabling crypto_cipher, requires also enabling of crypto_hash.
    crypto_cipher: none
    crypto_hash: none

    # interface: define at least one interface to communicate
    # over. If you define more than one interface stanza, you must
    # also set rrp_mode.
    interface {
        # Rings must be consecutively numbered, starting at 0.
        ringnumber: 0
        # This is normally the *network* address of the
        # interface to bind to. This ensures that you can use
        # identical instances of this configuration file
        # across all your cluster nodes, without having to
        # modify this option.
        bindnetaddr: 192.168.51.0
        # However, if you have multiple physical network
        # interfaces configured for the same subnet, then the
        # network address alone is not sufficient to identify
        # the interface Corosync should bind to. In that case,
        # configure the *host* address of the interface
        # instead:
        # bindnetaddr: 192.168.1.1
        # When selecting a multicast address, consider RFC
        # 2365 (which, among other things, specifies that
        # 239.255.x.x addresses are left to the discretion of
        # the network administrator). Do not reuse multicast
        # addresses across multiple Corosync clusters sharing
        # the same network.
        mcastaddr: 239.255.1.1
        # Corosync uses the port you specify here for UDP
        # messaging, and also the immediately preceding
        # port. Thus if you set this to 5405, Corosync sends
        # messages over UDP ports 5405 and 5404.
        mcastport: 5405
        # Time-to-live for cluster communication packets. The
        # number of hops (routers) that this ring will allow
        # itself to pass. Note that multicast routing must be
        # specifically enabled on most network routers.
        ttl: 1
    }
}

logging {
    # Log the source file and line where messages are being
    # generated. When in doubt, leave off. Potentially useful for
    # debugging.
    fileline: off
    # Log to standard error. When in doubt, set to no. Useful when
    # running in the foreground (when invoking "corosync -f")
    to_stderr: no
    # Log to a log file. When set to "no", the "logfile" option
    # must not be set.
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    # Log to the system log daemon. When in doubt, set to yes.
    to_syslog: yes
    # Log debug messages (very verbose). When in doubt, leave off.
    debug: off
    # Log messages with time stamps. When in doubt, set to on
    # (unless you are only logging to syslog, where double
    # timestamps can be annoying).
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}

quorum {
    # Enable and configure quorum subsystem (default: off)
    # see also corosync.conf.5 and votequorum.5
    provider: corosync_votequorum
}
# Manually added: define the cluster nodes
nodelist {
    node {
        ring0_addr: ceph1
        nodeid: 1
    }
    node {
        ring0_addr: ceph2
        nodeid: 2
    }
}
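If the network blocks multicast (common in virtualized or cloud environments), corosync 2.x can run over unicast UDP instead. A minimal sketch of the totem changes, assuming the nodelist above supplies the member addresses:
totem {
    version: 2
    transport: udpu    # unicast UDP; member addresses come from the nodelist
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.51.0
        mcastport: 5405    # with udpu this is just the base UDP port
    }
}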
[Note] In /etc/corosync/corosync.conf you can enable node authentication and traffic encryption by setting crypto_cipher (the cipher algorithm) and crypto_hash (the hash algorithm), for example:
crypto_cipher: aes256
crypto_hash: sha1
With encryption enabled, a shared key file must be generated:
[root@ceph1 ~]# corosync-keygen
[root@ceph1 ~]# ll /etc/corosync/authkey
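On a freshly installed, idle VM, corosync-keygen can appear to hang because it reads /dev/random and waits for entropy; typing in another session or generating disk I/O speeds it up. Corosync also offers a faster but less secure alternative:
[root@ceph1 ~]# corosync-keygen -l    # reads /dev/urandom instead of /dev/random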
3. Sync the configuration to ceph2
[root@ceph1 ~]# scp -p /etc/corosync/authkey /etc/corosync/corosync.conf ceph2:/etc/corosync/
4. Install crmsh and disable STONITH (no fence devices in this test setup)
# 192.168.51.101
[root@ceph1 ~]# wget -P /etc/yum.repos.d/ http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/network:ha-clustering:Stable.repo
[root@ceph1 ~]# yum install -y crmsh
[root@ceph1 ~]# crm configure property stonith-enabled=false
ERROR: Warnings found during check: config may not be valid
Do you still want to commit (y/n)? y
# 192.168.51.102
[root@ceph2 ~]# wget -P /etc/yum.repos.d/ http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/network:ha-clustering:Stable.repo
[root@ceph2 ~]# yum install -y crmsh
[root@ceph2 ~]# crm configure property stonith-enabled=false
ERROR: Warnings found during check: config may not be valid
Do you still want to commit (y/n)? y
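If firewalld is running on either node, corosync's UDP traffic (ports 5404/5405 with the mcastport configured above) will be dropped. firewalld ships a predefined high-availability service that opens the cluster ports; run on both nodes before starting the services:
[root@ceph1 ~]# firewall-cmd --permanent --add-service=high-availability
[root@ceph1 ~]# firewall-cmd --reload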
5. Start corosync and pacemaker
# 192.168.51.101
[root@ceph1 ~]# systemctl start corosync.service
[root@ceph1 ~]# systemctl start pacemaker.service
# 192.168.51.102
[root@ceph2 ~]# systemctl start corosync.service
[root@ceph2 ~]# systemctl start pacemaker.service
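Optionally enable both services at boot on each node so the cluster survives a reboot:
[root@ceph1 ~]# systemctl enable corosync.service pacemaker.service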
Verify the cluster:
1. Check the membership notifications
# 192.168.51.101
[root@ceph1 ~]# grep TOTEM /var/log/cluster/corosync.log
Mar 22 11:31:50 [22805] ceph1 corosync notice [TOTEM ] Initializing transport (UDP/IP Multicast).
Mar 22 11:31:50 [22805] ceph1 corosync notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none
Mar 22 11:31:50 [22805] ceph1 corosync notice [TOTEM ] The network interface [192.168.51.101] is now up.
Mar 22 11:31:50 [22805] ceph1 corosync notice [TOTEM ] A new membership (192.168.51.101:312) was formed. Members joined: 1
Mar 22 11:31:50 [22805] ceph1 corosync notice [TOTEM ] A new membership (192.168.51.101:316) was formed. Members joined: 2
# 192.168.51.102
[root@ceph2 ~]# grep TOTEM /var/log/cluster/corosync.log
Mar 22 11:32:51 [20797] ceph2 corosync notice [TOTEM ] Initializing transport (UDP/IP Multicast).
Mar 22 11:32:51 [20797] ceph2 corosync notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none
Mar 22 11:32:51 [20797] ceph2 corosync notice [TOTEM ] The network interface [192.168.51.102] is now up.
Mar 22 11:32:51 [20797] ceph2 corosync notice [TOTEM ] A new membership (192.168.51.102:321) was formed. Members joined: 2
Mar 22 11:32:51 [20797] ceph2 corosync notice [TOTEM ] A new membership (192.168.51.101:325) was formed. Members joined: 1
2. Check the node ring initialization status
# 192.168.51.101
[root@ceph1 ~]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
id = 192.168.51.101
status = ring 0 active with no faults
# 192.168.51.102
[root@ceph2 ~]# corosync-cfgtool -s
Printing ring status.
Local node ID 2
RING ID 0
id = 192.168.51.102
status = ring 0 active with no faults
3. Check cluster membership and the quorum API
# 192.168.51.101
[root@ceph1 ~]# corosync-cmapctl | grep members
runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(192.168.51.101)
runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1.status (str) = joined
runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(192.168.51.102)
runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 2
runtime.totem.pg.mrp.srp.members.2.status (str) = joined
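corosync-quorumtool gives the same membership list together with vote and quorum accounting, which is handy in a two-node setup:
[root@ceph1 ~]# corosync-quorumtool -s    # shows expected/total votes and whether the partition is quorate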
4. Check which node is the DC and the overall cluster status
# 192.168.51.101
[root@ceph1 ~]# crm_mon -1
Stack: corosync
Current DC: ceph1 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Tue Mar 22 11:39:34 2022
Last change: Mon Mar 21 18:19:27 2022 by root via crm_attribute on ceph1
2 nodes configured
0 resource instances configured
Online: [ ceph1 ceph2 ]
No active resources
Finally, configure a highly available httpd service to test failover:
# 192.168.51.101
[root@ceph1 ~]# yum install -y httpd
[root@ceph1 ~]# systemctl start httpd
[root@ceph1 ~]# echo "<h1>corosync pacemaker on the openstack</h1>" >/var/www/html/index.html
# 192.168.51.102
[root@ceph2 ~]# yum install -y httpd
[root@ceph2 ~]# systemctl start httpd
[root@ceph2 ~]# echo "<h1>corosync pacemaker on the openstack</h1>" >/var/www/html/index.html
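Because Pacemaker will control httpd through the systemd resource class, httpd must not also be enabled or left running under systemd itself, or the init system and the cluster will fight over the service. Once the test page is in place, stop it on both nodes:
[root@ceph1 ~]# systemctl stop httpd
[root@ceph1 ~]# systemctl disable httpd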
# 192.168.51.101
[root@ceph1 ~]# crm
crm(live)# status    ## make sure all nodes are online before running the commands that follow
crm(live)# ra
crm(live)ra# list systemd
httpd
crm(live)ra# cd
crm(live)# configure
crm(live)configure# property no-quorum-policy=ignore    // in a two-node cluster, losing one node also loses quorum; "ignore" keeps resources running instead of stopping them (the default is stop)
crm(live)configure# property default-resource-stickiness=INFINITY    // resource stickiness: after a failed node recovers, do not move resources back to it
Add the resources:
crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip="192.168.51.110" nic="ens160" cidr_netmask="" broadcast="192.168.51.255"    // define the webip (VIP) resource
crm(live)configure# primitive webserver systemd:httpd op start timeout=100s op stop timeout=100s    // define the webserver resource
crm(live)configure# group webservice webip webserver    # resources are spread across nodes by default; a group keeps them on one node. Order matters: webserver runs wherever the IP runs
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node 1: ceph1 \
    attributes standby=off
node 2: ceph2 \
    attributes standby=off
primitive webip IPaddr \
    params ip=192.168.51.110 nic=ens160 cidr_netmask="" broadcast=192.168.51.255
primitive webserver systemd:httpd \
    op start timeout=100s interval=0 \
    op stop timeout=100s interval=0
group webservice webip webserver
property cib-bootstrap-options: \
    stonith-enabled=false \
    have-watchdog=false \
    dc-version=1.1.23-1.el7_9.1-9acf116022 \
    cluster-infrastructure=corosync \
    no-quorum-policy=ignore \
    default-resource-stickiness=INFINITY
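Note that the primitives above define only start and stop operations, so Pacemaker will not notice if httpd dies on its own. A sketch of adding monitor operations with crmsh (the intervals and timeouts here are arbitrary choices, not from the original setup):
crm(live)configure# monitor webip 30s
crm(live)configure# monitor webserver 30s:100s
crm(live)configure# commit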
[root@ceph1 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:50:56:8c:57:cf brd ff:ff:ff:ff:ff:ff
inet 192.168.51.101/24 brd 192.168.51.255 scope global ens160
valid_lft forever preferred_lft forever
inet 192.168.51.110/24 brd 192.168.51.255 scope global secondary ens160
valid_lft forever preferred_lft forever
inet6 fe80::250:56ff:fe8c:57cf/64 scope link
valid_lft forever preferred_lft forever
[root@ceph1 ~]# curl 192.168.51.110
<h1>corosync pacemaker on the openstack</h1>
Simulate a failure by putting ceph1 into standby; the VIP and httpd should move to ceph2:
[root@ceph1 ~]# crm node standby
[root@ceph1 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:50:56:8c:57:cf brd ff:ff:ff:ff:ff:ff
inet 192.168.51.101/24 brd 192.168.51.255 scope global ens160
valid_lft forever preferred_lft forever
inet6 fe80::250:56ff:fe8c:57cf/64 scope link
valid_lft forever preferred_lft forever
[root@ceph1 ~]# service httpd status
Redirecting to /bin/systemctl status httpd.service
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; disabled; vendor preset: disabled)
Active: inactive (dead)
Docs: man:httpd(8)
man:apachectl(8)
Mar 22 11:32:48 ceph1 systemd[1]: Starting Cluster Controlled httpd...
Mar 22 11:32:48 ceph1 httpd[22958]: AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 192.168.51.101. Set the 'ServerName' directive globally to suppress this message
Mar 22 11:32:48 ceph1 systemd[1]: Started Cluster Controlled httpd.
Mar 22 13:51:43 ceph1 systemd[1]: Stopping The Apache HTTP Server...
Mar 22 13:51:46 ceph1 systemd[1]: Stopped The Apache HTTP Server.
Mar 22 13:54:09 ceph1 systemd[1]: Starting The Apache HTTP Server...
Mar 22 13:54:09 ceph1 httpd[23150]: AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 192.168.51.101. Set the 'ServerName' directive globally to suppress this message
Mar 22 13:54:09 ceph1 systemd[1]: Started The Apache HTTP Server.
Mar 22 13:54:14 ceph1 systemd[1]: Stopping The Apache HTTP Server...
Mar 22 13:54:15 ceph1 systemd[1]: Stopped The Apache HTTP Server.
[root@ceph1 ~]# curl 192.168.51.110
<h1>corosync pacemaker on the openstack</h1>
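To end the test, bring ceph1 back online; because default-resource-stickiness=INFINITY was set earlier, the webservice group stays on ceph2 instead of failing back:
[root@ceph1 ~]# crm node online
[root@ceph1 ~]# crm_mon -1    # webservice remains on ceph2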
Pacemaker has two command-line front ends: pcs and crmsh. crmsh works through the crm command, which is what this walkthrough uses throughout, so familiarity with the crmsh-based deployment shown here is sufficient.