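Advantages of the solution:
Galera provides master-master and multi-master replication for MySQL/MariaDB. Replication is synchronous, and the replication delay is very short.
Every node accepts reads and writes at the same time; when a node fails, it is automatically removed from the cluster.
HAProxy provides load balancing and failure detection, which removes the single point of failure on the server side.
Keepalived provides the virtual IP address (VIP) that clients use to connect to the database.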
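About HAProxy's load-balancing algorithms
Round robin (roundrobin): not suitable when the backends are web servers that rely on session or cookie persistence, because successive requests from one client are rotated across servers. It is the smoothest and fairest algorithm when the servers' processing time is evenly distributed, but it is limited to 4095 active servers per backend.
Source address (source): suitable when the backends are web servers. The client's source IP address is hashed, so the same client keeps reaching the same server as long as no server goes up or down, which preserves session and cookie persistence.
Least connections (leastconn): suitable when the servers have identical or similar configurations, where it spreads the workload across the servers as evenly as possible. It fits long sessions such as SQL, LDAP and TSE, but not short sessions such as HTTP.
Weighted (static-rr): each server is used in turn according to its weight, that is, a round robin planned by weight. It uses slightly less CPU, roughly 1% less.
In addition, the uri and url_param algorithms balance on the request URI and on a URI parameter, the hdr algorithm hashes a named HTTP header and falls back to round robin when the header is absent, and the rdp-cookie algorithm uses the RDP cookie to decide which server receives the request.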
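The practical difference between roundrobin and source can be illustrated with a toy model. The sketch below is not HAProxy's implementation: the function names are hypothetical, servers are simply numbered from 0, and the "hash" is just the sum of the IPv4 octets, chosen only to show that hashing the client address gives a stable server choice while round robin rotates.

```shell
# Toy model only; HAProxy's real hashing and weighting are more involved.

# roundrobin: request number $1 (0-based) goes to server ($1 mod $2),
# where $2 is the number of servers.
pick_roundrobin() {
    echo $(( $1 % $2 ))
}

# source: map the client IPv4 address $1 onto one of $2 servers.
# The sum of the octets stands in for a real hash function.
pick_source() {
    echo "$1" | awk -F. -v n="$2" '{ print ($1 + $2 + $3 + $4) % n }'
}
```

With two servers, consecutive calls such as pick_roundrobin 0 2 and pick_roundrobin 1 2 land on different servers, whereas pick_source with a fixed client address returns the same index on every call, which is what keeps a session on one backend.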
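Test results
When a node goes down, or an incident shuts down its database or cuts its network, HAProxy promptly reflects the backend's connection state and removes the failed node from the server pool. After the network or the host recovers, HAProxy shows the node as healthy again only once the database has started successfully.
This solution is currently deployed in an OpenStack cloud platform development environment, where it provides active-active high availability.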
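The backend state described above can also be read from the command line instead of the stats page. The following is a minimal sketch, assuming socat is installed and that the stats socket is the /var/lib/haproxy/stats path from the HAProxy configuration in this article; list_unhealthy is a hypothetical helper name. It relies on the status column being the 18th field of HAProxy's "show stat" CSV output.

```shell
# Reads "show stat" CSV on stdin and prints "proxy/server status" for every
# server row whose status does not start with UP (or "no check").
list_unhealthy() {
    awk -F, '
        /^#/ || NF < 18 { next }                      # skip header and blank lines
        $2 == "FRONTEND" || $2 == "BACKEND" { next }  # keep only server rows
        $18 !~ /^(UP|no check)/ { print $1 "/" $2, $18 }
    '
}
```

A possible invocation on the load-balancer node is: echo "show stat" | socat stdio /var/lib/haproxy/stats | list_unhealthy. An empty result means every checked server is currently reported as UP.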
Known issues
A MySQL/MariaDB + Galera cluster works with InnoDB tables only, and the query cache is no longer supported.
Time synchronization on CentOS 7: the time service has changed from ntpd to chrony. Run yum info chrony for a summary of chrony, and see the chrony man pages for usage.
Logging on CentOS 7: the logging service has changed from syslog to rsyslog (starting with CentOS 6), which now runs alongside journald; its configuration file is /etc/rsyslog.conf.
The manual states: "Log files can also be managed by the journald daemon – a component of systemd . The journald daemon captures Syslog messages, kernel log messages, initial RAM disk and early boot messages as well as messages written to standard output and standard error output of all services, indexes them and makes this available to the user." For details, see "Chapter 18, Viewing and Managing Log Files" and man rsyslogd.
Enabling logging: #syslog-->Rsyslog-->journald
# enable syslog for haproxy
sed -i 's/SYSLOGD_OPTIONS=""/SYSLOGD_OPTIONS="-r"/g' /etc/sysconfig/rsyslog
cat >/etc/rsyslog.d/haproxy.conf <<eof
# Log haproxy(local2.*) stuff
\$ModLoad imudp
\$UDPServerRun 514
local2.* /var/log/haproxy.log
eof
chown -R --reference=/etc/rsyslog.d/listen.conf /etc/rsyslog.d/haproxy.conf
chcon -R --reference=/etc/rsyslog.d/listen.conf /etc/rsyslog.d/haproxy.conf
systemctl restart rsyslog.service
# enable syslog for keepalived
sed -i 's/KEEPALIVED_OPTIONS="-D"/KEEPALIVED_OPTIONS="-D -S 0"/g' /etc/sysconfig/keepalived
cat >/etc/rsyslog.d/keepalived.conf <<eof
# Log keepalived(local0.*) stuff
\$ModLoad imudp
\$UDPServerRun 514
local0.* /var/log/keepalived.log
eof
chown -R --reference=/etc/rsyslog.d/listen.conf /etc/rsyslog.d/keepalived.conf
chcon -R --reference=/etc/rsyslog.d/listen.conf /etc/rsyslog.d/keepalived.conf
systemctl restart rsyslog.service
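Tips:
In the Galera configuration, the first server's wsrep_cluster_address can be set to "gcomm://", the second node's to "gcomm://<IP address of the first node>", the third node's to "gcomm://<IP address of the second node>", and so on. Note that node n must start its database before node n+1, otherwise the database on node n+1 will not start.
In the Galera configuration, do not write gcomm:// as dummy://; dummy:// is for testing only.
The SSL-related files in Galera's wsrep_provider_options can either be removed from that option, or all nodes can share one set of SSL files, including the certificate and key.
HAProxy can check whether the database on a backend server is running with option mysql-check user dbuser.
HAProxy's maximum connection count caps the connection count of the whole cluster, so HAProxy's maxconn should be set to (number of backend servers * maximum connections each backend can sustain * 90%) * 110%, where 90% and 110% are headroom margins for the load.
keepalived can be deployed with two nodes acting as master and backup for each other. The two VIPs in such a setup presumably serve different purposes, one for service A and the other for service B, which "solves" the problem of one node sitting in standby for long periods.
To reduce preemption between the keepalived master and backup, both nodes can be configured with state BACKUP and one of them set to nopreempt. This prevents the former master from taking the VIP back when it recovers from a failure, which reduces VIP switchovers.
The HAProxy + keepalived combination also works for HTTP traffic; for that single case, other load-balancing setups such as nginx + keepalived can be used as well.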
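The maxconn rule of thumb from the tips above is simple arithmetic, sketched here as a shell function. The name haproxy_maxconn and the example figures are illustrative assumptions, not values measured on this cluster; shell integer arithmetic truncates, so results are rounded down.

```shell
# $1 = number of backend servers
# $2 = maximum connections one backend server can sustain
# Applies the 90% margin first, then the 110% margin, as in the tip.
haproxy_maxconn() {
    echo $(( $1 * $2 * 90 / 100 * 110 / 100 ))
}

# Example: two backends that can each sustain 4000 connections.
haproxy_maxconn 2 4000   # prints 7920
```

The two margins multiply to 99%, so the result is effectively just under the combined capacity of all backends.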
Appendix: configuration files of each node:
Node 1 database configuration file
[root@controllernode1 ~]# delsc /etc/my.cnf.d/galera.cnf
[mysqld]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_provider_options="pc.wait_prim=no; pc.bootstrap=true;"
wsrep_cluster_name="wsrep_cluster"
wsrep_cluster_address="gcomm://"
wsrep_slave_threads=1
wsrep_certify_nonPK=1
wsrep_max_ws_rows=131072
wsrep_max_ws_size=1073741824
wsrep_debug=0
wsrep_convert_LOCK_to_trx=0
wsrep_retry_autocommit=1
wsrep_auto_increment_control=1
wsrep_drupal_282555_workaround=0
wsrep_causal_reads=0
wsrep_notify_cmd=
wsrep_sst_method=rsync
wsrep_sst_auth=root:password
[root@controllernode1 ~]#
Node 2 database configuration file
Essentially the same as node 1; only the wsrep_cluster_address differs.
[root@controllernode2 ~]# delsc /etc/my.cnf.d/galera.cnf
[mysqld]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_provider_options="pc.wait_prim=no; pc.bootstrap=true;"
wsrep_cluster_name="wsrep_cluster"
wsrep_cluster_address="gcomm://192.168.21.11"
wsrep_slave_threads=1
wsrep_certify_nonPK=1
wsrep_max_ws_rows=131072
wsrep_max_ws_size=1073741824
wsrep_debug=0
wsrep_convert_LOCK_to_trx=0
wsrep_retry_autocommit=1
wsrep_auto_increment_control=1
wsrep_drupal_282555_workaround=0
wsrep_causal_reads=0
wsrep_notify_cmd=
wsrep_sst_method=rsync
wsrep_sst_auth=root:password
[root@controllernode2 ~]#
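The chained wsrep_cluster_address layout used in the two files above (the first node bootstraps with an empty gcomm://, each later node points at the node before it) can be generated mechanically. This is a minimal sketch; gcomm_address is a hypothetical helper, and the IP list is assumed to be given in start-up order.

```shell
# $1 = 1-based position of the node in the start-up order,
# remaining arguments = node IP addresses in that same order.
gcomm_address() (
    idx=$1
    shift
    if [ "$idx" -le 1 ]; then
        echo "gcomm://"          # first node: empty address starts a new cluster
    else
        shift $(( idx - 2 ))     # drop the IPs that come before node idx-1
        echo "gcomm://$1"        # join through the previous node
    fi
)
```

For example, gcomm_address 2 192.168.21.11 192.168.21.12 yields the value used in the node 2 file. Once the nodes are up, running mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size'" on any node should report the number of nodes that have joined.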
Node 1 HAProxy configuration file
[root@networknode1 ~]# cat /etc/haproxy/haproxy.cfg
global
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
stats socket /var/lib/haproxy/stats
defaults
#mode tcp
mode http
option httplog
log global
option dontlognull
option redispatch
option tcpka
retries 3
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout check 10s
timeout http-keep-alive 10s
maxconn 10000
listen stats
mode http
bind *:10000
stats enable
stats uri /haproxy
stats realm HAProxy\ Statistics
stats auth haproxy:password
listen mariadb
mode tcp
bind *:3306
balance leastconn
option mysql-check user haproxy
server controllernode1 192.168.21.11:3306 weight 1 check inter 2000 rise 2 fall 5
server controllernode2 192.168.21.12:3306 weight 1 check inter 2000 rise 2 fall 5
listen keystone_admin
mode http
bind *:35357
balance source
option tcpka
option httpchk
server controllernode1 192.168.21.11:35357 check inter 2000 rise 2 fall 5
server controllernode2 192.168.21.12:35357 check inter 2000 rise 2 fall 5
listen keystone_api
mode http
bind *:5000
balance source
option tcpka
option httpchk
server controllernode1 192.168.21.11:5000 check inter 2000 rise 2 fall 5
server controllernode2 192.168.21.12:5000 check inter 2000 rise 2 fall 5
listen swift_proxy_cluster
#mode http
mode tcp
bind *:8080
balance source
option tcpka
option tcplog
server swiftstoragenode1 192.168.21.11:8080 check inter 2000 rise 2 fall 5
server swiftstoragenode2 192.168.21.12:8080 check inter 2000 rise 2 fall 5
listen glance_api
mode http
bind *:9292
balance source
option tcpka
option httpchk
server controllernode1 192.168.21.11:9292 check inter 2000 rise 2 fall 5
server controllernode2 192.168.21.12:9292 check inter 2000 rise 2 fall 5
listen amqp_server
mode tcp
bind *:5672
option tcpka
balance source
server controllernode1 192.168.21.11:5672 check inter 2000 rise 2 fall 5
server controllernode2 192.168.21.12:5672 check inter 2000 rise 2 fall 5
listen nova_ec2
#mode http
mode tcp
bind *:8773
balance source
option tcpka
#option httpchk
maxconn 10000
server controllernode1 192.168.21.11:8773 check inter 2000 rise 2 fall 5
server controllernode2 192.168.21.12:8773 check inter 2000 rise 2 fall 5
listen nova_osapi
mode http
bind *:8774
balance source
option tcpka
option httpchk
maxconn 10000
server controllernode1 192.168.21.11:8774 check inter 2000 rise 2 fall 5
server controllernode2 192.168.21.12:8774 check inter 2000 rise 2 fall 5
listen nova_metadata
mode http
bind *:8775
balance source
option tcpka
option httpchk
maxconn 10000
server controllernode1 192.168.21.11:8775 check inter 2000 rise 2 fall 5
server controllernode2 192.168.21.12:8775 check inter 2000 rise 2 fall 5
listen novnc
mode http
bind *:6080
balance source
option tcpka
maxconn 10000
server controllernode1 192.168.21.11:6080 check inter 2000 rise 2 fall 5
server controllernode2 192.168.21.12:6080 check inter 2000 rise 2 fall 5
listen neutron_api
mode http
bind *:9696
balance source
option tcpka
maxconn 10000
server controllernode1 192.168.21.11:9696 check inter 2000 rise 2 fall 5
server controllernode2 192.168.21.12:9696 check inter 2000 rise 2 fall 5
listen dashboard
mode http
bind *:80
balance source
option tcpka
maxconn 10000
server controllernode1 192.168.21.11:80 check inter 2000 rise 2 fall 5
server controllernode2 192.168.21.12:80 check inter 2000 rise 2 fall 5
[root@networknode1 ~]#
Node 2 HAProxy configuration file: can be identical to node 1's
Node 1 Keepalived configuration file
[root@networknode1 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
dgdenterprise@gmail.com
}
notification_email_from root@controllernode1
smtp_server 127.0.0.1
smtp_connect_timeout 10
router_id openstackha_1
}
vrrp_sync_group VG_1 {
group {
VI_1
}
}
vrrp_instance VI_1 {
state BACKUP
interface em1
#use_vmac keepalived
#vmac_xmit_base
mcast_src_ip 192.168.21.21
virtual_router_id 20
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass password
}
virtual_ipaddress {
192.168.21.10
}
}
[root@networknode1 ~]#
Node 2 Keepalived configuration file
[root@networknode2 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
dgdenterprise@gmail.com
}
notification_email_from root@controllernode2
smtp_server 127.0.0.1
smtp_connect_timeout 10
router_id openstackha_2
}
vrrp_sync_group VG_1 {
group {
VI_1
}
}
vrrp_instance VI_1 {
state BACKUP
interface em1
#use_vmac keepalived
#vmac_xmit_base
mcast_src_ip 192.168.21.22
virtual_router_id 20
priority 99
advert_int 1
authentication {
auth_type PASS
auth_pass password
}
virtual_ipaddress {
192.168.21.10
}
nopreempt
}
[root@networknode2 ~]#
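To see which of the two nodes currently holds the VIP, the interface addresses can be inspected. The sketch below assumes the em1 interface and the 192.168.21.10 VIP from the configuration files above; has_vip is a hypothetical helper that reads the output of ip -o -4 addr show on standard input.

```shell
# stdin: output of `ip -o -4 addr show`; $1 = VIP to look for.
# Exit status 0 if the VIP is configured on this node, non-zero otherwise.
# The trailing "/" keeps 192.168.21.1 from matching 192.168.21.10.
has_vip() {
    grep -Fq "inet $1/"
}
```

A possible check on either load-balancer node is: ip -o -4 addr show dev em1 | has_vip 192.168.21.10 && echo "this node holds the VIP". Stopping keepalived on the node that holds the VIP and repeating the check on the other node is one way to exercise the failover.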
END