Question:

MariaDB Galera cluster setup problem

喻渊
2023-03-14
[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/5.5-galera/rhel6-amd64/
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1
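
For completeness, with that repo file in place the Galera build would typically be installed with something like the following. The package names are the usual ones for the 5.5-galera series and may differ slightly by version; treat this as a sketch rather than the exact steps used in this thread:

# Hypothetical install on RHEL/CentOS 6 using the repo above
yum install MariaDB-Galera-server MariaDB-client galera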
On server 1

[mariadb]
log_error=/var/log/mariadb.log
query_cache_size=0
query_cache_type=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address=gcomm://192.168.211.133
wsrep_cluster_name='cluster'
wsrep_node_address='192.168.211.132'
wsrep_node_name='cluster1'
wsrep_sst_method=rsync

On server 2

[mariadb]
log_error=/var/log/mariadb.log
query_cache_size=0
query_cache_type=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address=gcomm://192.168.211.132
wsrep_cluster_name='cluster'
wsrep_node_address='192.168.211.133'
wsrep_node_name='cluster2'
wsrep_sst_method=rsync
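
For reference, a common (though not mandatory) Galera pattern is to list every cluster member in wsrep_cluster_address on every node rather than only the peer; each node simply skips its own address. A minimal sketch using the two addresses above:

# Sketch only: the same value can be used on both servers, while
# wsrep_node_address / wsrep_node_name still identify each node.
wsrep_cluster_address=gcomm://192.168.211.132,192.168.211.133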

When I start server 1 with sudo service mysql start --wsrep-new-cluster, it starts fine, and if I open mysql and check the wsrep status it reports that everything is up and running, which is good. But when I then try sudo service mysql start on the second server, I get the following in the error log:

140609 14:47:55 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140609 14:47:56 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.i5qfm2' --pid-file='/var/lib/mysql/localhost.localdomain-recover.pid'
140609 14:47:57 mysqld_safe WSREP: Recovered position 85448d73-ebe8-11e3-9c20-fbc1995fee11:0
140609 14:47:57 [Note] WSREP: wsrep_start_position var submitted: '85448d73-ebe8-11e3-9c20-fbc1995fee11:0'
140609 14:47:57 [Note] WSREP: Read nil XID from storage engines, skipping position init
140609 14:47:57 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
140609 14:47:57 [Note] WSREP: wsrep_load(): Galera 25.3.2(r170) by Codership Oy <info@codership.com> loaded successfully.
140609 14:47:57 [Note] WSREP: CRC-32C: using hardware acceleration.
140609 14:47:57 [Note] WSREP: Found saved state: 85448d73-ebe8-11e3-9c20-fbc1995fee11:-1
140609 14:47:57 [Note] WSREP: Passing config to GCS: base_host = 192.168.211.133; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.proto_max = 5
140609 14:47:57 [Note] WSREP: Assign initial position for certification: 0, protocol version: -1
140609 14:47:57 [Note] WSREP: wsrep_sst_grab()
140609 14:47:57 [Note] WSREP: Start replication
140609 14:47:57 [Note] WSREP: Setting initial position to 85448d73-ebe8-11e3-9c20-fbc1995fee11:0
140609 14:47:57 [Note] WSREP: protonet asio version 0
140609 14:47:57 [Note] WSREP: Using CRC-32C (optimized) for message checksums.
140609 14:47:57 [Note] WSREP: backend: asio
140609 14:47:57 [Note] WSREP: GMCast version 0
140609 14:47:57 [Note] WSREP: (0c085f34-efe5-11e3-9f6b-8bfd1706e2a4, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
140609 14:47:57 [Note] WSREP: (0c085f34-efe5-11e3-9f6b-8bfd1706e2a4, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
140609 14:47:57 [Note] WSREP: EVS version 0
140609 14:47:57 [Note] WSREP: PC version 0
140609 14:47:57 [Note] WSREP: gcomm: connecting to group 'cluster', peer '192.168.211.132:,192.168.211.134:'
140609 14:48:00 [Warning] WSREP: no nodes coming from prim view, prim not possible
140609 14:48:00 [Note] WSREP: view(view_id(NON_PRIM,0c085f34-efe5-11e3-9f6b-8bfd1706e2a4,1) memb {
        0c085f34-efe5-11e3-9f6b-8bfd1706e2a4,0
} joined {
} left {
} partitioned {
})
140609 14:48:01 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50775S), skipping check
140609 14:48:31 [Note] WSREP: view((empty))
140609 14:48:31 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
         at gcomm/src/pc.cpp:connect():141
140609 14:48:31 [ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():196: Failed to open backend connection: -110 (Connection timed out)
140609 14:48:31 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1291: Failed to open channel 'cluster' at 'gcomm://192.168.211.132,192.168.211.134': -110 (Connection timed out)
140609 14:48:31 [ERROR] WSREP: gcs connect failed: Connection timed out
140609 14:48:31 [ERROR] WSREP: wsrep::connect() failed: 7
140609 14:48:31 [ERROR] Aborting

140609 14:48:31 [Note] WSREP: Service disconnected.
140609 14:48:32 [Note] WSREP: Some threads may fail to exit.
140609 14:48:32 [Note] /usr/sbin/mysqld: Shutdown complete

140609 14:48:32 mysqld_safe mysqld from pid file /var/lib/mysql/localhost.localdomain.pid ended

I cannot figure out why the second server does not detect that the cluster is up and running. The machines can talk to each other just fine: I can SSH from one to the other and they can ping each other. I have tried deleting the galera cache, downgrading my MariaDB Galera version, disabling SELinux, running the mysql service as a different user, verifying that the correct ports are open, running the nodes on two different machines with different IP addresses, and so on. Does anyone know what is going on here? I have spent three days searching for a fix and no solution seems to work for me.
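
For reference, since the failure is a gcomm connection timeout, one hedged way to re-check that the Galera ports (4567 for group communication, 4568 for IST, 4444 for the rsync SST) are reachable between the two RHEL6 boxes, assuming nc and iptables are available:

# From server 2 (192.168.211.133), probe the node that is already running;
# 4567 is the only port a healthy idle node listens on permanently.
nc -z -w2 192.168.211.132 4567 && echo "4567 reachable"
# 4568 (IST) and 4444 (rsync SST) are only opened during state transfers,
# so check the firewall rules rather than probing them directly.
iptables -L -n | grep -E '4567|4568|4444'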

1 answer

叶稳
2023-03-14

Here is how I fixed a similar issue.

CentOS 7 w/ MariaDB Galera 10.1.

On Node2 I saw this:

2016-12-27 15:40:38 140703512762624 [Warning] WSREP: no nodes coming from prim view, prim not possible
service mysql start --wsrep-new-cluster
2016-12-27 15:44:08 140438853814528 [ERROR] WSREP: It may not be safe to bootstrap the cluster from this node. It was not the last one to leave the cluster and may not contain all the updates. To force cluster bootstrap with this node, edit the grastate.dat file manually and set safe_to_bootstrap to 1 .
service mysql start --wsrep-new-cluster
service mysql start
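
The error above points at grastate.dat; below is a minimal sketch of what that file looks like on the node chosen for bootstrap, reusing the uuid from this thread purely for illustration (back the file up before editing, and only flip the flag on one node):

# /var/lib/mysql/grastate.dat (illustrative values)
# GALERA saved state
version: 2.1
uuid:    85448d73-ebe8-11e3-9c20-fbc1995fee11
seqno:   -1
safe_to_bootstrap: 1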

Note: this was in a demo/pre-production environment. Right after restarting all of the servers at the same time and getting everything working, I promptly broke it again :P, but I knew there had been no writes and the databases were in sync. If you are in production and this happens, you can use the command below to decide which node to run "new-cluster" on, which is roughly like saying "make me the primary":

mysqld_safe --wsrep-recover
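
What matters in that command's output is the recovered position. A hedged sketch of how one might compare the nodes (run it on each node while mysqld is stopped; the log path comes from the config earlier in this thread and may differ on your setup):

# The recovered position ends up in the error log as a line like the
# "WSREP: Recovered position ..." entries shown earlier in this thread.
mysqld_safe --wsrep-recover
grep 'Recovered position' /var/log/mariadb.log | tail -1
# Compare the number after the last colon (the seqno) across nodes and
# run --wsrep-new-cluster on the node with the highest value.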

If this is a production issue, I highly recommend reading the article below and making a backup w/ CloneZilla before throwing commands at a broken machine!

https://www.percona.com/blog/2014/09/01/galera-replication-how-to-recover-a-pxc-cluster/
