Question:

Failover not working in static cluster HA

沈骞仕
2023-03-14

When I stop the live server with Ctrl-C, the backup reports "Connection failure to localhost/127.0.0.1:61616 has been detected: AMQ219015: The connection was disconnected because of server shutdown [code=DISCONNECTED]", which is correct, but the backup server never changes its state and never starts listening on port 61617. So what am I doing wrong in the configuration?

Live server configuration:

<?xml version='1.0'?>
<configuration xmlns="urn:activemq"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xi="http://www.w3.org/2001/XInclude"
               xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">

   <core xmlns="urn:activemq:core" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="urn:activemq:core ">

      <name>0.0.0.0</name>

      <persistence-enabled>true</persistence-enabled>

      <!-- this could be ASYNCIO, MAPPED, NIO
           ASYNCIO: Linux Libaio
           MAPPED: mmap files
           NIO: Plain Java Files
       -->

      <paging-directory>data/paging</paging-directory>

      <bindings-directory>data/bindings</bindings-directory>

      <journal-directory>data/journal</journal-directory>

      <large-messages-directory>data/large-messages</large-messages-directory>

      <security-enabled>false</security-enabled>

      <connectors>
         <connector name="netty-connector">tcp://localhost:61616</connector>
         <!-- connector to the server1 -->
         <connector name="server1-connector">tcp://localhost:61617</connector>
      </connectors>

      <!-- Acceptors -->
      <acceptors>
         <acceptor name="netty-acceptor">tcp://localhost:61616</acceptor>
      </acceptors>

      <ha-policy>
         <shared-store>
            <master>
               <check-for-live-server>true</check-for-live-server>
            </master>   
         </shared-store>
      </ha-policy>

      <cluster-connections>
         <cluster-connection name="my-cluster">
            <connector-ref>netty-connector</connector-ref>
            <retry-interval>500</retry-interval>
            <use-duplicate-detection>true</use-duplicate-detection>
            <message-load-balancing>STRICT</message-load-balancing>

            <max-hops>1</max-hops>

            <static-connectors>
               <connector-ref>server1-connector</connector-ref>
            </static-connectors>
         </cluster-connection>
      </cluster-connections>

      <security-settings>
          <security-setting match="#">
            <permission type="createNonDurableQueue" roles="amq"/>
            <permission type="deleteNonDurableQueue" roles="amq"/>
            <permission type="createDurableQueue" roles="amq"/>
            <permission type="deleteDurableQueue" roles="amq"/>
            <permission type="createAddress" roles="amq"/>
            <permission type="deleteAddress" roles="amq"/>
            <permission type="consume" roles="amq"/>
            <permission type="browse" roles="amq"/>
            <permission type="send" roles="amq"/>
            <!-- we need this otherwise ./artemis data imp wouldn't work -->
            <permission type="manage" roles="amq"/>
         </security-setting>
         <security-setting match="clusterTopic">
            <permission type="createNonDurableQueue" roles="amq"/>
            <permission type="deleteNonDurableQueue" roles="amq"/>
            <permission type="createDurableQueue" roles="amq"/>
            <permission type="deleteDurableQueue" roles="amq"/>
            <permission type="createAddress" roles="amq"/>
            <permission type="deleteAddress" roles="amq"/>
            <permission type="consume" roles="amq"/>
            <permission type="browse" roles="amq"/>
            <permission type="send" roles="amq"/>
            <!-- we need this otherwise ./artemis data imp wouldn't work -->
            <permission type="manage" roles="amq"/>
         </security-setting>
         <security-setting match="clusterQueue">
            <permission type="createNonDurableQueue" roles="amq"/>
            <permission type="deleteNonDurableQueue" roles="amq"/>
            <permission type="createDurableQueue" roles="amq"/>
            <permission type="deleteDurableQueue" roles="amq"/>
            <permission type="createAddress" roles="amq"/>
            <permission type="deleteAddress" roles="amq"/>
            <permission type="consume" roles="amq"/>
            <permission type="browse" roles="amq"/>
            <permission type="send" roles="amq"/>
            <!-- we need this otherwise ./artemis data imp wouldn't work -->
            <permission type="manage" roles="amq"/>
         </security-setting>
      </security-settings>

      <address-settings>
         <!-- if you define auto-create on certain queues, management has to be auto-create -->
         <address-setting match="activemq.management#">
            <dead-letter-address>DLQ</dead-letter-address>
            <expiry-address>ExpiryQueue</expiry-address>
            <redelivery-delay>0</redelivery-delay>
            <!-- with -1 only the global-max-size is in use for limiting -->
            <max-size-bytes>-1</max-size-bytes>
            <message-counter-history-day-limit>10</message-counter-history-day-limit>
            <address-full-policy>PAGE</address-full-policy>
            <auto-create-queues>true</auto-create-queues>
            <auto-create-addresses>true</auto-create-addresses>
            <auto-create-jms-queues>true</auto-create-jms-queues>
            <auto-create-jms-topics>true</auto-create-jms-topics>
         </address-setting>
         <!--default for catch all-->
         <address-setting match="#">
            <dead-letter-address>DLQ</dead-letter-address>
            <expiry-address>ExpiryQueue</expiry-address>
            <redelivery-delay>0</redelivery-delay>
            <!-- with -1 only the global-max-size is in use for limiting -->
            <max-size-bytes>-1</max-size-bytes>
            <message-counter-history-day-limit>10</message-counter-history-day-limit>
            <address-full-policy>PAGE</address-full-policy>
            <auto-create-queues>true</auto-create-queues>
            <auto-create-addresses>true</auto-create-addresses>
            <auto-create-jms-queues>true</auto-create-jms-queues>
            <auto-create-jms-topics>true</auto-create-jms-topics>
         </address-setting>
      </address-settings>

      <addresses>
         <address name="DLQ">
            <anycast>
               <queue name="DLQ" />
            </anycast>
         </address>
         <address name="ExpiryQueue">
            <anycast>
               <queue name="ExpiryQueue" />
            </anycast>
         </address>
      <address name="clusterTopic">
            <multicast>
               <queue name="clusterQueue" />
            </multicast>
         </address>
      </addresses>
   </core>
</configuration>

Backup server configuration:

<?xml version='1.0'?>
<configuration xmlns="urn:activemq"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xi="http://www.w3.org/2001/XInclude"
               xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">

   <core xmlns="urn:activemq:core" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="urn:activemq:core ">

        <name>0.0.0.0</name>

        <persistence-enabled>true</persistence-enabled>

        <!-- this could be ASYNCIO, MAPPED, NIO
           ASYNCIO: Linux Libaio
           MAPPED: mmap files
           NIO: Plain Java Files
       -->

        <paging-directory>../brokerC0/data/paging</paging-directory>

        <bindings-directory>../brokerC0/data/bindings</bindings-directory>

        <journal-directory>../brokerC0/data/journal</journal-directory>

        <large-messages-directory>../brokerC0/data/large-messages</large-messages-directory>

        <security-enabled>false</security-enabled>

        <connectors>
            <connector name="netty-connector">tcp://localhost:61617</connector>
            <!-- connector to the server0 -->
            <connector name="server0-connector">tcp://localhost:61616</connector>
        </connectors>

        <!-- Acceptors -->
        <acceptors>
            <acceptor name="netty-acceptor">tcp://localhost:61617</acceptor>
        </acceptors>

        <ha-policy>
           <shared-store>
              <slave>
                 <allow-failback>true</allow-failback>
              </slave>
           </shared-store>
        </ha-policy>

        <!-- Clustering configuration -->
        <cluster-connections>
            <cluster-connection name="my-cluster">
                <connector-ref>netty-connector</connector-ref>
                <retry-interval>500</retry-interval>
                <use-duplicate-detection>true</use-duplicate-detection>
                <message-load-balancing>STRICT</message-load-balancing>
                <max-hops>1</max-hops>
                <static-connectors>
                    <connector-ref>server0-connector</connector-ref>
                </static-connectors>
            </cluster-connection>
        </cluster-connections>

        <security-settings>
            <security-setting match="#">
                <permission type="createNonDurableQueue" roles="amq"/>
                <permission type="deleteNonDurableQueue" roles="amq"/>
                <permission type="createDurableQueue" roles="amq"/>
                <permission type="deleteDurableQueue" roles="amq"/>
                <permission type="createAddress" roles="amq"/>
                <permission type="deleteAddress" roles="amq"/>
                <permission type="consume" roles="amq"/>
                <permission type="browse" roles="amq"/>
                <permission type="send" roles="amq"/>
                <!-- we need this otherwise ./artemis data imp wouldn't work -->
                <permission type="manage" roles="amq"/>
            </security-setting>
            <security-setting match="clusterTopic">
                <permission type="createNonDurableQueue" roles="amq"/>
                <permission type="deleteNonDurableQueue" roles="amq"/>
                <permission type="createDurableQueue" roles="amq"/>
                <permission type="deleteDurableQueue" roles="amq"/>
                <permission type="createAddress" roles="amq"/>
                <permission type="deleteAddress" roles="amq"/>
                <permission type="consume" roles="amq"/>
                <permission type="browse" roles="amq"/>
                <permission type="send" roles="amq"/>
                <!-- we need this otherwise ./artemis data imp wouldn't work -->
                <permission type="manage" roles="amq"/>
            </security-setting>
            <security-setting match="clusterQueue">
                <permission type="createNonDurableQueue" roles="amq"/>
                <permission type="deleteNonDurableQueue" roles="amq"/>
                <permission type="createDurableQueue" roles="amq"/>
                <permission type="deleteDurableQueue" roles="amq"/>
                <permission type="createAddress" roles="amq"/>
                <permission type="deleteAddress" roles="amq"/>
                <permission type="consume" roles="amq"/>
                <permission type="browse" roles="amq"/>
                <permission type="send" roles="amq"/>
                <!-- we need this otherwise ./artemis data imp wouldn't work -->
                <permission type="manage" roles="amq"/>
            </security-setting>
        </security-settings>

        <address-settings>
            <!-- if you define auto-create on certain queues, management has to be auto-create -->
            <address-setting match="activemq.management#">
                <dead-letter-address>DLQ</dead-letter-address>
                <expiry-address>ExpiryQueue</expiry-address>
                <redelivery-delay>0</redelivery-delay>
                <!-- with -1 only the global-max-size is in use for limiting -->
                <max-size-bytes>-1</max-size-bytes>
                <message-counter-history-day-limit>10</message-counter-history-day-limit>
                <address-full-policy>PAGE</address-full-policy>
                <auto-create-queues>true</auto-create-queues>
                <auto-create-addresses>true</auto-create-addresses>
                <auto-create-jms-queues>true</auto-create-jms-queues>
                <auto-create-jms-topics>true</auto-create-jms-topics>
            </address-setting>
            <!--default for catch all-->
            <address-setting match="#">
                <dead-letter-address>DLQ</dead-letter-address>
                <expiry-address>ExpiryQueue</expiry-address>
                <redelivery-delay>0</redelivery-delay>
                <!-- with -1 only the global-max-size is in use for limiting -->
                <max-size-bytes>-1</max-size-bytes>
                <message-counter-history-day-limit>10</message-counter-history-day-limit>
                <address-full-policy>PAGE</address-full-policy>
                <auto-create-queues>true</auto-create-queues>
                <auto-create-addresses>true</auto-create-addresses>
                <auto-create-jms-queues>true</auto-create-jms-queues>
                <auto-create-jms-topics>true</auto-create-jms-topics>
            </address-setting>
        </address-settings>

        <addresses>
            <address name="DLQ">
                <anycast>
                    <queue name="DLQ" />
                </anycast>
            </address>
            <address name="ExpiryQueue">
                <anycast>
                    <queue name="ExpiryQueue" />
                </anycast>
            </address>
            <address name="clusterTopic">
                <multicast>
                    <queue name="clusterQueue" />
                </multicast>
            </address>
        </addresses>
   </core>
</configuration>

1 Answer

梅飞宇
2023-03-14

If you stop the broker gracefully (e.g. with Ctrl-C), then no failover occurs by default. To trigger failover on a graceful shutdown, you need to configure the master broker like so:

<ha-policy>
   <shared-store>
      <master>
         <failover-on-shutdown>true</failover-on-shutdown>
      </master>  
   </shared-store>
</ha-policy>

And the slave like so:

<ha-policy>
   <shared-store>
      <slave>
         <failover-on-shutdown>true</failover-on-shutdown>
      </slave>  
   </shared-store>
</ha-policy>

Otherwise, you have to kill the broker (e.g. with kill -9) to trigger failover.
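
A quick way to confirm the failover behavior is a JMS client that keeps sending while you stop the live broker. Below is a minimal sketch, not from the original post: the class name, queue name, and message loop are illustrative, and it assumes the artemis-jms-client library is on the classpath. The URL lists both acceptors from the configurations above; ha=true lets the client track the cluster topology, and reconnectAttempts=-1 keeps it retrying while failover is in progress.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;

import org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory;

// Illustrative failover smoke test: sends one message per second so the
// live broker can be stopped mid-run and the reconnect observed.
public class FailoverCheck {
   public static void main(String[] args) throws Exception {
      // Both brokers in the URL, matching the acceptors on 61616/61617.
      ConnectionFactory cf = new ActiveMQConnectionFactory(
            "(tcp://localhost:61616,tcp://localhost:61617)?ha=true&reconnectAttempts=-1");
      try (Connection connection = cf.createConnection()) {
         connection.start();
         Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
         // Queue name is illustrative; auto-create is enabled above, so any name works.
         Queue queue = session.createQueue("clusterQueue");
         MessageProducer producer = session.createProducer(queue);
         for (int i = 0; i < 60; i++) {
            producer.send(session.createTextMessage("msg-" + i));
            System.out.println("sent msg-" + i);
            Thread.sleep(1000); // stop the live broker here to observe failover
         }
      }
   }
}

With failover-on-shutdown enabled on the master, stopping the live broker with Ctrl-C in the middle of the loop should cause the backup to activate on port 61617 and the client to resume sending without errors.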

 类似资料:
  • 我正在尝试用6台机器实现一个Redis集群。我有一个由六台机器组成的流浪集群: 运行redis服务器 我编辑了上述所有服务器的/etc/redis/redis.conf文件,添加了这个 然后我在六台机器中的一台上运行了这个程序; Redis集群已启动并运行。我通过在一台机器上设置值手动检查它显示在其他机器上。 我的问题是,当我关闭或停止任何一台主机上的redis server时,整个集群都会停止运

  • 我在端口7000、7001和7002上设置了三个服务器(一个主服务器和两个从服务器),在端口26379、26380和26381上设置了三个哨兵,它们都在同一台机器(ubuntu VM)上。 当我启动它们时,根据日志,一切看起来都很好,当我对哨兵运行信息命令时,看起来也很健康。但是当我放下主程序(通过Ctrl+C或redis-cli SLEEP命令使其停止工作)时,没有一个从程序实例被引入为新的主程

  • 因此,如果我理解正确的话,在检测并重新启动失败代理的环境中运行Artemis代理集群将提供与运行每个活动服务器都与备份配对的集群相同的语义(以及类似的可用性)。对吗?

  • 我注意到,当连接的Artemis节点宕机时,连接到节点2-4的客户机不会故障转移到其他3个可用的主节点,基本上不会发现其他节点。即使在原始节点恢复之后,客户端仍然无法建立连接。我从一个单独的堆栈溢出帖子中看到,不支持主到主故障转移。这是否意味着对于每个主节点,我也需要创建一个从节点来处理故障转移?这是否会导致两个实例点失败,而不是集群中有许多节点? 在一个单独的基本测试中,使用一个主从两个节点的集

  • 目前,我正在使用ActiveMQ,并计划将系统迁移到ActiveMQ Artemis。目前,我有3个生产者和3个消费者,只有一个ActiveMQ服务器/代理。

  • 我试图创建一个简单的redis高可用性设置与1主,1从和2哨兵。 当从故障转移到时,该设置工作正常。当恢复时,它将自己正确地注册为新的主服务器的从服务器。 但是,当作为主服务器关闭时, 不能作为主服务器返回。的日志进入循环,显示: 每个复制文档都声明: 自Redis4.0以来,当一个实例在故障转移后被提升为master时,它仍然能够与旧master的从机执行部分重新同步。 但日志似乎显示了另一种情