Question:

Infinispan 9.4.16 on JBoss EAP 7.3: lock contention on a replicated cache, threads on 2 nodes in TIMED_WAITING (parking)

党航
2023-03-14

I have an application that currently relies on an Infinispan replicated cache to share a work queue across all nodes. The queue is fairly standard: its head, tail, and size pointers live permanently in an Infinispan map.
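
For context, here is a minimal sketch of that pattern. It is hypothetical: the actual com.siperian.mrm.match.InfinispanQueue source is not shown in the question, and only the QUEUE_TAIL_C_PARTY key name is confirmed by the error log further down; the other key names are invented for illustration.

import org.infinispan.AdvancedCache;

public class SharedQueueSketch {
    // QUEUE_TAIL_C_PARTY appears in the ISPN000299 error below;
    // the HEAD and SIZE key names are made up for this sketch.
    private static final String HEAD = "QUEUE_HEAD_C_PARTY";
    private static final String TAIL = "QUEUE_TAIL_C_PARTY";
    private static final String SIZE = "QUEUE_SIZE_C_PARTY";

    private final AdvancedCache<String, Long> cache;

    public SharedQueueSketch(AdvancedCache<String, Long> cache) {
        this.cache = cache;
    }

    // Runs inside a pessimistic transaction. AdvancedCache.lock() acquires
    // cluster-wide key locks, and is where the "default task" threads in
    // both dumps below are parked.
    public void initialize() {
        cache.lock(HEAD, TAIL, SIZE);
        cache.putIfAbsent(HEAD, 0L);
        cache.putIfAbsent(TAIL, 0L);
        cache.putIfAbsent(SIZE, 0L);
    }
}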

We upgraded from Infinispan 7.2.5 to 9.4.16 and noticed that lock performance is much worse than before. I managed to capture thread dumps from both nodes while they were both trying to initialize the queue at the same time. With Infinispan 7.2.5, locking and synchronization performed very well with no issues; now we are seeing lock timeouts and far more failures.

Partial stack trace from the Node #1 thread dump taken at 2021-04-20 13:45:13:

"default task-2" #600 prio=5 os_prio=0 tid=0x000000000c559000 nid=0x1f8a waiting on condition [0x00007f4df3f72000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000006e1f4fec0> (a java.util.concurrent.CompletableFuture$Signaller)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1695)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
    at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1775)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
    at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:105)
    at org.infinispan.interceptors.impl.SimpleAsyncInvocationStage.get(SimpleAsyncInvocationStage.java:38)
    at org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invoke(AsyncInterceptorChainImpl.java:250)
    at org.infinispan.cache.impl.CacheImpl.lock(CacheImpl.java:1077)
    at org.infinispan.cache.impl.CacheImpl.lock(CacheImpl.java:1057)
    at org.infinispan.cache.impl.AbstractDelegatingAdvancedCache.lock(AbstractDelegatingAdvancedCache.java:286)
    at org.infinispan.cache.impl.EncoderCache.lock(EncoderCache.java:318)
    at com.siperian.mrm.match.InfinispanQueue.initialize(InfinispanQueue.java:88)

Partial stack trace from the Node #2 thread dump taken at 2021-04-20 13:45:04:

"default task-2" #684 prio=5 os_prio=0 tid=0x0000000011f26000 nid=0x3c60 waiting on condition [0x00007f55107e4000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x0000000746bd36d8> (a java.util.concurrent.CompletableFuture$Signaller)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1695)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
    at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1775)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
    at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:105)
    at org.infinispan.interceptors.impl.SimpleAsyncInvocationStage.get(SimpleAsyncInvocationStage.java:38)
    at org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invoke(AsyncInterceptorChainImpl.java:250)
    at org.infinispan.cache.impl.CacheImpl.lock(CacheImpl.java:1077)
    at org.infinispan.cache.impl.CacheImpl.lock(CacheImpl.java:1057)
    at org.infinispan.cache.impl.AbstractDelegatingAdvancedCache.lock(AbstractDelegatingAdvancedCache.java:286)
    at org.infinispan.cache.impl.EncoderCache.lock(EncoderCache.java:318)
    at com.siperian.mrm.match.InfinispanQueue.initialize(InfinispanQueue.java:88)

Client-side error that appeared on the console of the machine running Node #1:

2021-04-20 13:45:49,069 ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (jgroups-15,infinispan-cleanse-cluster_192.168.0.24_cmx_system105,N1618938080334-63633(machine-id=M1618938080334)) ISPN000136: Error executing command LockControlCommand on Cache 'orclmdm-MDM_SAMPLE105/FUZZY_MATCH', writing keys []: org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 60 seconds for key QUEUE_TAIL_C_PARTY and requestor GlobalTx:N1618938080334-63633(machine-id=M1618938080334):429. Lock is held by GlobalTx:N1618938062946-60114(machine-id=M1618938062946):420
    at org.infinispan.util.concurrent.locks.impl.DefaultLockManager$KeyAwareExtendedLockPromise.get(DefaultLockManager.java:288)
    at org.infinispan.util.concurrent.locks.impl.DefaultLockManager$KeyAwareExtendedLockPromise.lock(DefaultLockManager.java:261)
    at org.infinispan.util.concurrent.locks.impl.DefaultLockManager$CompositeLockPromise.lock(DefaultLockManager.java:348)
    at org.infinispan.interceptors.locking.PessimisticLockingInterceptor.localLockCommandWork(PessimisticLockingInterceptor.java:208)
    at org.infinispan.interceptors.locking.PessimisticLockingInterceptor.lambda$new$0(PessimisticLockingInterceptor.java:46)
    at org.infinispan.interceptors.InvocationSuccessFunction.apply(InvocationSuccessFunction.java:25)
    at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.invokeQueuedHandlers(QueueAsyncInvocationStage.java:118)
    at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:81)
    at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:30)
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
    at org.infinispan.remoting.transport.AbstractRequest.complete(AbstractRequest.java:67)
    at org.infinispan.remoting.transport.impl.MultiTargetRequest.onResponse(MultiTargetRequest.java:102)
    at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1369)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1272)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:126)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1417)
    at org.jgroups.JChannel.up(JChannel.java:816)
    at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:900)
    at org.jgroups.protocols.pbcast.STATE_TRANSFER.up(STATE_TRANSFER.java:128)
    at org.jgroups.protocols.RSVP.up(RSVP.java:163)
    at org.jgroups.protocols.FRAG2.up(FRAG2.java:177)
    at org.jgroups.protocols.FlowControl.up(FlowControl.java:339)
    at org.jgroups.protocols.FlowControl.up(FlowControl.java:339)
    at org.jgroups.protocols.pbcast.GMS.up(GMS.java:872)
    at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:240)
    at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1008)
    at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:734)
    at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:389)
    at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:590)
    at org.jgroups.protocols.BARRIER.up(BARRIER.java:171)
    at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:131)
    at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:203)
    at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:253)
    at org.jgroups.protocols.MERGE3.up(MERGE3.java:280)
    at org.jgroups.protocols.Discovery.up(Discovery.java:295)
    at org.jgroups.protocols.TP.passMessageUp(TP.java:1250)
    at org.jgroups.util.SubmitToThreadPool$SingleMessageHandler.run(SubmitToThreadPool.java:87)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Infinispan configuration:

<?xml version="1.0" encoding="UTF-8"?>
<infinispan
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:infinispan:config:9.4 http://www.infinispan.org/schemas/infinispan-config-9.4.xsd"
        xmlns="urn:infinispan:config:9.4">    

    <jgroups>
        <stack-file name="mdmudp" path="$cmx.home$/jgroups-udp.xml" />
        <stack-file name="mdmtcp" path="$cmx.home$/jgroups-tcp.xml" />
    </jgroups>

    <cache-container name="MDMCacheManager" statistics="true"
        shutdown-hook="DEFAULT">
        <transport stack="mdmudp" cluster="infinispan-cluster"
            node-name="$node$" machine="$machine$" />

        <jmx domain="org.infinispan.mdm.hub"/>  

        <replicated-cache name="FUZZY_MATCH" statistics="true" unreliable-return-values="false">
            <locking isolation="READ_COMMITTED" acquire-timeout="60000"
                concurrency-level="5000" striping="false" />
            <transaction
                transaction-manager-lookup="org.infinispan.transaction.lookup.GenericTransactionManagerLookup"
                stop-timeout="30000" auto-commit="true" locking="PESSIMISTIC"
                mode="NON_XA" notifications="true" />
        </replicated-cache>

    </cache-container>
</infinispan>

By default we use UDP multicast; here is the UDP configuration:

<!--
  Default stack using IP multicasting. It is similar to the "udp"
  stack in stacks.xml, but doesn't use streaming state transfer and flushing
  author: Bela Ban
-->

<config xmlns="urn:org:jgroups"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups.xsd">
    <UDP
         mcast_port="${jgroups.udp.mcast_port:46688}"
         ip_ttl="4"
         tos="8"
         ucast_recv_buf_size="5M"
         ucast_send_buf_size="5M"
         mcast_recv_buf_size="5M"
         mcast_send_buf_size="5M"
         max_bundle_size="64K"
         enable_diagnostics="true"
         thread_naming_pattern="cl"

         thread_pool.enabled="true"
         thread_pool.min_threads="2"
         thread_pool.max_threads="8"
         thread_pool.keep_alive_time="5000"/>

    <PING />
    <MERGE3 max_interval="30000"
            min_interval="10000"/>
    <FD_SOCK/>
    <FD_ALL/>
    <VERIFY_SUSPECT timeout="1500"  />
    <BARRIER />
    <pbcast.NAKACK2 xmit_interval="500"
                    xmit_table_num_rows="100"
                    xmit_table_msgs_per_row="2000"
                    xmit_table_max_compaction_time="30000"
                    use_mcast_xmit="false"
                    discard_delivered_msgs="true"/>
    <UNICAST3 xmit_interval="500"
              xmit_table_num_rows="100"
              xmit_table_msgs_per_row="2000"
              xmit_table_max_compaction_time="60000"
              conn_expiry_timeout="0"/>
    <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
                   max_bytes="4M"/>
    <pbcast.GMS print_local_addr="true" join_timeout="2000"
                view_bundling="true"/>
    <UFC max_credits="2M"
         min_threshold="0.4"/>
    <MFC max_credits="2M"
         min_threshold="0.4"/>
    <FRAG2 frag_size="60K"  />
    <RSVP resend_interval="2000" timeout="10000"/>
    <pbcast.STATE_TRANSFER />
    <!-- pbcast.FLUSH  /-->
</config>

Any ideas about the configuration would be appreciated. What happens is that the nodes time out and neither queue is initialized correctly (null keys). Thanks in advance. For what it's worth, 24 threads on each node (48 in total) can access the shared queue.

1 Answer

龙成仁
2023-03-14

I did some research, and it turns out that locking on a replicated cache is performed on the remote nodes first, and only then is the key locked locally. I believe a deadlock is possible if node1 tries to lock on node2 while node2 simultaneously tries to lock on node1. So I changed all caches to use Flag.FAIL_SILENTLY and Flag.ZERO_LOCK_ACQUISITION_TIMEOUT, and added extra client-side retry logic when adding or removing queue elements. Initial testing shows things are much better now.
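
A minimal sketch of that change follows. Flag.FAIL_SILENTLY, Flag.ZERO_LOCK_ACQUISITION_TIMEOUT, and AdvancedCache.withFlags() are real Infinispan APIs; the method name, key type, and fixed-backoff retry policy are illustrative.

import org.infinispan.AdvancedCache;
import org.infinispan.context.Flag;

// Assumes a running pessimistic transaction, as in the original code.
// ZERO_LOCK_ACQUISITION_TIMEOUT makes the lock attempt non-blocking, and
// FAIL_SILENTLY turns a failed acquisition into a 'false' return value
// instead of a TimeoutException, so the caller can back off and retry.
static boolean lockWithRetry(AdvancedCache<String, Long> cache, String key,
                             int maxAttempts, long backoffMillis)
        throws InterruptedException {
    AdvancedCache<String, Long> tryLockCache =
            cache.withFlags(Flag.FAIL_SILENTLY, Flag.ZERO_LOCK_ACQUISITION_TIMEOUT);
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        if (tryLockCache.lock(key)) {
            return true;              // lock held; proceed with the queue update
        }
        Thread.sleep(backoffMillis);  // simple fixed backoff between retries
    }
    return false;                     // give up and let the caller decide
}

With a zero acquisition timeout there is nothing to park on, so the 60-second ISPN000299 waits disappear; the trade-off is that contention now surfaces as a false return that the client code must handle.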

I am curious what changed between Infinispan 7 and the later versions that makes pessimistic locking perform worse in the newer release. The old client code (with no flags or retry logic) worked fine under the same test conditions. I am suspicious of the changes around futures and the ForkJoinPool, since I have run into problems using futures with the ForkJoinPool in other projects and had to fall back to the old approach of plain executors.
