Question:

Intermittent performance issue with Apache Ignite 2.7.5

施永贞
2023-03-14

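We are facing intermittent performance issues in Ignite: response times become very high, and we see the warnings below in the logs. We have 10 indexed columns, and I don't see any problem with the indexes, since every column in the WHERE clause is indexed. The join is on affinity-collocated fields, which means the join only touches data within a given node and never goes across nodes.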

[21:48:30,765][WARNING][jvm-pause-detector-worker][IgniteKernal%PincodeGrid] Possible too long JVM pause: 4939 milliseconds.
[21:48:30,783][WARNING][query-#120%PincodeGrid%][IgniteH2Indexing] Query execution is too long [time=5052 ms, sql='SELECT

Please let me know if you can offer any help with this.


  • Apache Ignite version: 2.7.5

    Ignite persistence enabled (true)

    JVM options:

    /usr/java/jdk1.8.0_144/bin/java -XX:+AggressiveOpts -server -Xms20g -Xmx20g -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/etappdata/ignite/logs/prod/etail-prod-igniteEropotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100m -Xloggc://etappdata/ignite/logs/prod/etail-prod-ignite76-163/gc.log -XX:+PrintAdaptiveSizePolicy -XX:+UseTLAB -verbose:+parallelrefprocenable -XX:+UseLargePages -XX:+UseLargePages -XX:+AggressiveOpts -Djava.net.preferIPv4Stack=true port=8996 -Dcom.sun.management.jmxremote.rmi.port=8996 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Djava.rmi.server.hostname=etail-prod-ignite76-163 -XX:MaxDirectMemorySize=4g -javaagent://tmp/apminsight jmxremote.port=49112 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -DIGNITE_HOME=/ignite/apache-ignite-2.7.5-bin -DIGNITE_PROG_NAME=./bin/ignite.sh -cp /ignite/apache-ignite-2.7.5-bin/libs/://ignite/apache-ignite-2.7.5-bin/libs/://ignite/apache-ignite-2.7.5-bin/.apache.ignite.startup.cmdline.commandlineStartup config/config-cache.xml

    Additional details added

    Edit 2: GC logs

    2020-12-01T22:49:31.729+0530: 15.630: [GC pause (Metadata GC Threshold) (young) (initial-mark) 15.630: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 0, predicted base time: 38.43 ms, remaining time: 161.57 ms, target pause time: 200.00 ms]
    15.630: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 24 regions, survivors: 2 regions, predicted young region time: 356.58 ms]
    15.630: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 24 regions, survivors: 2 regions, old: 0 regions, predicted pause time: 395.01 ms, target pause time: 200.00 ms]
    15.657: [G1Ergonomics (Mixed GCs) do not start mixed GCs, reason: concurrent cycle is about to start], 0.0274990 secs]
    [Parallel Time: 15.8 ms, GC Workers: 21]
    [GC Worker Start (ms): Min: 15630.2, Avg: 15630.5, Max: 15630.8, Diff: 0.7]
    [Ext Root Scanning (ms): Min: 1.6, Avg: 3.4, Max: 11.4, Diff: 9.8, Sum: 71.8]
    [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
    [Processed Buffers: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0]
    [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
    [Code Root Scanning (ms): Min: 0.0, Avg: 1.2, Max: 12.6, Diff: 12.6, Sum: 24.2]
    [Object Copy (ms): Min: 0.0, Avg: 9.5, Max: 12.0, Diff: 11.9, Sum: 199.9]
    [Termination (ms): Min: 0.0, Avg: 0.7, Max: 0.8, Diff: 0.7, Sum: 14.8]
    [Termination Attempts: Min: 1, Avg: 2.2, Max: 4, Diff: 3, Sum: 47]
    [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.2, Diff: 0.1, Sum: 1.0]
    [GC Worker Total (ms): Min: 14.5, Avg: 14.8, Max: 15.2, Diff: 0.7, Sum: 311.8]
    [GC Worker End (ms): Min: 15645.3, Avg: 15645.3, Max: 15645.4, Diff: 0.1]
    [Code Root Fixup: 0.6 ms]
    [Code Root Purge: 0.0 ms]
    [Clear CT: 0.5 ms]
    [Other: 10.5 ms]
    [Choose CSet: 0.0 ms]
    [Ref Proc: 8.3 ms]
    [Ref Enq: 0.5 ms]
    [Redirty Cards: 0.5 ms]
    [Humongous Register: 0.0 ms]
    [Humongous Reclaim: 0.0 ms]
    [Free CSet: 0.2 ms]
    [Eden: 192.0M(1008.0M)->0.0B(984.0M) Survivors: 16.0M->40.0M Heap: 198.0M(20.0G)->33.7M(20.0G)]
    [Times: user=0.31 sys=0.00, real=0.03 secs]
    2020-12-01T22:49:31.757+0530: 15.657: [GC concurrent-root-region-scan-start]
    2020-12-01T22:49:31.764+0530: 15.664: [GC concurrent-root-region-scan-end, 0.0067826 secs]
    2020-12-01T22:49:31.764+0530: 15.664: [GC concurrent-mark-start]
    2020-12-01T22:49:31.765+0530: 15.666: [GC concurrent-mark-end, 0.0015043 secs]
    2020-12-01T22:49:31.766+0530: 15.666: [GC remark 2020-12-01T22:49:31.766+0530: 15.666: [Finalize Marking, 0.0010641 secs] 2020-12-01T22:49:31.767+0530: 15.667: [GC ref-proc, 0.0100232 secs] 2020-12-01T22:49:31.777+0530: 15.677: [Unloading, 0.0072592 secs], 0.0191010 secs]
    [Times: user=0.20 sys=0.00, real=0.02 secs]
    2020-12-01T22:49:31.785+0530: 15.685: [GC cleanup 37M->37M(20G), 0.0085803 secs]
    [Times: user=0.04 sys=0.00, real=0.01 secs]
    2020-12-01T22:53:45.090+0530: 268.990: [GC pause (G1 Evacuation Pause) (young) 268.990: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 0, predicted base time: 30.72 ms, remaining time: 169.28 ms, target pause time: 200.00 ms]
    268.990: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 123 regions, survivors: 5 regions, predicted young region time: 1342.47 ms]
    268.990: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 123 regions, survivors: 5 regions, old: 0 regions, predicted pause time: 1373.18 ms, target pause time: 200.00 ms]
    269.040: [G1Ergonomics (Mixed GCs) do not start mixed GCs, reason: candidate old regions not available]
    , 0.0494933 secs]
    [Parallel Time: 31.8 ms, GC Workers: 21]
    [GC Worker Start (ms): Min: 268991.8, Avg: 268992.1, Max: 268992.5, Diff: 0.7]
    [Ext Root Scanning (ms): Min: 0.9, Avg: 1.9, Max: 5.7, Diff: 4.7, Sum: 39.6]
    [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
    [Processed Buffers: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0]
    [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.5]
    [Code Root Scanning (ms): Min: 0.0, Avg: 1.0, Max: 7.0, Diff: 7.0, Sum: 20.2]
    [Object Copy (ms): Min: 21.7, Avg: 28.1, Max: 29.1, Diff: 7.4, Sum: 591.0]
    [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
    [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 21]
    [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.8]
    
  • 1 answer

    李良策
    2023-03-14

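    With a 20G heap, you can expect to eventually hit a full GC that takes 4 seconds. I can believe that. Why do you need so much heap with Apache Ignite? How much heap is actually used during normal operation? You could also take a heap dump and look for memory leaks.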
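
    One way to answer the "how much heap is used" question is to sample the heap through the standard `java.lang.management` API. The following is a minimal sketch, not part of Ignite: the class name `HeapUsageProbe` and the one-second sampling interval are assumptions made for illustration. It reports on the JVM it runs in, so it would have to run inside the Ignite node's JVM (or be adapted to read the same MXBean over the JMX port the node already exposes).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: samples heap usage of the JVM it runs in.
final class HeapUsageProbe {
    private static final long MIB = 1024L * 1024L;

    /** Heap bytes currently in use (live objects plus garbage not yet collected). */
    static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Maximum heap in bytes (what -Xmx controls), or -1 if the JVM reports it as undefined. */
    static long maxHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getMax();
    }

    public static void main(String[] args) throws InterruptedException {
        // Take a few samples one second apart and print them in MiB.
        for (int i = 0; i < 5; i++) {
            System.out.printf("heap used: %d MiB, max: %d MiB%n",
                    usedHeapBytes() / MIB, maxHeapBytes() / MIB);
            Thread.sleep(1000);
        }
    }
}
```

    If the used figure stays far below the 20G limit over a representative period, that would support shrinking the heap.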

    By default, Apache Ignite nodes do not store data on-heap, so the heap size cannot be explained by data volume alone.

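    I ran your GC log through gceasy.io, and it found several GCs that took around 2 seconds. I'm not sure that explains the 4 s pause you observed, but you may well be seeing noticeable 2 s pauses from GC, which is in the same ballpark.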
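
    A similar check can be done locally. The sketch below assumes the JDK 8 log format shown in the question, where each collection event ends with a `[Times: user=... sys=..., real=... secs]` line, and lists every event whose wall-clock (`real`) time meets a threshold. The class name `GcPauseScanner` and the default threshold of 1 second are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds long GC events in a JDK 8 style GC log.
final class GcPauseScanner {
    // Matches the wall-clock part of "[Times: user=0.31 sys=0.00, real=0.03 secs]".
    private static final Pattern REAL_TIME = Pattern.compile("real=(\\d+(?:\\.\\d+)?) secs");

    /** Returns the real= durations (in seconds) that are at or above the threshold, in log order. */
    static List<Double> longPauses(List<String> lines, double thresholdSecs) {
        List<Double> pauses = new ArrayList<>();
        for (String line : lines) {
            Matcher m = REAL_TIME.matcher(line);
            while (m.find()) {
                double secs = Double.parseDouble(m.group(1));
                if (secs >= thresholdSecs) {
                    pauses.add(secs);
                }
            }
        }
        return pauses;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java GcPauseScanner <gc.log> [thresholdSecs]");
            return;
        }
        double threshold = args.length > 1 ? Double.parseDouble(args[1]) : 1.0;
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (double secs : longPauses(lines, threshold)) {
            System.out.printf("GC event took %.2f s%n", secs);
        }
    }
}
```

    The excerpt posted in the question only contains events of 0.01 to 0.03 s, so nothing in it would be reported at a 1 s threshold; the long events would have to come from later in the log.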

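    So you need to figure out why the JVM sometimes slows down. It could be due to IO, virtualization pauses, and so on. Also, if your heap never grows beyond 2G, maybe you should run with something like -Xmx4G?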
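
    To tell GC pauses apart from stalls caused by IO or the hypervisor, one generic technique is to sleep in short steps and measure how far each sleep overshoots. The sketch below illustrates that idea only; it is not Ignite's pause detector, and the class name `PauseProbe`, the 50 ms step and the 20 iterations are assumptions.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: detects whole-JVM stalls by measuring sleep overshoot.
final class PauseProbe {
    /** How many milliseconds a sleep overshot its requested duration; never negative. */
    static long overshootMillis(long requestedMillis, long elapsedNanos) {
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
        return Math.max(0L, elapsedMillis - requestedMillis);
    }

    /** Sleeps in short steps and returns the largest overshoot observed. */
    static long maxOvershootMillis(int iterations, long stepMillis) throws InterruptedException {
        long worst = 0L;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            Thread.sleep(stepMillis);
            long elapsed = System.nanoTime() - start;
            worst = Math.max(worst, overshootMillis(stepMillis, elapsed));
        }
        return worst;
    }

    public static void main(String[] args) throws InterruptedException {
        // About one second of probing: 20 sleeps of 50 ms each.
        long worst = maxOvershootMillis(20, 50);
        System.out.println("largest stall observed: " + worst + " ms");
    }
}
```

    A stall that lines up with an entry in the GC log is most likely a collection pause; a stall with no matching GC entry suggests looking at the OS, the disks or the virtualization layer instead.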
