我们正在使用一个3站点,每个站点3个节点的Cassandra 1.1.12集群,每个节点分配了8GB内存。我们定期在节点上看到长时间的GC暂停,这扰乱了我们的应用程序实时要求。我们运行的系统是8个核心系统,具有24GB内存。
我们已经看到了120秒的暂停,它会停止世界GC。
我们在JDK 1.7.0_04上运行这些标志
-XX:+UseThreadPriorities
-XX:ThreadPriorityPolicy=42
-Xms8G
-Xmx8G
-Xmn1600M
-XX:+HeapDumpOnOutOfMemoryError
-Xss180k
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintPromotionFailure
-XX:PrintFLSStatistics=1
以下是导致长时间暂停的详细GC日志:
2014-02-23T11:50:19.231-0500: 2119627.980: [GC Before GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 145905916 Max Chunk Size: 3472057 Number of Blocks: 160577 Av. Block Size: 908 Tree Height: 146 Before GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 0 Max Chunk Size: 0 Number of Blocks: 0 Tree Height: 0 2119627.981: [ParNew Desired survivor size 83886080 bytes, new threshold 1 (max 1) - age 1: 32269664 bytes, 32269664 total : 1356995K->44040K(1474560K), 4.5270760 secs] 5829345K->4546031K(8224768K)After GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 144857324 Max Chunk Size: 2423465 Number of Blocks: 160577 Av. Block Size: 902 Tree Height: 146 After GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 0 Max Chunk Size: 0 Number of Blocks: 0 Tree Height: 0 , 4.5295190 secs] [Times: user=24.78 sys=0.07, real=4.52 secs] Heap after GC invocations=82068 (full 2561): par new generation total 1474560K, used 44040K [0x00000005fae00000, 0x000000065ee00000, 0x000000065ee00000) eden space 1310720K, 0% used [0x00000005fae00000, 0x00000005fae00000, 0x000000064ae00000) from space 163840K, 26% used [0x000000064ae00000, 0x000000064d902028, 0x0000000654e00000) to space 163840K, 0% used [0x0000000654e00000, 0x0000000654e00000, 0x000000065ee00000) concurrent mark-sweep generation total 6750208K, used 4501991K [0x000000065ee00000, 0x00000007fae00000, 0x00000007fae00000) concurrent-mark-sweep perm gen total 40012K, used 23966K [0x00000007fae00000, 0x00000007fd513000, 0x0000000800000000) } Total time for which application threads were stopped: 4.5524690 seconds {Heap before GC invocations=82068 (full 2561): par new generation total 1474560K, used 1354760K [0x00000005fae00000, 0x000000065ee00000, 0x000000065ee00000) eden space 1310720K, 100% used [0x00000005fae00000, 0x000000064ae00000, 0x000000064ae00000) from space 163840K, 26% used [0x000000064ae00000, 0x000000064d902028, 0x0000000654e00000) to space 163840K, 0% used [0x0000000654e00000, 0x0000000654e00000, 0x000000065ee00000) concurrent mark-sweep generation total 6750208K, used 4501991K [0x000000065ee00000, 0x00000007fae00000, 0x00000007fae00000) concurrent-mark-sweep perm gen total 40012K, used 23966K [0x00000007fae00000, 0x00000007fd513000, 0x0000000800000000) 2014-02-23T11:51:14.221-0500: 2119682.970: [GC Before GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 144857324 Max Chunk Size: 2423465 Number of Blocks: 160577 Av. Block Size: 902 Tree Height: 146 Before GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 0 Max Chunk Size: 0 Number of Blocks: 0 Tree Height: 0 2119682.971: [ParNew Desired survivor size 83886080 bytes, new threshold 1 (max 1) - age 1: 41744088 bytes, 41744088 total : 1354760K->52443K(1474560K), 2.1589280 secs] 5856751K->4582809K(8224768K)After GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 143937754 Max Chunk Size: 1505947 Number of Blocks: 160575 Av. Block Size: 896 Tree Height: 146 After GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 0 Max Chunk Size: 0 Number of Blocks: 0 Tree Height: 0 , 2.1613420 secs] [Times: user=12.82 sys=0.04, real=2.16 secs] Heap after GC invocations=82069 (full 2561): par new generation total 1474560K, used 52443K [0x00000005fae00000, 0x000000065ee00000, 0x000000065ee00000) eden space 1310720K, 0% used [0x00000005fae00000, 0x00000005fae00000, 0x000000064ae00000) from space 163840K, 32% used [0x0000000654e00000, 0x0000000658136e38, 0x000000065ee00000) to space 163840K, 0% used [0x000000064ae00000, 0x000000064ae00000, 0x0000000654e00000) concurrent mark-sweep generation total 6750208K, used 4530365K [0x000000065ee00000, 0x00000007fae00000, 0x00000007fae00000) concurrent-mark-sweep perm gen total 40012K, used 23966K [0x00000007fae00000, 0x00000007fd513000, 0x0000000800000000) } Total time for which application threads were stopped: 2.1719930 seconds {Heap before GC invocations=82069 (full 2561): par new generation total 1474560K, used 1363163K [0x00000005fae00000, 0x000000065ee00000, 0x000000065ee00000) eden space 1310720K, 100% used [0x00000005fae00000, 0x000000064ae00000, 0x000000064ae00000) from space 163840K, 32% used [0x0000000654e00000, 0x0000000658136e38, 0x000000065ee00000) to space 163840K, 0% used [0x000000064ae00000, 0x000000064ae00000, 0x0000000654e00000) concurrent mark-sweep generation total 6750208K, used 4530365K [0x000000065ee00000, 0x00000007fae00000, 0x00000007fae00000) concurrent-mark-sweep perm gen total 40012K, used 23966K [0x00000007fae00000, 0x00000007fd513000, 0x0000000800000000) 2014-02-23T11:52:33.089-0500: 2119761.837: [GC Before GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 143937754 Max Chunk Size: 1505947 Number of Blocks: 160575 Av. Block Size: 896 Tree Height: 146 Before GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 0 Max Chunk Size: 0 Number of Blocks: 0 Tree Height: 0 2119761.839: [ParNew Desired survivor size 83886080 bytes, new threshold 1 (max 1) - age 1: 37906760 bytes, 37906760 total : 1363163K->48710K(1474560K), 3.5105890 secs] 5893529K->4611208K(8224768K)After GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 142756036 Max Chunk Size: 326281 Number of Blocks: 160573 Av. Block Size: 889 Tree Height: 146 After GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 0 Max Chunk Size: 0 Number of Blocks: 0 Tree Height: 0 , 3.5130550 secs] [Times: user=18.81 sys=0.08, real=3.52 secs] Heap after GC invocations=82070 (full 2561): par new generation total 1474560K, used 48710K [0x00000005fae00000, 0x000000065ee00000, 0x000000065ee00000) eden space 1310720K, 0% used [0x00000005fae00000, 0x00000005fae00000, 0x000000064ae00000) from space 163840K, 29% used [0x000000064ae00000, 0x000000064dd91af8, 0x0000000654e00000) to space 163840K, 0% used [0x0000000654e00000, 0x0000000654e00000, 0x000000065ee00000) concurrent mark-sweep generation total 6750208K, used 4562497K [0x000000065ee00000, 0x00000007fae00000, 0x00000007fae00000) concurrent-mark-sweep perm gen total 40012K, used 23966K [0x00000007fae00000, 0x00000007fd513000, 0x0000000800000000) } Total time for which application threads were stopped: 3.5236060 seconds {Heap before GC invocations=82070 (full 2561): par new generation total 1474560K, used 1359430K [0x00000005fae00000, 0x000000065ee00000, 0x000000065ee00000) eden space 1310720K, 100% used [0x00000005fae00000, 0x000000064ae00000, 0x000000064ae00000) from space 163840K, 29% used [0x000000064ae00000, 0x000000064dd91af8, 0x0000000654e00000) to space 163840K, 0% used [0x0000000654e00000, 0x0000000654e00000, 0x000000065ee00000) concurrent mark-sweep generation total 6750208K, used 4562497K [0x000000065ee00000, 0x00000007fae00000, 0x00000007fae00000) concurrent-mark-sweep perm gen total 40012K, used 23966K [0x00000007fae00000, 0x00000007fd513000, 0x0000000800000000) 2014-02-23T11:55:59.448-0500: 2119968.196: [GC Before GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 142756036 Max Chunk Size: 326281 Number of Blocks: 160573 Av. Block Size: 889 Tree Height: 146 Before GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 0 Max Chunk Size: 0 Number of Blocks: 0 Tree Height: 0 2119968.198: [ParNew (0: promotion failure size = 131074) (1: promotion failure size = 131074) (2: promotion failure size = 131074) (4: promotion failure size = 131074) (7: promotion failure size = 131074) (promotion failed) Desired survivor size 83886080 bytes, new threshold 1 (max 1) - age 1: 34318480 bytes, 34318480 total : 1359430K->1353373K(1474560K), 1.5971880 secs]2119969.795: [CMSCMS: Large block 0x000000073aa59270 : 4586726K->3600612K(6750208K), 149.1735470 secs] 5921928K->3600612K(8224768K), [CMS Perm : 23966K->23925K(40012K)]After GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 403131826 Max Chunk Size: 403131826 Number of Blocks: 1 Av. Block Size: 403131826 Tree Height: 1 After GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 0 Max Chunk Size: 0 Number of Blocks: 0 Tree Height: 0 , 150.7724630 secs] [Times: user=28.89 sys=11.57, real=150.75 secs] Heap after GC invocations=82071 (full 2562): par new generation total 1474560K, used 0K [0x00000005fae00000, 0x000000065ee00000, 0x000000065ee00000) eden space 1310720K, 0% used [0x00000005fae00000, 0x00000005fae00000, 0x000000064ae00000) from space 163840K, 0% used [0x0000000654e00000, 0x0000000654e00000, 0x000000065ee00000) to space 163840K, 0% used [0x000000064ae00000, 0x000000064ae00000, 0x0000000654e00000) concurrent mark-sweep generation total 6750208K, used 3600612K [0x000000065ee00000, 0x00000007fae00000, 0x00000007fae00000) concurrent-mark-sweep perm gen total 40012K, used 23925K [0x00000007fae00000, 0x00000007fd513000, 0x0000000800000000) } Total time for which application threads were stopped: 150.7833360 seconds
我还设置了一个夜间作业,强制GC在凌晨2点在实例上运行,希望它能缓解这些问题——它做了一些好事,但每隔几天节点仍会出现问题。
--增加了细节在将NewGen更改为2GB(从1600M)后,我们在运行的前12小时内看到了15秒的暂停。
... 类直方图转储:
15772.786: [Class Histogram (before full gc): num #instances #bytes class name ---------------------------------------------- 1: 9743656 526609104 [B 2: 9176097 440452656 java.nio.HeapByteBuffer 3: 8152787 326111480 java.math.BigInteger 4: 8126173 321393760 [I 5: 207997 307212904 [J 6: 8940730 214577520 java.lang.Long 7: 8121743 194921832 org.apache.cassandra.db.DecoratedKey 8: 8121399 129942384 org.apache.cassandra.dht.BigIntegerToken 9: 174374 78049856 [Ljava.lang.Object; 10: 914261 43884528 edu.stanford.ppl.concurrent.SnapTreeMap$Node 11: 1112269 35592608 java.util.concurrent.ConcurrentHashMap$HashEntry 12: 1101827 35258464 com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 13: 306955 29467680 edu.stanford.ppl.concurrent.CopyOnWriteManager$COWEpoch 14: 1216257 29190168 org.apache.cassandra.cache.KeyCacheKey 15: 1111387 26673288 com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$WeightedValue 16: 427695 20529360 edu.stanford.ppl.concurrent.SnapTreeMap$RootHolder 17: 417278 16691120 org.apache.cassandra.db.ExpiringColumn 18: 306955 9822560 edu.stanford.ppl.concurrent.CopyOnWriteManager$Latch 19: 401296 9631104 org.apache.cassandra.db.ISortedColumns$DeletionInfo 20: 305 8691760 [Ljava.util.concurrent.ConcurrentHashMap$HashEntry; 21: 528460 8455360 java.util.concurrent.atomic.AtomicReference 22: 263965 8446880 edu.stanford.ppl.concurrent.SnapTreeMap 23: 342775 8226600 org.apache.cassandra.db.ColumnFamily 24: 263965 6335160 org.apache.cassandra.db.AtomicSortedColumns$Holder 25: 193133 6180256 org.apache.cassandra.db.DeletedColumn 26: 179519 5744608 java.util.ArrayList$Itr 27: 221515 5316360 java.util.concurrent.ConcurrentSkipListMap$Node 28: 35607 5025400 29: 35607 4852600 30: 61748 4445856 org.apache.cassandra.io.sstable.SSTableIdentityIterator 31: 264409 4230544 edu.stanford.ppl.concurrent.SnapTreeMap$COWMgr 32: 52605 4216504 [Ljava.util.HashMap$Entry; 33: 3389 3868680 34: 221414 3542624 org.apache.cassandra.db.AtomicSortedColumns 35: 107140 3428480 org.apache.cassandra.db.Column 36: 69734 3347232 java.util.TreeMap 37: 131777 3162648 java.util.ArrayList 38: 111752 2682048 java.util.concurrent.ConcurrentSkipListMap$Index 39: 51879 2490192 java.util.HashMap 40: 3389 2438184 41: 2978 2284352 42: 55660 1781120 org.apache.cassandra.io.util.DataOutputBuffer 43: 69664 1671936 org.apache.cassandra.db.TreeMapBackedSortedColumns 44: 51697 1654304 org.apache.cassandra.db.ArrayBackedSortedColumns 45: 61842 1485456 [Ljava.nio.ByteBuffer; 46: 61748 1481952 org.apache.cassandra.utils.BytesReadTracker 47: 3533 1462144 48: 55672 1336128 org.apache.cassandra.io.util.FastByteArrayOutputStream 49: 21743 1318848 [C 50: 33109 1268616 [Lorg.apache.cassandra.db.IColumn; 51: 28607 1144280 java.util.HashMap$KeyIterator 52: 17623 1127616 [Ljava.util.Hashtable$Entry; 53: 35003 1120096 java.util.Vector 54: 17335 1109440 com.sun.jmx.remote.util.OrderClassLoaders 55: 17601 844848 java.util.Hashtable 56: 18528 741120 org.apache.cassandra.io.sstable.IndexHelper$IndexInfo 57: 22634 724288 java.lang.String 58: 17514 700560 java.security.ProtectionDomain 59: 19829 634528 java.util.HashMap$Entry 60: 25327 607848 org.apache.cassandra.utils.IntervalTree.Interval 61: 17513 560416 java.security.CodeSource 62: 22026 528624 java.lang.Double 63: 21740 521760 java.util.concurrent.LinkedBlockingDeque$Node 64: 18782 486728 [[J 65: 18660 447840 org.apache.cassandra.utils.BloomFilter 66: 18660 447840 org.apache.cassandra.utils.obs.OpenBitSet 67: 18529 444696 org.apache.cassandra.db.compaction.PrecompactedRow 68: 3717 444688 java.lang.Class 69: 18528 444672 org.apache.cassandra.db.ColumnIndexer$RowHeader 70: 18528 444672 org.apache.cassandra.utils.BloomCalculations$BloomSpecification 71: 24572 393152 java.lang.Object 72: 5753 335568 [S 73: 7326 293040 java.util.TreeMap$Entry 74: 17817 285072 java.util.HashSet 75: 5583 282488 [[I 76: 17514 280224 java.security.ProtectionDomain$Key 77: 17514 280224 [Ljava.security.Principal; 78: 6612 264480 org.apache.cassandra.service.WriteResponseHandler 79: 7862 251584 org.apache.cassandra.db.RowMutation 80: 10083 241992 org.apache.cassandra.db.EchoedRow 81: 6110 195520 org.apache.cassandra.utils.ExpiringMap$CacheableObject 82: 311 179136 83: 2927 163912 org.apache.cassandra.thrift.TBinaryProtocol 84: 6563 157512 org.apache.cassandra.net.Message 85: 6289 150936 org.apache.cassandra.net.Header 86: 6110 146640 org.apache.cassandra.net.CallbackInfo 87: 5996 143904 java.util.concurrent.LinkedBlockingQueue$Node 88: 8004 128064 java.util.HashMap$EntrySet 89: 7885 126160 java.util.TreeMap$Values 90: 7790 124640 java.util.concurrent.atomic.AtomicInteger 91: 5132 123168 org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder 92: 6640 106240 org.apache.cassandra.utils.SimpleCondition 93: 6626 106016 java.util.HashMap$Values 94: 830 92960 java.net.SocksSocketImpl 95: 1076 86080 java.lang.reflect.Method 96: 4176 66816 java.lang.Integer 97: 2005 64160 java.lang.ThreadLocal$ThreadLocalMap$Entry 98: 1592 63680 java.lang.ref.SoftReference 99: 1586 63440 com.google.common.collect.SingletonImmutableMap 100: 1743 55776 java.util.TreeMap$ValueIterator 101: 470 48880 java.lang.Thread 102: 1469 47008 java.util.concurrent.locks.AbstractQueuedSynchronizer$Node 103: 1041 41640 java.lang.ref.Finalizer 104: 1002 40080 java.util.LinkedHashMap$Entry 105: 1603 38472 com.google.common.collect.SingletonImmutableSet 106: 1586 38064 com.google.common.collect.ImmutableEntry 107: 435 34800 [Ljava.lang.ThreadLocal$ThreadLocalMap$Entry; 108: 970 31040 java.net.Inet4Address 109: 425 30600 java.lang.reflect.Constructor 110: 396 28512 java.lang.reflect.Field 111: 825 26400 java.net.Socket 112: 1050 25200 java.util.concurrent.atomic.AtomicLong 113: 687 21984 java.util.concurrent.SynchronousQueue$TransferStack$SNode 114: 387 21672 sun.security.provider.MD5 115: 451 21648 java.util.concurrent.ThreadPoolExecutor$Worker 116: 442 21216 java.net.SocketInputStream 117: 525 21000 org.apache.cassandra.thrift.TCustomSocket 118: 514 20560 org.apache.cassandra.db.CounterColumn 119: 639 20448 120: 353 19768 org.cliffc.high_scale_lib.ConcurrentAutoTable$CAT 121: 762 18288 java.io.FileDescriptor 122: 570 18240 org.apache.thrift.transport.TFramedTransport 123: 939 18200 [Ljava.lang.Class; 124: 546 17472 java.util.concurrent.locks.ReentrantLock$NonfairSync 125: 532 17464 [Ljava.lang.String; 126: 527 16864 java.util.Hashtable$Entry 127: 298 16688 org.apache.cassandra.service.ClientState$1 128: 298 16688 org.apache.cassandra.service.ClientState$2 129: 405 16200 org.apache.cassandra.thrift.Column 130: 323 15504 java.net.SocketOutputStream 131: 298 14304 org.apache.cassandra.service.ClientState 132: 585 14040 org.apache.thrift.transport.TMemoryInputTransport 133: 570 13680 org.apache.thrift.TByteArrayOutputStream 134: 284 13632 sun.nio.cs.UTF_8$Encoder 135: 405 12960 org.apache.cassandra.thrift.ColumnOrSuperColumn 136: 324 12960 java.io.BufferedInputStream 137: 530 12720 org.apache.cassandra.utils.EstimatedHistogram 138: 525 12600 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess 139: 388 12416 java.security.MessageDigest$Delegate 140: 129 12384 org.apache.cassandra.io.sstable.SSTableReader 141: 305 12200 java.util.concurrent.ConcurrentHashMap$Segment 142: 376 12032 org.apache.cassandra.db.columniterator.SSTableSliceIterator 143: 47 10904 [Z 144: 334 10688 org.apache.cassandra.net.OutboundTcpConnection$Entry 145: 330 10560 java.lang.ref.WeakReference 146: 435 10440 java.lang.ThreadLocal$ThreadLocalMap 147: 426 10224 java.util.BitSet 148: 314 10048 org.apache.cassandra.net.MessageDeliveryTask ... [removed anything taking less than 10K] Total 72302201 2937460496 , 3.1156640 secs] 15775.902: [CMSCMS: Large block 0x0000000714aace48 : 2751863K->2568755K(6340608K), 12.0184460 secs]15787.921: [Class Histogram (after full gc): num #instances #bytes class name ---------------------------------------------- 1: 8644126 451299632 [B 2: 8434549 404858352 java.nio.HeapByteBuffer 3: 7759859 310394360 java.math.BigInteger 4: 10686 293655512 [J 5: 7763265 248560200 [I 6: 8630461 207131064 java.lang.Long 7: 7759615 186230760 org.apache.cassandra.db.DecoratedKey 8: 7759483 124151728 org.apache.cassandra.dht.BigIntegerToken 9: 1825 64173960 [Ljava.lang.Object; 10: 1096613 35091616 java.util.concurrent.ConcurrentHashMap$HashEntry 11: 1092266 34952512 com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 12: 1092266 26214384 com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$WeightedValue 13: 1092266 26214384 org.apache.cassandra.cache.KeyCacheKey 14: 532463 25558224 edu.stanford.ppl.concurrent.SnapTreeMap$Node 15: 221414 21255744 edu.stanford.ppl.concurrent.CopyOnWriteManager$COWEpoch 16: 411588 16463520 org.apache.cassandra.db.ExpiringColumn 17: 341705 16401840 edu.stanford.ppl.concurrent.SnapTreeMap$RootHolder 18: 305 8691760 [Ljava.util.concurrent.ConcurrentHashMap$HashEntry; 19: 442919 7086704 java.util.concurrent.atomic.AtomicReference 20: 221414 7085248 edu.stanford.ppl.concurrent.CopyOnWriteManager$Latch 21: 221414 7085248 edu.stanford.ppl.concurrent.SnapTreeMap 22: 221515 5316360 java.util.concurrent.ConcurrentSkipListMap$Node 23: 221418 5314032 org.apache.cassandra.db.ColumnFamily 24: 221418 5314032 org.apache.cassandra.db.ISortedColumns$DeletionInfo 25: 221414 5313936 org.apache.cassandra.db.AtomicSortedColumns$Holder 26: 35593 5023912 27: 35593 4850696 28: 3382 3862240 29: 116355 3723360 org.apache.cassandra.db.DeletedColumn 30: 221414 3542624 org.apache.cassandra.db.AtomicSortedColumns 31: 221414 3542624 edu.stanford.ppl.concurrent.SnapTreeMap$COWMgr 32: 111752 2682048 java.util.concurrent.ConcurrentSkipListMap$Index 33: 3382 2433816 34: 2971 2279488 35: 3533 1462144 36: 11281 891344 [C 37: 3710 443904 java.lang.Class 38: 12509 400288 java.lang.String 39: 23066 369056 java.lang.Object 40: 5742 334480 [S 41: 5417 275792 [[I 42: 8620 206880 java.lang.Double 43: 8596 206304 java.util.concurrent.LinkedBlockingDeque$Node 44: 311 179136 45: 1631 139152 [Ljava.util.HashMap$Entry; 46: 4006 128192 org.apache.cassandra.db.Column 47: 3630 116160 java.util.HashMap$Entry 48: 1066 85280 java.lang.reflect.Method 49: 1310 62880 java.util.HashMap 50: 509 57008 java.net.SocksSocketImpl 51: 1178 47120 java.lang.ref.SoftReference 52: 228 40752 [[J 53: 411 29592 java.lang.reflect.Constructor 54: 703 28120 java.lang.ref.Finalizer 55: 367 26424 java.lang.reflect.Field 56: 248 25792 java.lang.Thread 57: 804 25728 java.lang.ThreadLocal$ThreadLocalMap$Entry 58: 554 22160 java.util.LinkedHashMap$Entry 59: 431 20688 java.net.SocketInputStream 60: 514 20560 org.apache.cassandra.db.CounterColumn 61: 835 20040 java.util.concurrent.atomic.AtomicLong 62: 912 17768 [Ljava.lang.Class; 63: 546 17472 java.util.concurrent.locks.ReentrantLock$NonfairSync 64: 524 17272 [Ljava.lang.String; 65: 535 17120 java.net.Inet4Address 66: 1068 17088 java.lang.Integer 67: 213 17040 [Ljava.lang.ThreadLocal$ThreadLocalMap$Entry; 68: 708 16992 java.util.ArrayList 69: 1043 16688 java.util.concurrent.atomic.AtomicInteger 70: 684 16416 java.io.FileDescriptor 71: 506 16192 java.util.Hashtable$Entry 72: 504 16128 java.net.Socket 73: 184 15296 [Ljava.util.Hashtable$Entry; 74: 264 14784 org.apache.cassandra.thrift.TBinaryProtocol 75: 305 12200 java.util.concurrent.ConcurrentHashMap$Segment 76: 506 12144 org.apache.cassandra.utils.EstimatedHistogram 77: 201 11256 org.cliffc.high_scale_lib.ConcurrentAutoTable$CAT 78: 117 11232 org.apache.cassandra.io.sstable.SSTableReader 79: 229 10992 java.util.concurrent.ThreadPoolExecutor$Worker 80: 47 10904 [Z 81: 320 10240 java.lang.ref.WeakReference 82: 319 10208 java.util.Vector 83: 181 10136 sun.security.provider.MD5 ... [removed anything taking up less than 10K] Total 65302653 2582483384 , 2.5709140 secs] 2951753K->2568755K(8183808K), [CMS Perm : 21044K->21000K(35192K)]After GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 482780727 Max Chunk Size: 482780727 Number of Blocks: 1 Av. Block Size: 482780727 Tree Height: 1 After GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 0 Max Chunk Size: 0 Number of Blocks: 0 Tree Height: 0 , 17.7063220 secs] [Times: user=16.57 sys=0.05, real=17.70 secs] Heap after GC invocations=1233 (full 21): par new generation total 1843200K, used 0K [0x00000005fae00000, 0x0000000677e00000, 0x0000000677e00000) eden space 1638400K, 0% used [0x00000005fae00000, 0x00000005fae00000, 0x000000065ee00000) from space 204800K, 0% used [0x000000065ee00000, 0x000000065ee00000, 0x000000066b600000) to space 204800K, 0% used [0x000000066b600000, 0x000000066b600000, 0x0000000677e00000) concurrent mark-sweep generation total 6340608K, used 2568755K [0x0000000677e00000, 0x00000007fae00000, 0x00000007fae00000) concurrent-mark-sweep perm gen total 35192K, used 21000K [0x00000007fae00000, 0x00000007fd05e000, 0x0000000800000000) } Total time for which application threads were stopped: 17.7081220 seconds
首先是一个额外的提示,考虑将其添加到GC设置中。它会在完整的GCs上转储一个柱状图,你可以用它来找出是什么严重占用了堆:
JVM_OPTS="$JVM_OPTS -XX:+PrintSafepointStatistics" JVM_OPTS="$JVM_OPTS -XX:+PrintClassHistogramBeforeFullGC" JVM_OPTS="$JVM_OPTS -XX:+PrintClassHistogramAfterFullGC"
也看看你的system.logMemoryM的消息eter.java.那些通常在这些情况下吐出,可以方便地获得更多线索。
从您的GC日志中的消息来看,您似乎遇到了促销失败:
2119968.198: [ParNew (0: promotion failure size = 131074) (1: promotion failure size = 131074) (2: promotion failure size = 131074) (4: promotion failure size = 131074) (7: promotion failure size = 131074) (promotion failed) Desired survivor size 83886080 bytes, new threshold 1 (max 1) - age 1: 34318480 bytes, 34318480 total
这意味着您的堆中有太长时间的对象,并且您的堆变得支离破碎,导致它没有足够的空间将对象从新一代提升到幸存者或旧一代。这对于你拥有的强大硬件来说是不正常的,但即使是在一台拥有70Gb的16核机器上,这种情况也确实发生在我身上,所以这里有一些事情需要连续查看,以便你能够确定问题。如果你有我告诉过你的柱状图,我们可以更快地得出结论,但下面是可以尝试的列表:
1.如果您启用了多线程压缩,请禁用它;
2.你的CPU怎么样?如果利用率高,将系统中默认为8个线程的并发压缩器减少到4个;
3.Cassandra 1.1在密钥缓存大小计算方面存在缺陷。它使用48字节的硬编码大小来计算用于密钥缓存的堆空间,如果应用程序生成的密钥长于此长度,则堆上的密钥缓存使用率将显著高于在casssandra中实际设置的值。yaml,它会造成大量堆压力,特别是那些长期存在的东西,会导致升级失败:
参考资料:https://git-wip-us.apache.org/repos/asf?p=cassandra.git; a=blob; f=src/java/org/apache/cassandra/service/CacheService.java; hb=02672936#l102
https://issues.apache.org/jira/browse/CASSANDRA-4315
您可以从nodetool info命令或适当的JMX MBean检查您的密钥缓存使用情况。如果它通常是满的,那可能是一个线索;不幸的是,修复程序在1.1中不可用。如果您有此问题,您应该升级到1.2或在不升级的情况下解决它,将您的密钥缓存大小的值降低cassandra.yaml显着;
4.是否使用行缓存?试着把它关掉。
5.阅读宽行或我通常称之为死亡行。如果您的应用程序正在生成宽行(1个键有很多列),并且正在尝试读取该行的一个大片段(
可以查看nodetool cfstats命令,查看列族的最大行大小。如果你有一个最大行大小为几百MBs或更多的CF,这可能是一个线索;
7.对于读/写重载,GC设置中的默认使用期限阈值标志很低,可能会导致提前升级。这很可能是你的问题的症状。尝试将TenuringReshold增加到一个较大的值,比如32,以防止过早晋升,这样ParNew可以收集年轻一代中的大部分垃圾。
您可以通过运行jstat命令来观察这个持续存在的问题和提升失败,并查看堆中的幸存者空间是如何利用的。我敢打赌,它们没有被充分利用,对象直接从伊甸园到旧一代。;)
8.如果都失败了,很可能你的堆不够大,无法处理你的工作量,所以你必须调整大小。这种可能性较小,你应该把它作为最后的手段,谨慎行事。
8A:增加你的体重。尝试将8GB的容量增加到2GB。也可以像上面那样提高寿命阈值;
8B:如果以上改善了情况,但没有100%消除问题,考虑增加你的MAX_HEAP_SIZE2GB一次。增加NEWGEN与它。确保它不超过1/4的MAX HEAP。
继续高达16GB,但不会超过这个数,因为您只有24GB的总内存。
关于我自己:我有上述所有问题,我一次一个地消除它们。目前,我使用24GB堆和6GB新根运行我的节点,使用32的持续阈值。我拥有的最大GC暂停是
希望这对你有帮助。
这里有几篇关于GC的文章:http://fedora.fiz-karlsruhe.de/docs/Wiki.jsp?page=Java堆&GC调整http://blog.ragozin.info/2011/10/java-cg-hotspots-cms-and-heap.htmlhttp://java.dzone.com/articles/how-tame-java-gc-pauseshttp://grokbase.com/t/cassandra/user/113qf50x4r/parnew-promotion-failed
我一直在我的应用程序中随机(内存溢出)崩溃,所以我开始分析我的堆。我注意到,如果我从活动A到活动B,堆会从27 MB增加到35 MB(由于懒惰加载许多图像)。但是,当我完成()活动B返回到活动A时,堆大小保持不变,即使使用GC操作!! 令人讨厌的是,再次进入活动B会将堆增加到42 MB。我可以这样做,因为五月的时候,堆只会不断增加。 这是我正在使用的惰性图像加载库: LazyListhttps:/
我对Java非常陌生,在仔细阅读文档之后,我发现自己陷入了困境。 我有一个使用JavaFXMediaPlayer播放wav文件的小程序。我的播放器对象有一个currentTimeProperty,我希望在播放期间以分钟:秒为单位显示该属性的输出。 所以我在一个函数的末尾有这样一个函数,它接收我的wav文件并初始化播放器: 然后我有: 这很有效。在wav播放过程中,my以毫秒为单位随当前时间更新。问
我有一个分布式缓存应用程序(内存绑定,由于与集群中其他节点的交互而具有网络I/O),运行在带有G1垃圾收集器的JVM1.7.0_51中。以下是JVM配置: 我把gc日志附上了几分钟。您在GC日志中看到任何问题以及JVM GC调优的任何需要吗?多谢! GC日志
我在生产中遇到了一个奇怪的问题,在操作了5年多后只发生过一次。我们发现,PSPermGen和“real”持续时间远远高于User+SYS时间。2016年2月4日,PSPermGen花了38.96秒,而以前的运行是0.2-0.3秒,并且没有释放内存。当User+SYS为0.3-0.4秒时,实时时间需要40秒,与以前的运行相比,这是异常高的。完全的GC并不经常发生,我也没有观察到任何与GC相关的错误。
问题内容: 有人可以从下面的数据中告诉我我的会话将持续多长时间吗?-我不确定哪个告诉我 问题答案: 通常,您可以说_session.gc_maxlifetime_ 指定自上次更改会话数据( 而不是 上次调用!)以来的最长生存时间。但是PHP的会话处理有点复杂。 因为会话数据由垃圾收集器,其仅由称为除去用的概率_session.gc_probability合_ 由devided session.gc