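EM goes down unexpectedly from time to time, and the problem keeps recurring. The previous few times I was busy with other work and could not look into it, and by the time I got back to it EM had already recovered on its own.
This time the problem showed up again, so I set out to fix it for good. The process was as follows:
1. Restarting EM failed with the following error: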
/u01/oracle/agent/core/12.1.0.5.0/bin/emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent is Not Running
[oracle@dm01db01 ~]$ /u01/app/oracle/agent/core/12.1.0.5.0/bin/emctl start agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
Starting agent ........................................... failed.
Fatal agent error: State Manager failed at Startup
Fatal agent error: State Manager failed at Startup
Fatal agent error: State Manager failed at Startup
EMAgent is Thrashing. Exiting watchdog
Consult emctl.log and emagent.nohup in: /u01/oracle/agent/agent_inst/sysman/log
2. Checked emctl.log, which showed:
51537 :: Tue Nov 14 15:16:37 2017::AgentLifeCycle.pm:Watch dog processs id: 51692 exited with an exit code of 56
51537 :: Tue Nov 14 15:16:37 2017::AgentLifeCycle.pm: Exited loop retryCount=80 with retCode=1
51537 :: Tue Nov 14 15:16:37 2017::AgentLifeCycle.pm: StartCEMD Querying for the real status of the agent
51537 :: Tue Nov 14 15:16:38 2017::AgentLifeCycle.pm: StartCEMD live status of the agent is 1 after 0 retries.
51537 :: Tue Nov 14 15:16:38 2017::AgentLifeCycle.pm: Check agent status retCode=1
51537 :: Tue Nov 14 15:16:38 2017::TZ: EmctlLogAvailabilityMarker Operation=start Diag=failed
51537 :: Tue Nov 14 15:16:38 2017::Calling releaselobalLock
51537 :: Tue Nov 14 15:16:38 2017::AgentCommandLock:released lock on emctl lockfile
51537 :: Tue Nov 14 15:16:38 2017::Released agent command lock
51537 :: Tue Nov 14 15:16:38 2017::Cleaning up agent command lock
51537 :: Tue Nov 14 15:16:38 2017::AgentCommandLock:closed file handle of emctl lockfile
3. That log was no help at all, so I kept digging for something useful and finally found this error in gcagent_errors.log:
[1:main] ERROR - agent main threw an error
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:3209)
at java.lang.String.<init>(String.java:215)
at java.lang.StringBuffer.toString(StringBuffer.java:585)
at oracle.sysman.gcagent.state.StateMgr.loadValue(StateMgr.java:622)
at oracle.sysman.gcagent.state.StateMgr.initStateMgr(StateMgr.java:691)
at oracle.sysman.gcagent.state.StateMgr.tmNotifier(StateMgr.java:1215)
at oracle.sysman.gcagent.tmmain.lifecycle.TMComponentSvc.invokeNotifier(TMComponentSvc.java:1009)
at oracle.sysman.gcagent.tmmain.lifecycle.TMComponentSvc.invokeInitializationStep(TMComponentSvc.java:1094)
at oracle.sysman.gcagent.tmmain.lifecycle.TMComponentSvc.doInitializationStep(TMComponentSvc.java:927)
at oracle.sysman.gcagent.tmmain.lifecycle.TMComponentSvc.notifierDriver(TMComponentSvc.java:823)
at oracle.sysman.gcagent.tmmain.TMMain.startup(TMMain.java:264)
at oracle.sysman.gcagent.tmmain.TMMain.agentMain(TMMain.java:565)
at oracle.sysman.gcagent.tmmain.TMMain.main(TMMain.java:554)
[1:main] ERROR - Critical error:
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:3209)
at java.lang.String.<init>(String.java:215)
at java.lang.StringBuffer.toString(StringBuffer.java:585)
at oracle.sysman.gcagent.state.StateMgr.loadValue(StateMgr.java:622)
at oracle.sysman.gcagent.state.StateMgr.initStateMgr(StateMgr.java:691)
at oracle.sysman.gcagent.state.StateMgr.tmNotifier(StateMgr.java:1215)
at oracle.sysman.gcagent.tmmain.lifecycle.TMComponentSvc.invokeNotifier(TMComponentSvc.java:1009)
at oracle.sysman.gcagent.tmmain.lifecycle.TMComponentSvc.invokeInitializationStep(TMComponentSvc.java:1094)
at oracle.sysman.gcagent.tmmain.lifecycle.TMComponentSvc.doInitializationStep(TMComponentSvc.java:927)
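Since the useful message was not in the log that emctl pointed to first, a quick search across the whole log directory can save some time. The following is only a minimal sketch: the function name find_oom_logs is made up for illustration, and the default directory is simply the one printed by emctl above, so adjust it to your own agent instance.

```shell
#!/bin/sh
# Hypothetical helper: list every file under the agent log directory
# that mentions a JVM out-of-memory error.
find_oom_logs() {
    # Default path is the log directory reported by emctl in this post (assumption for other installs).
    log_dir="${1:-/u01/oracle/agent/agent_inst/sysman/log}"
    # -r searches the directory recursively, -l prints only matching file names,
    # -s suppresses messages about unreadable or missing files.
    grep -rls 'java.lang.OutOfMemoryError' "$log_dir"
}
```

Calling find_oom_logs with no argument searches the default directory; passing a path searches that directory instead.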
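This exception is most likely caused by the Java virtual machine running out of heap memory, so I adjusted the EM memory settings:
The original content of the emd.properties file in the /u01/oracle/agent/agent_inst/sysman/config directory was: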
#
# To enable the metric browser, uncomment the following line
# This is a reloadable parameter
#
#_enableMetricBrowser=true
#
# These are the optional Java flags for the agent
#
agentJavaDefines=-Xmx520M -XX:MaxPermSize=256M
After the change (-Xmx raised from 520M to 1024M):
#
# To enable the metric browser, uncomment the following line
# This is a reloadable parameter
#
#_enableMetricBrowser=true
#
# These are the optional Java flags for the agent
#
agentJavaDefines=-Xmx1024M -XX:MaxPermSize=256M
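The right size is something to work out by testing in your own environment. After adjusting the memory parameter, the problem was resolved.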
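The same edit can be scripted so that a backup is kept each time you try a different size. This is a rough sketch rather than an official procedure: set_agent_xmx is a hypothetical helper name, and it assumes the file contains a single agentJavaDefines line with a -Xmx flag, as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: change the -Xmx value on the agentJavaDefines line
# of an emd.properties file, keeping a .bak copy of the original.
# Usage: set_agent_xmx <path-to-emd.properties> <new-size, e.g. 1024M>
set_agent_xmx() {
    props="$1"
    new_xmx="$2"
    # Keep a backup so the previous setting can be restored.
    cp -p "$props" "$props.bak" || return 1
    # Only touch the agentJavaDefines line; other flags on it are left alone.
    sed "/^agentJavaDefines=/ s/-Xmx[0-9]*[A-Za-z]*/-Xmx${new_xmx}/" "$props.bak" > "$props"
}
```

For the case in this post the call would be set_agent_xmx /u01/oracle/agent/agent_inst/sysman/config/emd.properties 1024M, followed by starting the agent again with emctl start agent as shown in step 1.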