[Troubleshooting] Version of exectask could not be retrieved from node "xxx"

龙令
2023-12-01
        While adding a node to a customer's RAC environment, I ran into the error Version of exectask could not be retrieved from node "xxx". The RAC database version is 11.2.0.3.0 and the operating system is Red Hat Enterprise Linux Server 5.5 x86_64. The handling of the problem is discussed in detail below.

1. Run the command to add the node.
[grid@niccl02 bin]$ ./addNode.sh -silent "CLUSTER_NEW_NODES={niccl01}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={niccl01-vip}"

Performing pre-checks for node addition

Checking node reachability...
Node reachability check passed from node "niccl02"


Checking user equivalence...
User equivalence check passed for user "grid"
Version of exectask could not be retrieved from node "niccl01"        // Here is the error.

ERROR:
Framework setup check failed on all the nodes
Verification cannot proceed


Pre-check for node addition was unsuccessful on all the nodes.

2. Pre-check the node addition with the CVU tool.
[grid@niccl02 bin]$ cluvfy stage -pre nodeadd -n niccl01 -verbose

Performing pre-checks for node addition

Checking node reachability...

Check: Node reachability from node "niccl02"
  Destination Node                      Reachable?
  ------------------------------------  ------------------------
  niccl01                               yes
Result: Node reachability check passed from node "niccl02"


Checking user equivalence...

Check: User equivalence for user "grid"
  Node Name                             Status
  ------------------------------------  ------------------------
  niccl01                               passed
Result: User equivalence check passed for user "grid"
Version of exectask could not be retrieved from node "niccl01"        // The same error again.

ERROR:
Framework setup check failed on all the nodes
Verification cannot proceed


Pre-check for node addition was unsuccessful on all the nodes.

        To capture the exact English wording of the error for searching My Oracle Support, switch the session locale to English and re-run the check:
[grid@niccl02 bin]$ export LANG=en_US.UTF-8
[grid@niccl02 bin]$ cluvfy stage -pre nodeadd -n niccl01 -verbose

Performing pre-checks for node addition

Checking node reachability...

Check: Node reachability from node "niccl02"
  Destination Node                      Reachable?
  ------------------------------------  ------------------------
  niccl01                               yes
Result: Node reachability check passed from node "niccl02"


Checking user equivalence...

Check: User equivalence for user "grid"
  Node Name                             Status
  ------------------------------------  ------------------------
  niccl01                               passed
Result: User equivalence check passed for user "grid"
Version of exectask could not be retrieved from node "niccl01"        // The error, now in English and ready to search on MOS.

ERROR:
Framework setup check failed on all the nodes
Verification cannot proceed


Pre-check for node addition was unsuccessful on all the nodes.

3. Refer to the METALINK note:
        Cluvfy returns "Unsuccessful" for most commands; trace has exectask.sh: "cannot execute" / "permission denied", or scp: "not found" [ID 549667.1]

The content of the note is as follows:
Cluvfy returns "Unsuccessful" for most commands; trace has exectask.sh: "cannot execute" / "permission denied", or scp: "not found" [ID 549667.1]

  Modified 25-AUG-2011     Type PROBLEM     Status PUBLISHED

In this Document
  Symptoms
     Examples:
  Changes
  Cause
     Cause #1 - Wrong permissions on exectask
     Cause #2 - scp in wrong location
  Solution
     Solution #1 - Adjust permissions on exectask.
     Solution #2 - Create a symlink to scp in the expected location.
     Scalability RAC Community
  References




Applies to:

Oracle Server - Standard Edition - Version: 10.2.0.1 to 11.2.0.2 - Release: 10.2 to 11.2
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 11.2.0.2   [Release: 10.2 to 11.2]
Information in this document applies to any platform.
***Checked for relevance on 19-Oct-2010***

Symptoms


Cluster is functioning perfectly, but running the Cluster Verification Utility (cluvfy) with any commands always returns "unsuccessful" with no further details, even with -verbose switch.

Examples:

1.
#cluvfy comp admprv -n mynode1,mynode2 -verbose -o crs_inst
Verifying administrative privileges
Checking user equivalence...
Check: User equivalence for user "root"
  Node Name                             Comment
  ------------------------------------  ------------------------
  mynode1                               passed
  mynode2                               passed
Result: User equivalence check passed for user "root".
Verification of administrative privileges was unsuccessful on all the nodes.

2.
#cluvfy comp nodecon -n mynode1,mynode2 -verbose
Verifying node connectivity
Verification of node connectivity was unsuccessful on all the nodes.

3.
#cluvfy comp crs -n mynode1,mynode2
Verifying CRS integrity
Verification of CRS integrity was unsuccessful on all the nodes.

4.
 ./runcluvfy.sh stage -post crsinst -n mynode1,mynode2 -verbose

Performing post-checks for cluster services setup 

Checking node reachability... 

Check: Node reachability from node "mynode1" 
Destination Node                     Reachable? 
------------------------------------ ------------------------ 
mynode1                              yes 
mynode2                              yes 
Result: Node reachability check passed from node "mynode1". 


Checking user equivalence... 

Check: User equivalence for user "oracle" 
Node Name                        Comment 
-------------------------------- ------------------------ 
mynode1                         passed 
mynode2                         passed 
Result: User equivalence check passed for user "oracle". 

Post-check for cluster services setup was unsuccessful on all the nodes. 

Changes

The issue can occur before, during or after installation.  After installation, the triggering event can be the application of a patch, or a change in the OS environment such as permissions change.  In some cases on 10.2 and 11.1 the issue was simply not noticed until after installation.

Cause

There are a couple of possible causes for this symptom. To determine which cause you are encountering, trace CVU and examine cvutrace.log .

For instructions on tracing cluvfy, please see this note:
Note 316817.1 - CLUSTER VERIFICATION UTILITY FAQ 
under the section "How do I turn on tracing?"

If using runcluvfy.sh, see this note for tracing instructions:
Note 986822.1 - How to Collect CVU Trace / Debug Output Generated by RUNCLUVFY.SH 
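
        (For illustration, a minimal tracing session might look like the following; SRVM_TRACE is the switch those notes describe, while the trace directory used here is an assumption and the trace file name varies by version:)

[grid@niccl02 bin]$ export SRVM_TRACE=true             # enable CVU/SRVM tracing
[grid@niccl02 bin]$ export CV_TRACELOC=/tmp/cvutrace   # assumed location for the trace output
[grid@niccl02 bin]$ cluvfy stage -pre nodeadd -n niccl01 -verbose
[grid@niccl02 bin]$ grep -iE "exectask|scp" /tmp/cvutrace/cvutrace.log*   # look for the telltale messages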

Cause #1 - Wrong permissions on exectask

After turning CVU tracing on, the following message can be seen in the cvutrace.log :
ksh: /tmp/CVU_10.2.0.1.0.1_dba/exectask.sh: cannot execute
Or, in an 11.2 install, in which the installer runs cluvfy, the installActionsDATE.log shows "exectask.sh: Execute permission denied", for example:
sh: /var/tmp/CVU_11.2.0.2.0_grid/exectask.sh: Execute permission denied.
Cause: 
a) For already installed clusterware: Some CRS patches may change the permissions on $CRS_HOME/cv/remenv/exectask* .  If $CRS_HOME/cv/remenv/exectask* is not executable, then cluvfy will fail.
b) Previous runs of cvu can leave /tmp/CVU* or /var/tmp/CVU* directories in place; the permissions of exectask  must be correct in this location too, or cluvfy will fail.

Cause #2 - scp in wrong location

After turning CVU tracing on, the following message can be seen in the cvutrace.log :
exec exception:/usr/local/bin/scp: not found;
Cause: scp is not located in /usr/local/bin/scp but in a different place on the server. 

Solution

Use either solution 1 or solution 2, depending on which of the above causes applies to your situation. 

Solution #1 - Adjust permissions on exectask.


Steps:  
1. Check the permissions with: 
ls -l $CRS_HOME/cv/remenv/exectask*

2. Ensure that the files are owned by 
oracle:oinstall
AND
permissions are 744 or 755

3.  Check for the existence of a /tmp/CVU* or /var/tmp/CVU* directory created by previous, failed run of cluvfy.  If the directory exists and exectask* exists in this directory, then make the same change to the permissions on these exectask* files as well.
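
Spelled out as commands, the fix might look like this (a sketch; the oracle:oinstall owner and 755 mode come from the note itself, and on a Grid Infrastructure install the owner would be the grid user instead):

# ls -l $CRS_HOME/cv/remenv/exectask*
# chown oracle:oinstall $CRS_HOME/cv/remenv/exectask*      # oracle:oinstall per the note; grid:oinstall on a GI home
# chmod 755 $CRS_HOME/cv/remenv/exectask*                  # 744 would equally satisfy the note
# ls -ld /tmp/CVU* /var/tmp/CVU* 2>/dev/null               # leftovers from earlier cluvfy runs
# chmod 755 /tmp/CVU*/exectask* /var/tmp/CVU*/exectask* 2>/dev/null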

Solution #2 - Create a symlink to scp in the expected location.


Steps:  
1. Find scp
2. As root user, create a symlink to /usr/local/bin

eg. if scp is located in /usr/bin, create a symlink as follows: 
# ln -s /usr/bin/scp /usr/local/bin/scp


Scalability RAC Community

To discuss this topic further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Scalability RAC Community.

References

BUG:5594840  - CVU CHECK FOR CRS IS UNSUCCESSFUL AFTER BUNDLE II PATCH IS APPLIED
NOTE:316817.1  - Cluster Verification Utility (CLUVFY) FAQ
NOTE:396381.1  - Cluvfy Pre and Post Check For Clusterware Is Unsuccessful After Applying CRS Bundle Patch#2
NOTE:986822.1  - How to Collect CVU Trace / Debug Output Generated by RUNCLUVFY.SH  


Based on this note, I took the following actions:

1). Check the permissions on exectask; the owning user must have execute permission:
[grid@niccl02 bin]$ cd $GRID_HOME/cv/remenv/
[grid@niccl02 remenv]$ ll
total 2336
-rwxr-xr-x 1 grid oinstall    3513 Mar 21 11:51 cvuhelper
-rw-r--r-- 1 grid oinstall    8551 Mar 21 11:51 cvuqdisk-1.0.9-1.rpm
-rwxr-xr-x 1 grid oinstall 2285975 Mar 21 11:51 exectask
-rwxr-xr-x 1 grid oinstall     606 Mar 21 11:51 exectask.sh
-rwxr-xr-x 1 grid oinstall   61989 Mar 21 11:51 orarun.sh
drwxr-xr-x 3 grid oinstall    4096 Mar 21 11:50 pluggable
-rwxr-xr-x 1 grid oinstall    1495 Mar 21 11:51 runfixup.sh

2). Remove the CVU-related directories and files under /tmp:
[grid@niccl02 remenv]$ cd /tmp
[root@niccl02 tmp]# rm -rf CVU*
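
        The note also mentions /var/tmp/CVU*, and the node being added can hold stale copies too; a fuller cleanup (added here as a sketch, not part of the original transcript) would cover both:

[root@niccl02 tmp]# rm -rf /tmp/CVU* /var/tmp/CVU*
[root@niccl02 tmp]# ssh niccl01 'rm -rf /tmp/CVU* /var/tmp/CVU*'   # clean the new node as well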

3). Create a symlink for scp:
[root@niccl02 tmp]# ln -s /usr/bin/scp /usr/local/bin/scp
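
        A quick confirmation that the link resolves (an added check, not in the original output):

[root@niccl02 tmp]# ls -l /usr/local/bin/scp      # should show a symlink pointing at /usr/bin/scp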

4). Verify:
[root@niccl02 tmp]# su - grid
[grid@niccl02 ~]$ exec /usr/bin/ssh-agent $SHELL
[grid@niccl02 ~]$ /usr/bin/ssh-add
Enter passphrase for /home/grid/.ssh/id_rsa:
Identity added: /home/grid/.ssh/id_rsa (/home/grid/.ssh/id_rsa)
Identity added: /home/grid/.ssh/id_dsa (/home/grid/.ssh/id_dsa)
Identity added: /home/grid/.ssh/identity (/home/grid/.ssh/identity)
        Even after re-establishing SSH user equivalence between the nodes, adding the node still failed with the same error!
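
        Passwordless SSH in both directions is what CVU exercises here; a manual spot-check (a sketch added for illustration; the test file name is an example) would be:

[grid@niccl02 ~]$ ssh niccl01 date                       # must return without a password prompt
[grid@niccl02 ~]$ ssh niccl02 date                       # equivalence to the local node is also checked
[grid@niccl02 ~]$ touch /tmp/eqtest
[grid@niccl02 ~]$ scp /tmp/eqtest grid@niccl01:/tmp/     # scp itself, not just ssh, must work non-interactively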

4. Force the error to be ignored:
        Reading through the addNode.sh script shows that the step in which addNode.sh validates the new node with CVU can be skipped.
[grid@niccl02 ~]$ cd $GRID_HOME/
[grid@niccl02 grid]$ cd oui/bin
[grid@niccl02 bin]$ export IGNORE_PREADDNODE_CHECKS=Y        // With this environment variable set to Y, addNode.sh skips the pre-check.
[grid@niccl02 bin]$ ./addNode.sh -silent "CLUSTER_NEW_NODES={niccl01}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={niccl01-vip}"
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 8001 MB    Passed
Oracle Universal Installer, Version 11.2.0.3.0 Production
Copyright (C) 1999, 2011, Oracle. All rights reserved.


Performing tests to see whether nodes niccl01 are available
............................................................... 100% Done.

.
-----------------------------------------------------------------------------
Cluster Node Addition Summary
Global Settings
   Source: /u01/app/11.2.0/grid
   New Nodes
Space Requirements
   New Nodes
      niccl01
         /: Required 4.73GB : Available 733.66GB
Installed Products
   Product Names
      Oracle Grid Infrastructure 11.2.0.3.0
      Sun JDK 1.5.0.30.03
      Installer SDK Component 11.2.0.3.0
      Oracle One-Off Patch Installer 11.2.0.1.7
      Oracle Universal Installer 11.2.0.3.0
      Oracle USM Deconfiguration 11.2.0.3.0
      Oracle Configuration Manager Deconfiguration 10.3.1.0.0
      Enterprise Manager Common Core Files 10.2.0.4.4
      Oracle DBCA Deconfiguration 11.2.0.3.0
      Oracle RAC Deconfiguration 11.2.0.3.0
      Oracle Quality of Service Management (Server) 11.2.0.3.0
      Installation Plugin Files 11.2.0.3.0
      Universal Storage Manager Files 11.2.0.3.0
      Oracle Text Required Support Files 11.2.0.3.0
      Automatic Storage Management Assistant 11.2.0.3.0
      Oracle Database 11g Multimedia Files 11.2.0.3.0
      Oracle Multimedia Java Advanced Imaging 11.2.0.3.0
      Oracle Globalization Support 11.2.0.3.0
      Oracle Multimedia Locator RDBMS Files 11.2.0.3.0
      Oracle Core Required Support Files 11.2.0.3.0
      Bali Share 1.1.18.0.0
      Oracle Database Deconfiguration 11.2.0.3.0
      Oracle Quality of Service Management (Client) 11.2.0.3.0
      Expat libraries 2.0.1.0.1
      Oracle Containers for Java 11.2.0.3.0
      Perl Modules 5.10.0.0.1
      Secure Socket Layer 11.2.0.3.0
      Oracle JDBC/OCI Instant Client 11.2.0.3.0
      Oracle Multimedia Client Option 11.2.0.3.0
      LDAP Required Support Files 11.2.0.3.0
      Character Set Migration Utility 11.2.0.3.0
      Perl Interpreter 5.10.0.0.2
      PL/SQL Embedded Gateway 11.2.0.3.0
      OLAP SQL Scripts 11.2.0.3.0
      Database SQL Scripts 11.2.0.3.0
      Oracle Extended Windowing Toolkit 3.4.47.0.0
      SSL Required Support Files for InstantClient 11.2.0.3.0
      SQL*Plus Files for Instant Client 11.2.0.3.0
      Oracle Net Required Support Files 11.2.0.3.0
      Oracle Database User Interface 2.2.13.0.0
      RDBMS Required Support Files for Instant Client 11.2.0.3.0
      RDBMS Required Support Files Runtime 11.2.0.3.0
      XML Parser for Java 11.2.0.3.0
      Oracle Security Developer Tools 11.2.0.3.0
      Oracle Wallet Manager 11.2.0.3.0
      Enterprise Manager plugin Common Files 11.2.0.3.0
      Platform Required Support Files 11.2.0.3.0
      Oracle JFC Extended Windowing Toolkit 4.2.36.0.0
      RDBMS Required Support Files 11.2.0.3.0
      Oracle Ice Browser 5.2.3.6.0
      Oracle Help For Java 4.2.9.0.0
      Enterprise Manager Common Files 10.2.0.4.3
      Deinstallation Tool 11.2.0.3.0
      Oracle Java Client 11.2.0.3.0
      Cluster Verification Utility Files 11.2.0.3.0
      Oracle Notification Service (eONS) 11.2.0.3.0
      Oracle LDAP administration 11.2.0.3.0
      Cluster Verification Utility Common Files 11.2.0.3.0
      Oracle Clusterware RDBMS Files 11.2.0.3.0
      Oracle Locale Builder 11.2.0.3.0
      Oracle Globalization Support 11.2.0.3.0
      Buildtools Common Files 11.2.0.3.0
      Oracle RAC Required Support Files-HAS 11.2.0.3.0
      SQL*Plus Required Support Files 11.2.0.3.0
      XDK Required Support Files 11.2.0.3.0
      Agent Required Support Files 10.2.0.4.3
      Parser Generator Required Support Files 11.2.0.3.0
      Precompiler Required Support Files 11.2.0.3.0
      Installation Common Files 11.2.0.3.0
      Required Support Files 11.2.0.3.0
      Oracle JDBC/THIN Interfaces 11.2.0.3.0
      Oracle Multimedia Locator 11.2.0.3.0
      Oracle Multimedia 11.2.0.3.0
      HAS Common Files 11.2.0.3.0
      Assistant Common Files 11.2.0.3.0
      PL/SQL 11.2.0.3.0
      HAS Files for DB 11.2.0.3.0
      Oracle Recovery Manager 11.2.0.3.0
      Oracle Database Utilities 11.2.0.3.0
      Oracle Notification Service 11.2.0.3.0
      SQL*Plus 11.2.0.3.0
      Oracle Netca Client 11.2.0.3.0
      Oracle Net 11.2.0.3.0
      Oracle JVM 11.2.0.3.0
      Oracle Internet Directory Client 11.2.0.3.0
      Oracle Net Listener 11.2.0.3.0
      Cluster Ready Services Files 11.2.0.3.0
      Oracle Database 11g 11.2.0.3.0
-----------------------------------------------------------------------------


Instantiating scripts for add node (Friday, April 13, 2012 6:10:02 PM CST)
.                                                                 1% Done.
Instantiation of add node scripts complete

Copying to remote nodes (Friday, April 13, 2012 6:10:05 PM CST)
.WARNING:Error while copying directory /u01/app/11.2.0/grid with exclude file list '/tmp/OraInstall2012-04-13_06-09-53PM/installExcludeFile.lst' to nodes 'niccl01'. [command-line: line 0: Bad configuration option: PermitLocalCommandlost connection :failed]
Refer to '/u01/app/oraInventory/logs/addNodeActions2012-04-13_06-09-53PM.log' for details. You may fix the errors on the required remote nodes. Refer to the install guide for error recovery information.
                                                                 2% Done.
Home copied to new nodes
..............................................................................................WARNING:A new inventory has been created on one or more nodes in this session. However, it has not yet been registered as the central inventory of this system.
To register the new inventory please run the script at '/u01/app/oraInventory/orainstRoot.sh' with root privileges on nodes 'niccl01'.
If you do not register the inventory, you may not be able to update or patch the products you installed.
The following configuration scripts need to be executed as the "root" user in each new cluster node. Each script in the list below is followed by a list of nodes.
/u01/app/oraInventory/orainstRoot.sh #On nodes niccl01
To execute the configuration scripts:
    1. Open a terminal window
    2. Log in as "root"
    3. Run the scripts in each cluster node

The Cluster Node Addition of /u01/app/11.2.0/grid was unsuccessful.
Please check '/tmp/silentInstall.log' for more details.

        The output above shows that the pre-check was indeed skipped, but execution aborted almost immediately with an obvious copy-related error. As is well known, when installing or adding a node, Oracle uses the operating system's scp command to copy local files to the remote server.

5. Test whether the scp tool works properly.
[grid@niccl02 grid]$ scp root.sh grid@niccl01:/u01/app/11.2.0/grid
command-line: line 0: Bad configuration option: PermitLocalCommand
lost connection
       A test scp command produced exactly the same error as the log above, which confirms that the system's scp tool is at fault.
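
        One plausible reading of "Bad configuration option: PermitLocalCommand" (an interpretation, not stated in the original post): from OpenSSH 4.3 onward, scp invokes ssh with -oPermitLocalCommand=no, so if the scp and ssh binaries on a node come from mismatched OpenSSH builds, the older ssh rejects the option. A few checks to narrow this down:

[grid@niccl02 grid]$ ssh -V                           # version of the ssh client actually on the PATH
[grid@niccl02 grid]$ which -a ssh scp                 # look for duplicate binaries shadowing each other
[grid@niccl02 grid]$ rpm -q openssh openssh-clients   # the packaged versions should match
[grid@niccl02 grid]$ rpm -V openssh-clients           # verify the installed files against the package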

6. Uninstall ssh
        Further research turned up no direct fix for this error, so in the end the system's ssh packages were removed:
[root@niccl02 ssh]# rpm -e openssh openssh-clients openssh-server openssh-askpass
warning: /etc/ssh/ssh_config saved as /etc/ssh/ssh_config.rpmsave

7. Reinstall ssh
        Run the following commands to reinstall ssh:
[root@niccl02 soft]# mount -o loop rhel-server-5.5-x86_64-dvd.iso /mnt/
[root@niccl02 soft]# cd /mnt/Server
[root@niccl02 Server]# rpm -ivh openssh-clients-4.3p2-41.el5.x86_64.rpm openssh-server-4.3p2-41.el5.x86_64.rpm openssh-askpass-4.3p2-41.el5.x86_64.rpm openssh-4.3p2-41.el5.x86_64.rpm
Preparing...                ########################################### [100%]
   1:openssh                ########################################### [ 25%]
   2:openssh-clients        ########################################### [ 50%]
   3:openssh-server         ########################################### [ 75%]
   4:openssh-askpass        ########################################### [100%]
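
        After reinstalling, the openssh-server post-install scripts may not start the daemon, so starting it by hand is a prudent extra step (not shown in the original transcript):

[root@niccl02 Server]# service sshd start       # or: service sshd restart
[root@niccl02 Server]# chkconfig sshd on        # keep it enabled at boot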

8. Test the scp tool again.
[root@niccl02 ssh]# cd /tmp
[root@niccl02 tmp]# touch a
[root@niccl02 tmp]# scp a root@niccl01:/tmp
Enter passphrase for key '/root/.ssh/id_rsa':
a                               


[grid@niccl02 grid]$ scp root.sh grid@niccl01:/u01/app/11.2.0/grid
root.sh                                                                                                                             100%  467     0.5KB/s   00:00
        The two outputs above show that the scp tool is back to normal.
        Re-running the earlier add-node command then passed verification, and the subsequent node addition completed smoothly.
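
        Reconstructed for completeness (the post does not show this output), the successful re-run is the same pair of commands as before, with the workaround variable cleared:

[grid@niccl02 bin]$ unset IGNORE_PREADDNODE_CHECKS
[grid@niccl02 bin]$ cluvfy stage -pre nodeadd -n niccl01 -verbose
[grid@niccl02 bin]$ ./addNode.sh -silent "CLUSTER_NEW_NODES={niccl01}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={niccl01-vip}"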

Conclusion:
        The OS-level ssh tools (above all scp) are critical to RAC installation and node addition. Get familiar with how they work, so that when problems arise they are easier to track down.
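
        As a quick preventive check, a loop like the following (a sketch; the node names are this cluster's) verifies both ssh and scp from a node before any install or add-node attempt:

[grid@niccl02 ~]$ touch /tmp/scptest
[grid@niccl02 ~]$ for node in niccl01 niccl02; do
>   ssh $node date || echo "ssh to $node broken"
>   scp -q /tmp/scptest $node:/tmp/ || echo "scp to $node broken"
> done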

Source: ITPUB blog, http://blog.itpub.net/23135684/viewspace-721189/. Please credit the source when reprinting.
