Question:

Row exception in Hive when using a join

滑弘扬
2023-03-14
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"joinkey0":"12"},"value":{"_col2":"rs317647905"},"alias":1}
        at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:270)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"joinkey0":"12"},"value":{"_col2":"rs317647905"},"alias":1}
        at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:258)
        ... 7 more
Caused by: org.apache.hadoop.
create table table_llv_N_C as select table_line_n_passed.chromosome_number,table_line_n_passed.position,table_line_c_passed.id from table_line_n_passed join table_line_c_passed on (table_line_n_passed.chromosome_number=table_line_c_passed.chromosome_number)

hive> desc table_line_n_passed;
OK
chromosome_number       string

position        int
id      string
ref     string
alt     string
quality double
filter  string
info    string
format  string
line6   string
Time taken: 0.854 seconds

Why am I getting this error, and how can I fix it? The full stack trace is given below.

2015-03-09 10:19:09,347 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 7 forwarding 1797000000 rows
2015-03-09 10:19:09,919 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 6 forwarding 1798000000 rows
2015-03-09 10:19:09,919 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 7 forwarding 1798000000 rows
2015-03-09 10:19:10,495 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 6 forwarding 1799000000 rows
2015-03-09 10:19:10,495 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 7 forwarding 1799000000 rows
2015-03-09 10:19:11,069 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 6 forwarding 1800000000 rows
2015-03-09 10:19:11,069 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 7 forwarding 1800000000 rows
2015-03-09 10:19:11,644 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 6 forwarding 1801000000 rows
at org.apache.hadoop.ipc.Client.call(Client.java:1238)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at $Proxy10.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at $Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:291)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1228)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1081)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:502)
at org.apache.hadoop.hive.ql.exec.JoinOperator.processOp(JoinOperator.java:134)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:249)
... 7 more

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hive-root/hive_2015-03-09_10-03-59_970_3646456754594156815-1/_task_tmp.-ext-10001/_tmp.000000_0 could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1361)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2362)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1754)

at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:620)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:803)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:803)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:742)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:745)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:847)
at org.apache.hadoop.hive.ql.exec.JoinOperator.processOp(JoinOperator.java:109)
... 9 more

Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hive-root/hive_2015-03-09_10-03-59_970_3646456754594156815-1/_task_tmp.-ext-10001/_tmp.000000_0 could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1361)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2362)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1754)

at org.apache.hadoop.ipc.Client.call(Client.java:1238)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at $Proxy10.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at $Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:291)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1228)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1081)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:502)
at org.apache.hadoop.hive.ql.exec.JoinOperator.processOp(JoinOperator.java:134)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:249)
... 7 more

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hive-root/hive_2015-03-09_10-03-59_970_3646456754594156815-1/_task_tmp.-ext-10001/_tmp.000000_0 could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1361)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2362)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1754)

at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:620)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:803)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:803)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:742)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:745)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:847)

1 answer

蔡楚
2023-03-14

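The root cause is probably insufficient disk space in the HDFS cluster. Two things point that way: the query seems to fail only after it has been running for a while, and the stack trace contains the following message: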

... could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and no node(s) are excluded in this operation.

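This message appears either when there is a network communication problem (for example, communication with the DataNodes has been lost), or when HDFS cannot service a write because it cannot find a DataNode with free blocks. Since your query did start successfully, that tends to rule out a network problem for me; instead, it looks like there is not enough disk space for the table your Hive query is trying to generate. You may want to check the current usage on the cluster, either through something like Ambari (if you have it installed) or from the command line with one of the following: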

hdfs dfs -df -h

Or, if you are running an older version, something like this:

hadoop fs -df -h
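
To make that check repeatable, here is a minimal sketch of a shell helper that reads the output of `hdfs dfs -df` and flags any filesystem whose usage is at or above a threshold. The function name `check_dfs_usage` and the 90% default are illustrative assumptions, not part of Hadoop, and the parsing assumes the usual five-column layout (Filesystem, Size, Used, Available, Use%), so adjust it if your version prints something different.

```shell
#!/bin/sh
# Sketch: flag filesystems whose usage is at or above a threshold.
# Reads `hdfs dfs -df` output on stdin; assumed columns:
#   Filesystem  Size  Used  Available  Use%
# check_dfs_usage and the 90% default are hypothetical, not Hadoop features.
check_dfs_usage() {
    threshold="${1:-90}"
    awk -v limit="$threshold" '
        NR == 1 { next }              # skip the header row
        NF >= 5 {
            pct = $NF                 # last column is Use%, e.g. "95%"
            sub(/%/, "", pct)         # strip the percent sign
            status = (pct + 0 >= limit) ? "LOW_SPACE" : "OK"
            print status, $1, pct "%"
        }
    '
}

# Usage on a cluster node (requires the hdfs client on PATH):
#   hdfs dfs -df / | check_dfs_usage 90
#   hdfs dfsadmin -report    # per-DataNode "DFS Remaining" figures
```

Omitting `-h` keeps the numbers unscaled, which makes the output easier to parse. `hdfs dfsadmin -report` is worth running as well, because a cluster can show free space in total while an individual DataNode is already full.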