当前位置: 首页 > 知识库问答 >
问题:

单节点Ubuntu集群上的RHadoop作业失败

姬心思
2023-03-14

当我们使用包“RMR”(RHadoop的一部分)并从RStudio中运行的R程序内部调用mapreduce()时,就会出现错误。

为了简化这篇文章,我展示了一个失败的非常简单的程序(其他更大的程序以相同的错误消息失败)

Sys.setenv(HADOOP_CMD="/usr/local/hadoop220/bin/hadoop")
Sys.setenv(HADOOP_STREAMING="/usr/local/hadoop220/share/hadoop/tools/lib/hadoop-streaming-2.2.0.jar")
library('rhdfs')
library('rmr2')
hdfs.init()
hdfs.ls("/user/hduser")
small.ints = to.dfs(1:1000)
mapreduce(
  input = small.ints, 
  map = function(k, v) cbind(v, v^2))

在R-Studio控制台上显示的错误有

> Sys.setenv(HADOOP_CMD="/usr/local/hadoop220/bin/hadoop")
> Sys.setenv(HADOOP_STREAMING="/usr/local/hadoop220/share/hadoop/tools/lib/hadoop-streaming-2.2.0.jar")
> library('rhdfs')
Loading required package: rJava

HADOOP_CMD=/usr/local/hadoop220/bin/hadoop

Be sure to run hdfs.init()
> library('rmr2')
Loading required package: Rcpp
Loading required package: RJSONIO
Loading required package: bitops
Loading required package: digest
Loading required package: functional
Loading required package: reshape2
Loading required package: stringr
Loading required package: plyr
Loading required package: caTools
> hdfs.init()
14/05/10 14:20:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> hdfs.ls("/user/hduser")
  permission  owner      group size          modtime                 file
1 drwxr-xr-x hduser supergroup    0 2014-05-07 17:44      /user/hduser/BT
2 drwxr-xr-x hduser supergroup    0 2014-05-09 07:14  /user/hduser/BT-out
3 drwxr-xr-x hduser supergroup    0 2014-05-09 20:30 /user/hduser/BTR-out
4 drwxr-xr-x hduser supergroup    0 2014-05-07 17:44  /user/hduser/BTj-in
> small.ints = to.dfs(1:1000)
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /usr/local/hadoop220/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/05/10 14:20:50 WARN util.NativeCodeLoader: ... using builtin-java classes where applicable

[ these two messages repeat multiple times ]

> mapreduce(
+   input = small.ints, 
+   map = function(k, v) cbind(v, v^2))

Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /usr/local/hadoop220/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/05/10 14:21:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

14/05/10 14:21:20 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.

packageJobJar: [/tmp/RtmpYCerEW/rmr-local-env282d4c7a3b53, /tmp/RtmpYCerEW/rmr-global-env282d77c9da92, /tmp/RtmpYCerEW/rmr-streaming-map282d4225651a, /tmp/hadoop-hduser/hadoop-unjar678942474363050554/] [] /tmp/streamjob8073315154972274831.jar tmpDir=null
14/05/10 14:21:21 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/05/10 14:21:21 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/05/10 14:21:22 INFO mapred.FileInputFormat: Total input paths to process : 1
14/05/10 14:21:22 INFO mapreduce.JobSubmitter: number of splits:2
14/05/10 14:21:22 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/05/10 14:21:22 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/05/10 14:21:22 INFO Configuration.deprecation: mapred.cache.files.filesizes is deprecated. Instead, use mapreduce.job.cache.files.filesizes
14/05/10 14:21:22 INFO Configuration.deprecation: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
14/05/10 14:21:22 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/05/10 14:21:22 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/05/10 14:21:22 INFO Configuration.deprecation: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
14/05/10 14:21:22 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/05/10 14:21:22 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/05/10 14:21:22 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/05/10 14:21:22 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/05/10 14:21:22 INFO Configuration.deprecation: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
14/05/10 14:21:22 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/05/10 14:21:22 INFO Configuration.deprecation: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
14/05/10 14:21:22 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/05/10 14:21:23 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1399709731242_0003
14/05/10 14:21:23 INFO impl.YarnClientImpl: Submitted application application_1399709731242_0003 to ResourceManager at /0.0.0.0:8032
14/05/10 14:21:23 INFO mapreduce.Job: The url to track the job: http://yantrajaal:8088/proxy/application_1399709731242_0003/
14/05/10 14:21:23 INFO mapreduce.Job: Running job: job_1399709731242_0003
14/05/10 14:21:30 INFO mapreduce.Job: Job job_1399709731242_0003 running in uber mode : false
14/05/10 14:21:30 INFO mapreduce.Job:  map 0% reduce 0%
14/05/10 14:21:43 INFO mapreduce.Job: Task Id : attempt_1399709731242_0003_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

14/05/10 14:21:44 INFO mapreduce.Job: Task Id : attempt_1399709731242_0003_m_000001_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

14/05/10 14:22:04 INFO mapreduce.Job: Task Id : attempt_1399709731242_0003_m_000000_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143

14/05/10 14:22:04 INFO mapreduce.Job: Task Id : attempt_1399709731242_0003_m_000001_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

14/05/10 14:22:17 INFO mapreduce.Job: Task Id : attempt_1399709731242_0003_m_000001_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

14/05/10 14:22:17 INFO mapreduce.Job: Task Id : attempt_1399709731242_0003_m_000000_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

14/05/10 14:22:26 INFO mapreduce.Job:  map 100% reduce 0%
14/05/10 14:22:26 INFO mapreduce.Job: Job job_1399709731242_0003 failed with state FAILED due to: Task failed task_1399709731242_0003_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0

14/05/10 14:22:26 INFO mapreduce.Job: Counters: 10
    Job Counters 
        Failed map tasks=7
        Killed map tasks=1
        Launched map tasks=8
        Other local map tasks=6
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=91997
        Total time spent by all reduces in occupied slots (ms)=0
    Map-Reduce Framework
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
14/05/10 14:22:26 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce,  : 
  hadoop streaming failed with error code 1
> 
14/05/10 14:21:43 INFO mapreduce.Job: Task Id : attempt_1399709731242_0003_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

共有1个答案

方增
2023-03-14

我通过更改rmr2,rhdfs,...包裹。基本上,您需要将所有的包安装在一个系统文件夹中,而不是自定义文件夹中。安装的位置似乎有问题。最初,我将软件包安装在一个自定义文件夹中:

/home/user/R/3.1

通过在以下位置重新安装软件包:

/usr/lib/R/library

我把代码弄好了。

 类似资料:
  • Disque 以集群模式运行, 每个服务器都是集群中的一个节点, 用户可以运行任意数量的节点, 只要确保每个节点的端口号不同即可。 在默认情况下, 运行 Disque 服务器程序 disque-server 将启动一个端口号为 7711 的 Disque 节点: $ ./disque-server 528:C 28 Apr 11:50:08.519 # Warning: no config fil

  • 不幸的是,我有一个工作是对RAM中的数据进行操作,但没有同步设置。我能看到的最简单的解决方案是让一个作业在所有节点上运行而不进行协调,就像使用一样。 是否有方法将作业配置为在LocalDataSourceJobStore下的所有节点上运行? 精确的定时并不重要,但作业必须每30分钟在每个节点上运行一次

  • 问题内容: 我想用一个非常简单的单节点群集启动Cassandra,但是我做不到。 我按照在 https://www.digitalocean.com/community/tutorials/how-to-install-cassandra- and-run-a-single-node-cluster-on-a-ubuntu- vps 基本上, 在VirtualBox上构建了一个新的CentOS 7

  • 我计划部署Kafka集群。我有以下查询: 1)为了保护生产者和消费者与Kafka broker的通信,可以使用SSL。如果我有一个由9个代理和3个zookeeper节点组成的集群,并且如果我不想使用自签名证书,我是否必须为每个节点购买一个证书(9个3证书,成本太高)? 正如我所读到的,生产者/消费者直接联系其中一个经纪人节点,而不联系动物园管理员。 谢谢, 病毒的

  • 问题内容: 我尝试在Google Container Engine的群集节点上安装ElasticSearch(最新版本),但是ElasticSearch需要变量:>> 262144。 如果我ssh到每个节点并手动运行: 一切正常,但是任何新节点将没有指定的配置。 所以我的问题是: 有没有办法在引导时在每个节点上加载系统配置?Deamon Set并不是一个好的解决方案,因为在Docker容器中,系统

  • 问题内容: 我在集群环境中将Quartz Scheduler用作Spring bean。 我有一些用@NotConcurrent注释的作业,它们每个集群运行一次(即,仅在一个节点中,仅在一个线程中)。 现在,我需要在集群的每个节点上运行一项作业。我删除了@NotConcurrent批注,但是它仅在一台计算机上的每个线程上运行。它不会在其他节点上触发。 我应该用什么来注释作业? 示例:带注释的Job