Question:

Cannot run mapreduce wordcount

吕成业
2023-03-14

I am trying to teach myself some hadoop basics, so I have built a simple hadoop cluster. It works, and I can put, ls, and cat files on the hdfs filesystem without any problems. So I took the next step and tried to run wordcount on a file I had put into hadoop, but I get the following error:

$ hadoop jar /home/hadoop/share/hadoop/mapreduce/*examples*.jar wordcount data/sectors.txt results
2018-06-06 07:57:36,936 INFO client.RMProxy: Connecting to ResourceManager at ansdb1/10.49.17.12:8040
2018-06-06 07:57:37,404 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hadoop/.staging/job_1528191458385_0014
2018-06-06 07:57:37,734 INFO input.FileInputFormat: Total input files to process : 1
2018-06-06 07:57:37,869 INFO mapreduce.JobSubmitter: number of splits:1
2018-06-06 07:57:37,923 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2018-06-06 07:57:38,046 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1528191458385_0014
2018-06-06 07:57:38,048 INFO mapreduce.JobSubmitter: Executing with tokens: []
2018-06-06 07:57:38,284 INFO conf.Configuration: resource-types.xml not found
2018-06-06 07:57:38,284 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2018-06-06 07:57:38,382 INFO impl.YarnClientImpl: Submitted application application_1528191458385_0014
2018-06-06 07:57:38,445 INFO mapreduce.Job: The url to track the job: http://ansdb1:8088/proxy/application_1528191458385_0014/
2018-06-06 07:57:38,446 INFO mapreduce.Job: Running job: job_1528191458385_0014
2018-06-06 07:57:45,499 INFO mapreduce.Job: Job job_1528191458385_0014 running in uber mode : false
2018-06-06 07:57:45,501 INFO mapreduce.Job:  map 0% reduce 0%
2018-06-06 07:57:45,521 INFO mapreduce.Job: Job job_1528191458385_0014 failed with state FAILED due to: Application application_1528191458385_0014 failed 2 times due to AM Container for appattempt_1528191458385_0014_000002 exited with  exitCode: 1
Failing this attempt.Diagnostics: [2018-06-06 07:57:43.301]Exception from container-launch.
Container id: container_1528191458385_0014_02_000001
Exit code: 1

[2018-06-06 07:57:43.304]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Please check whether your etc/hadoop/mapred-site.xml contains the below configuration:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>

[2018-06-06 07:57:43.304]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Please check whether your etc/hadoop/mapred-site.xml contains the below configuration:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>

For more detailed output, check the application tracking page: http://ansdb1:8088/cluster/app/application_1528191458385_0014 Then click on links to logs of each attempt.
. Failing the application.
2018-06-06 07:57:45,558 INFO mapreduce.Job: Counters: 0
$ jps
31858 ResourceManager
31544 SecondaryNameNode
6152 Jps
31275 DataNode
31132 NameNode
$ ssh ansdb2 jps
16615 NodeManager
21290 Jps
16478 DataNode

I can ls hadoop:

$ hadoop fs -ls /
Found 3 items
drwxrwxrwt   - hadoop supergroup          0 2018-06-06 07:58 /tmp
drwxr-xr-x   - hadoop supergroup          0 2018-06-05 11:46 /user
drwxr-xr-x   - hadoop supergroup          0 2018-06-05 07:50 /usr

hadoop version:

$ hadoop version
Hadoop 3.1.0
Source code repository https://github.com/apache/hadoop -r 16b70619a24cdcf5d3b0fcf4b58ca77238ccbe6d
Compiled by centos on 2018-03-30T00:00Z
Compiled with protoc 2.5.0
From source with checksum 14182d20c972b3e2105580a1ad6990
This command was run using /home/hadoop/share/hadoop/common/hadoop-common-3.1.0.jar

hadoop classpath:

$ hadoop classpath
/home/hadoop/etc/hadoop:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/share/hadoop/common/*:/home/hadoop/share/hadoop/hdfs:/home/hadoop/share/hadoop/hdfs/lib/*:/home/hadoop/share/hadoop/hdfs/*:/home/hadoop/share/hadoop/mapreduce/*:/home/hadoop/share/hadoop/yarn:/home/hadoop/share/hadoop/yarn/lib/*:/home/hadoop/share/hadoop/yarn/*

My environment variables:

# hadoop
## JAVA env variables
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.171-7.b10.el7.x86_64
export CLASSPATH=.:$JAVA_HOME/jre/lib:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar

## HADOOP env variables
export HADOOP_HOME=/home/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export YARN_HOME=$HADOOP_HOME
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME
export HADOOP_LIBEXEC_DIR=$HADOOP_HOME/libexec
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_INSTALL=$HADOOP_HOME

PATH=$PATH:$JAVA_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
$ cat $HADOOP_HOME/etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
 <property>
  <name>fs.defaultFS</name>
  <value>hdfs://ansdb1:9000/</value>
 </property>
</configuration>
$ cat $HADOOP_HOME/etc/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
 <property>
  <name>dfs.data.dir</name>
  <value>/data/hadoop/datanode</value>
 </property>
 <property>
  <name>dfs.name.dir</name>
  <value>/data/hadoop/namenode</value>
 </property>
 <property>
  <name>dfs.checkpoint.dir</name>
  <value>/data/hadoop/secondarynamenode</value>
 </property>
 <property>
  <name>dfs.replication</name>
  <value>2</value>
 </property>
</configuration>
$ cat $HADOOP_HOME/etc/hadoop/yarn-site.xml
<?xml version="1.0"?>
<configuration>
 <property>
  <name>yarn.resourcemanager.hostname</name>
  <value>ansdb1</value>
 </property>
 <property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
 </property>
 <property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
 </property>
 <property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>ansdb1:8025</value>
 </property>
 <property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>ansdb1:8030</value>
 </property>
 <property>
  <name>yarn.resourcemanager.address</name>
  <value>ansdb1:8040</value>
 </property>
</configuration>
$ cat $HADOOP_HOME/etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
 <property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
 </property>
</configuration>
$ find /home/hadoop -name '*.jar' -exec grep -Hls MRAppMaster {} \;
/home/hadoop/share/hadoop/mapreduce/sources/hadoop-mapreduce-client-app-3.1.0-sources.jar
/home/hadoop/share/hadoop/mapreduce/sources/hadoop-mapreduce-client-app-3.1.0-test-sources.jar
/home/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-app-3.1.0.jar 

Obviously I am missing something, so can anyone point me in the right direction?

1 Answer

司寇琨
2023-03-14

After a lot of searching on the same problem, I found https://mathsigit.github.io/blog_page/2017/11/16/hole-of-submitting-mr-of-hadoop300rc0/ (in Chinese). So I set the following properties in mapred-site.xml:

<property>
 <name>yarn.app.mapreduce.am.env</name>
 <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
 <name>mapreduce.map.env</name>
 <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
 <name>mapreduce.reduce.env</name>
 <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
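
Note that $HADOOP_HOME in these values is only resolved on the node when the container is launched, so it must also be set in the environment there. If in doubt, the literal install path can be used instead, as the error message itself suggests; in this setup that would be /home/hadoop, per the exports in the question:

<property>
 <name>yarn.app.mapreduce.am.env</name>
 <value>HADOOP_MAPRED_HOME=/home/hadoop</value>
</property>
<property>
 <name>mapreduce.map.env</name>
 <value>HADOOP_MAPRED_HOME=/home/hadoop</value>
</property>
<property>
 <name>mapreduce.reduce.env</name>
 <value>HADOOP_MAPRED_HOME=/home/hadoop</value>
</property>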

And everything works.
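
These are job-level properties, read from the client's mapred-site.xml at submit time, so re-running the job should be enough to verify the fix. A minimal check (paths as in the question; the part-r-00000 output name assumes the default single reducer):

$ hadoop fs -rm -r -f results    # clear any output left over from the failed attempt
$ hadoop jar /home/hadoop/share/hadoop/mapreduce/*examples*.jar wordcount data/sectors.txt results
$ hadoop fs -cat results/part-r-00000 | head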
