Question:

Hadoop multi-node cluster: slave nodes fail to execute MapReduce tasks

禹德水
2023-03-14
hadoop/start-all.sh

The jps output is correct:

On the master:

li@master:~$ jps
12839 TaskTracker
11814 NameNode
12535 JobTracker
25131 Jps
12118 DataNode
12421 SecondaryNameNode

On the 5 slave nodes:

li@slave1:~/hadoop/logs$ jps
4605 TaskTracker
19407 Jps
4388 DataNode
hadoop/stop-all.sh
hadoop jar hadoop-examples-1.2.1.jar wordcount /user/li/gutenberg /user/li/gutenberg-output
14/03/06 17:11:09 INFO input.FileInputFormat: Total input paths to process : 7
14/03/06 17:11:09 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/03/06 17:11:09 WARN snappy.LoadSnappy: Snappy native library not loaded
14/03/06 17:11:10 INFO mapred.JobClient: Running job: job_201402211607_0014
14/03/06 17:11:11 INFO mapred.JobClient:  map 0% reduce 0%
14/03/06 17:11:17 INFO mapred.JobClient:  map 14% reduce 0%
14/03/06 17:11:19 INFO mapred.JobClient:  map 57% reduce 0%
14/03/06 17:11:20 INFO mapred.JobClient:  map 85% reduce 0%
14/03/06 17:11:21 INFO mapred.JobClient:  map 100% reduce 0%
14/03/06 17:11:24 INFO mapred.JobClient:  map 100% reduce 33%
14/03/06 17:11:27 INFO mapred.JobClient:  map 100% reduce 100%
14/03/06 17:11:28 INFO mapred.JobClient: Job complete: job_201402211607_0014
14/03/06 17:11:28 INFO mapred.JobClient: Counters: 30
14/03/06 17:11:28 INFO mapred.JobClient:   Job Counters 
14/03/06 17:11:28 INFO mapred.JobClient:     Launched reduce tasks=1
14/03/06 17:11:28 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=38126
14/03/06 17:11:28 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/03/06 17:11:28 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/03/06 17:11:28 INFO mapred.JobClient:     Rack-local map tasks=2
14/03/06 17:11:28 INFO mapred.JobClient:     Launched map tasks=7
14/03/06 17:11:28 INFO mapred.JobClient:     Data-local map tasks=5
14/03/06 17:11:28 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=9825
14/03/06 17:11:28 INFO mapred.JobClient:   File Output Format Counters 
14/03/06 17:11:28 INFO mapred.JobClient:     Bytes Written=1412505
14/03/06 17:11:28 INFO mapred.JobClient:   FileSystemCounters
14/03/06 17:11:28 INFO mapred.JobClient:     FILE_BYTES_READ=4462568
14/03/06 17:11:28 INFO mapred.JobClient:     HDFS_BYTES_READ=6950792
14/03/06 17:11:28 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=7810309
14/03/06 17:11:28 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=1412505
14/03/06 17:11:28 INFO mapred.JobClient:   File Input Format Counters 
14/03/06 17:11:28 INFO mapred.JobClient:     Bytes Read=6950001
14/03/06 17:11:28 INFO mapred.JobClient:   Map-Reduce Framework
14/03/06 17:11:28 INFO mapred.JobClient:     Map output materialized bytes=2915072
14/03/06 17:11:28 INFO mapred.JobClient:     Map input records=137146
14/03/06 17:11:28 INFO mapred.JobClient:     Reduce shuffle bytes=2915072
14/03/06 17:11:28 INFO mapred.JobClient:     Spilled Records=507858
14/03/06 17:11:28 INFO mapred.JobClient:     Map output bytes=11435849
14/03/06 17:11:28 INFO mapred.JobClient:     Total committed heap usage (bytes)=1195069440
14/03/06 17:11:28 INFO mapred.JobClient:     CPU time spent (ms)=16520
14/03/06 17:11:28 INFO mapred.JobClient:     Combine input records=1174991
14/03/06 17:11:28 INFO mapred.JobClient:     SPLIT_RAW_BYTES=791
14/03/06 17:11:28 INFO mapred.JobClient:     Reduce input records=201010
14/03/06 17:11:28 INFO mapred.JobClient:     Reduce input groups=128513
14/03/06 17:11:28 INFO mapred.JobClient:     Combine output records=201010
14/03/06 17:11:28 INFO mapred.JobClient:     Physical memory (bytes) snapshot=1252454400
14/03/06 17:11:28 INFO mapred.JobClient:     Reduce output records=128513
14/03/06 17:11:28 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=4080599040
14/03/06 17:11:28 INFO mapred.JobClient:     Map output records=1174991
li@slave1:~/hadoop/logs$ cat hadoop-li-tasktracker-slave1.log
2014-03-06 17:11:46,335 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201402211607_0014_m_000003_0 task's state:UNASSIGNED
2014-03-06 17:11:46,335 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201402211607_0014_m_000004_0 task's state:UNASSIGNED
2014-03-06 17:11:46,335 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201402211607_0014_m_000003_0 which needs 1 slots
2014-03-06 17:11:46,335 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201402211607_0014_m_000003_0 which needs 1 slots
2014-03-06 17:11:46,335 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201402211607_0014_m_000004_0 which needs 1 slots
2014-03-06 17:11:46,336 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 1 and trying to launch attempt_201402211607_0014_m_000004_0 which needs 1 slots
2014-03-06 17:11:46,394 INFO org.apache.hadoop.mapred.JobLocalizer: Initializing user li on this TT.
2014-03-06 17:11:46,544 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201402211607_0014_m_-862426792
2014-03-06 17:11:46,544 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_201402211607_0014_m_-862426792 spawned.
2014-03-06 17:11:46,545 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201402211607_0014_m_-696634639
2014-03-06 17:11:46,547 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_201402211607_0014_m_-696634639 spawned.
2014-03-06 17:11:46,549 INFO org.apache.hadoop.mapred.TaskController: Writing commands to /home/li/hdfstmp/mapred/local/ttprivate/taskTracker/li/jobcache/job_201402211607_0014/attempt_201402211607_0014_m_000003_0/taskjvm.sh
2014-03-06 17:11:46,551 INFO org.apache.hadoop.mapred.TaskController: Writing commands to /home/li/hdfstmp/mapred/local/ttprivate/taskTracker/li/jobcache/job_201402211607_0014/attempt_201402211607_0014_m_000004_0/taskjvm.sh
2014-03-06 17:11:48,382 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_201402211607_0014_m_-862426792 given task: attempt_201402211607_0014_m_000003_0
2014-03-06 17:11:48,383 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_201402211607_0014_m_-696634639 given task: attempt_201402211607_0014_m_000004_0
2014-03-06 17:11:51,457 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201402211607_0014_m_000004_0 1.0% 
2014-03-06 17:11:51,459 INFO org.apache.hadoop.mapred.TaskTracker: Task attempt_201402211607_0014_m_000004_0 is done.
2014-03-06 17:11:51,460 INFO org.apache.hadoop.mapred.TaskTracker: reported output size for attempt_201402211607_0014_m_000004_0  was 217654
2014-03-06 17:11:51,460 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 1
2014-03-06 17:11:51,470 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201402211607_0014_m_000003_0 1.0% 
2014-03-06 17:11:51,472 INFO org.apache.hadoop.mapred.TaskTracker: Task attempt_201402211607_0014_m_000003_0 is done.
2014-03-06 17:11:51,472 INFO org.apache.hadoop.mapred.TaskTracker: reported output size for attempt_201402211607_0014_m_000003_0  was 267026
2014-03-06 17:11:51,473 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 2
2014-03-06 17:11:51,628 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201402211607_0014_m_-696634639 exited with exit code 0. Number of tasks it ran: 1
2014-03-06 17:11:51,631 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201402211607_0014_m_-862426792 exited with exit code 0. Number of tasks it ran: 1
2014-03-06 17:11:56,052 INFO org.apache.hadoop.mapred.TaskTracker.clienttrace: src: 192.168.1.111:50060, dest: 192.168.1.116:47652, bytes: 267026, op: MAPRED_SHUFFLE, cliID: attempt_201402211607_0014_m_000003_0, duration: 47537998
2014-03-06 17:11:56,076 INFO org.apache.hadoop.mapred.TaskTracker.clienttrace: src: 192.168.1.111:50060, dest: 192.168.1.116:47652, bytes: 217654, op: MAPRED_SHUFFLE, cliID: attempt_201402211607_0014_m_000004_0, duration: 15832312
2014-03-06 17:12:02,319 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_201402211607_0014
2014-03-06 17:12:02,320 INFO org.apache.hadoop.mapred.UserLogCleaner: Adding job_201402211607_0014 for user-log deletion with retainTimeStamp:1394233922320
2014-03-06 17:12:06,293 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_201402211607_0014
2014-03-06 17:12:06,293 WARN org.apache.hadoop.mapred.TaskTracker: Unknown job job_201402211607_0014 being deleted.
1. Why did the 5 slave nodes not get any tasks assigned?
2. Why do slave2 and slave3 have different task logs from slave1, slave4, and slave6 when I set the same configuration on all of them?
3. Is this a multi-node configuration problem? How can I solve it?

1 answer

景品
2023-03-14

Your task nodes appear to have 2 map slots each:

2014-03-06 17:11:46,335 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201402211607_0014_m_000003_0 which needs 1 slots
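
To check the slot count across a whole log rather than a single line, the free-slot value can be extracted with a regular expression. This is a minimal Python sketch, assuming the TaskLauncher line format seen in the TaskTracker log in the question; the function name `free_slots_before_launch` is made up for illustration and is not part of Hadoop.

```python
import re

# Matches the TaskLauncher lines as they appear in the TaskTracker log.
FREE_SLOTS = re.compile(
    r"current free slots : (\d+) and trying to launch (attempt_\S+)"
)

def free_slots_before_launch(lines):
    """Return (attempt_id, free_slots) for every TaskLauncher line found."""
    pairs = []
    for line in lines:
        match = FREE_SLOTS.search(line)
        if match:
            pairs.append((match.group(2), int(match.group(1))))
    return pairs

# Two lines copied from the slave1 log in the question.
sample = [
    "2014-03-06 17:11:46,335 INFO org.apache.hadoop.mapred.TaskTracker: "
    "In TaskLauncher, current free slots : 2 and trying to launch "
    "attempt_201402211607_0014_m_000003_0 which needs 1 slots",
    "2014-03-06 17:11:46,336 INFO org.apache.hadoop.mapred.TaskTracker: "
    "In TaskLauncher, current free slots : 1 and trying to launch "
    "attempt_201402211607_0014_m_000004_0 which needs 1 slots",
]

pairs = free_slots_before_launch(sample)
for attempt, free in pairs:
    print(attempt, free)
```

The largest value seen before a map attempt launches (2 here) is at least a lower bound on the number of map slots configured on that node.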

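The JobTracker is aware of this and decides to place as many tasks as possible on a single node rather than spread them across as many nodes as possible. It probably does this for locality reasons (to minimize network traffic).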
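
The difference between packing and spreading can be illustrated with a toy model. The Python sketch below is not Hadoop's scheduler and ignores data locality entirely; it only assumes 7 map tasks (the "Launched map tasks=7" counter in the question), 6 TaskTrackers (the master plus 5 slaves), and 2 map slots per node. The node names and both functions are hypothetical.

```python
def pack(num_tasks, nodes, slots_per_node):
    """Fill every slot on a node before moving on to the next node."""
    assignment = {node: 0 for node in nodes}
    remaining = num_tasks
    for node in nodes:
        take = min(slots_per_node, remaining)
        assignment[node] = take
        remaining -= take
    return assignment

def spread(num_tasks, nodes, slots_per_node):
    """Hand out one task per node per round, respecting the slot limit."""
    assignment = {node: 0 for node in nodes}
    remaining = num_tasks
    while remaining > 0:
        progressed = False
        for node in nodes:
            if remaining > 0 and assignment[node] < slots_per_node:
                assignment[node] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            break  # every slot is full; leftover tasks would wait
    return assignment

# Hypothetical node names standing in for the 6 TaskTrackers.
nodes = ["node1", "node2", "node3", "node4", "node5", "node6"]

packed = pack(7, nodes, 2)
spread_out = spread(7, nodes, 2)
idle_when_packed = [node for node, count in packed.items() if count == 0]

print("packed:", packed)
print("spread:", spread_out)
print("idle when packed:", idle_when_packed)
```

Under the packing policy the 7 tasks fit on 4 nodes and 2 nodes receive nothing, which matches the kind of uneven assignment described in the question.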

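Your logs will differ depending on which tasks ran on each node. So when you run a job on the cluster, the logs on different nodes can diverge quickly, and they will not be spread evenly across the nodes.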
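
One way to see which nodes actually ran tasks is to list the task attempts registered in each TaskTracker log. This is a rough Python sketch, assuming the "LaunchTaskAction (registerTask)" line format from the slave1 log in the question. The helper name `attempts_per_node` and the second node ("idle-node") are hypothetical; its log is modeled on the KillJobAction line that also appears in the question.

```python
import re

# Matches the registration line the TaskTracker writes for each assigned attempt.
LAUNCH = re.compile(r"LaunchTaskAction \(registerTask\): (attempt_\S+)")

def attempts_per_node(logs_by_node):
    """Map each node name to the list of attempt IDs registered in its log."""
    result = {}
    for node, lines in logs_by_node.items():
        attempts = []
        for line in lines:
            match = LAUNCH.search(line)
            if match:
                attempts.append(match.group(1))
        result[node] = attempts
    return result

logs_by_node = {
    # Lines copied from the slave1 log in the question.
    "slave1": [
        "2014-03-06 17:11:46,335 INFO org.apache.hadoop.mapred.TaskTracker: "
        "LaunchTaskAction (registerTask): attempt_201402211607_0014_m_000003_0 "
        "task's state:UNASSIGNED",
        "2014-03-06 17:11:46,335 INFO org.apache.hadoop.mapred.TaskTracker: "
        "LaunchTaskAction (registerTask): attempt_201402211607_0014_m_000004_0 "
        "task's state:UNASSIGNED",
    ],
    # Hypothetical node that was only told to clean up the job.
    "idle-node": [
        "2014-03-06 17:12:02,319 INFO org.apache.hadoop.mapred.TaskTracker: "
        "Received 'KillJobAction' for job: job_201402211607_0014",
    ],
}

summary = attempts_per_node(logs_by_node)
for node, attempts in summary.items():
    print(node, len(attempts), attempts)
```

A node with an empty list received no task for that job, so its log will contain only housekeeping messages for it.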

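This uneven distribution does not indicate a problem; it is normal cluster behavior. Keep in mind that Hadoop is generally designed for batch workloads, which means a cluster is usually kept busy running many jobs at once, so you will not end up with idle nodes even if a particular job does not run on all of them.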

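One final note: in this particular case you seem to be getting different behavior from the tutorial you are following, possibly because you are running on AWS (using Elastic MapReduce). Apparently EMR has a custom scheduler that makes these mapping decisions on its own (how many slots each node gets and how tasks are distributed across those slots) without your having to configure it. For more details, see this answer: Hadoop: number of available map slots based on cluster size.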
