I ran a simple wordcount MapReduce example with a combiner, making a small change to the combiner's output, and the combiner's output is not merged by the reducer. The scenario is as follows.
public class wordcountcombiner extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException
    {
        int sum = 0;
        for (IntWritable val : values)
        {
            sum += val.get();
        }
        context.write(key, new IntWritable(sum));
        Text t = new Text("different"); // Added my own output
        context.write(t, new IntWritable(1)); // Added my own output
    }
}
In the combiner I added two extra lines that output a different word with a count of 1, and the reducer does not sum the counts for the word "different". The output is pasted below:
"different" 1
different 1
different 1
I 2
different 1
In 1
different 1
MapReduce 1
different 1
The 1
different 1
...
How can this happen?
Driver:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        // Configure and submit the word-count job.
        Job job = Job.getInstance(new Configuration());
        job.setJarByClass(wordcountmapper.class);
        job.setJobName("Word Count");
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setMapperClass(wordcountmapper.class);
        job.setCombinerClass(wordcountcombiner.class);
        job.setReducerClass(wordcountreducer.class);
        job.getConfiguration().set("fs.file.impl", "com.conga.services.hadoop.patch.HADOOP_7682.WinLocalFileSystem");
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Mapper:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class wordcountmapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private Text word = new Text();
    private IntWritable one = new IntWritable(1);

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException
    {
        // Tokenize the line and emit (word, 1) for every token.
        String line = value.toString();
        StringTokenizer token = new StringTokenizer(line);
        while (token.hasMoreTokens())
        {
            word.set(token.nextToken());
            context.write(word, one);
        }
    }
}
Combiner:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class wordcountcombiner extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException
    {
        // Sum the counts for this key, then emit the extra "different" record.
        int sum = 0;
        for (IntWritable val : values)
        {
            sum += val.get();
        }
        context.write(key, new IntWritable(sum));
        Text t = new Text("different");
        context.write(t, new IntWritable(1));
    }
}
Reducer:
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class wordcountreducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException
    {
        // Sum all counts received for this key and emit the total.
        int sum = 0;
        for (IntWritable val : values)
        {
            sum += val.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
The output is normal, because you have two lines doing the wrong thing. Why do you have this code:
Text t = new Text("different"); // Added my own output
context.write(t, new IntWritable(1)); // Added my own output
In the reducer you do the sum, and then you add "different 1" to the output...
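For reference, here is a minimal sketch of the combiner with those two extra lines removed (assuming everything else in the job stays exactly as posted). A combiner should only pre-aggregate the mapper's output, emitting the same keys and value types it receives, so the reducer can safely merge what it produces:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Pre-aggregates per-mapper counts without inventing new keys, so the
// reducer only ever sees (word, partialCount) pairs it can merge.
public class wordcountcombiner extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException
    {
        int sum = 0;
        for (IntWritable val : values)
        {
            sum += val.get();
        }
        context.write(key, new IntWritable(sum));
    }
}

Since this version is now identical to wordcountreducer, you could just as well pass wordcountreducer.class to job.setCombinerClass(...).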