Question:

MapReduce job hangs

夏季萌

2023-03-14

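I'm new to Hadoop MapReduce. I have written a map-reduce job and am trying to run it on my local machine, but the job hangs once the map phase has finished.

The code is below; I can't figure out what I'm missing.

I have a custom key class: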

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;

public class AirlineMonthKey implements WritableComparable<AirlineMonthKey>{

Text airlineName;
Text month;

public AirlineMonthKey(){
    super();
}

public AirlineMonthKey(Text airlineName, Text month) {
    super();
    this.airlineName = airlineName;
    this.month = month;
}

public Text getAirlineName() {
    return airlineName;
}

public void setAirlineName(Text airlineName) {
    this.airlineName = airlineName;
}

public Text getMonth() {
    return month;
}

public void setMonth(Text month) {
    this.month = month;
}

@Override
public void readFields(DataInput in) throws IOException {
    // TODO Auto-generated method stub
    this.airlineName.readFields(in);
    this.month.readFields(in);
}

@Override
public void write(DataOutput out) throws IOException {
    // TODO Auto-generated method stub
    this.airlineName.write(out);
    this.month.write(out);      
}

@Override
public int compareTo(AirlineMonthKey airlineMonthKey) {
    // TODO Auto-generated method stub
    int diff = getAirlineName().compareTo(airlineMonthKey.getAirlineName());
    if(diff != 0){
        return diff;
    }

    int m1 = Integer.parseInt(getMonth().toString());
    int m2 = Integer.parseInt(airlineMonthKey.getMonth().toString());

    if(m1>m2){
        return -1;
    }
    else 
        return 1;
}


}

The mapper and reducer classes that use the custom key are as follows.

package com.mapresuce.secondarysort;

import java.io.IOException;
import java.io.StringReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

import com.opencsv.CSVReader;

public class FlightDelayByMonth {

public static class FlightDelayByMonthMapper extends
        Mapper<Object, Text, AirlineMonthKey, Text> {
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        String str = value.toString();
        // Reading Line one by one from the input CSV.
        CSVReader reader = new CSVReader(new StringReader(str));
        String[] split = reader.readNext();
        reader.close();

        String airlineName = split[6];
        String month = split[2];
        String year = split[0];
        String delayMinutes = split[37];
        String cancelled = split[41];

        if (!(airlineName.equals("") || month.equals("") || delayMinutes
                .equals(""))) {
            if (year.equals("2008") && cancelled.equals("0.00")) {
                AirlineMonthKey airlineMonthKey = new AirlineMonthKey(
                        new Text(airlineName), new Text(month));
                Text delay = new Text(delayMinutes);
                context.write(airlineMonthKey, delay);
                System.out.println("1");
            }
        }

    }
}

public static class FlightDelayByMonthReducer extends
        Reducer<AirlineMonthKey, Text, Text, Text> {


    public void reduce(AirlineMonthKey key, Iterable<Text> values,
            Context context) throws IOException, InterruptedException {
        for(Text val : values){
            context.write(new Text(key.getAirlineName().toString()+" "+key.getMonth().toString()), val);
        }
    }
}

public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {   
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args)
            .getRemainingArgs();
    if (otherArgs.length != 2) {
        System.err.println("Usage:<in> <out>");
        System.exit(2);
    }
    Job job = new Job(conf, "Average monthly flight delay");
    job.setJarByClass(FlightDelayByMonth.class);
    job.setMapperClass(FlightDelayByMonthMapper.class);
    job.setReducerClass(FlightDelayByMonthReducer.class);
    job.setOutputKeyClass(AirlineMonthKey.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}

I also create the job and its configuration in main. I don't know what I'm missing. I'm running all of this in a local environment.

2 answers

葛鸿熙

2023-03-14

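The problem was that AirlineMonthKey needs a default (no-argument) constructor, which I had, and that the instance variables of the custom key class must be initialized, which I had not done.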
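To make the fix concrete, here is a minimal sketch that runs without Hadoop on the classpath. `TextStandIn` and `AirlineMonthKeySketch` are hypothetical names: the first stands in for `org.apache.hadoop.io.Text`, and the second mirrors the key class from the question without implementing `WritableComparable`. The `roundTrip` helper imitates, as an assumption about what the framework does, how a key is rebuilt: an empty instance is created through the no-argument constructor and then filled by `readFields()`. If the fields were still null at that point, `readFields()` would throw a `NullPointerException`. The sketch also returns 0 from `compareTo` for equal keys, which the code in the question never does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for org.apache.hadoop.io.Text: a mutable holder
// whose content is filled in place by readFields().
class TextStandIn {
    private String value = "";

    TextStandIn() { }

    TextStandIn(String value) { this.value = value; }

    void write(DataOutput out) throws IOException { out.writeUTF(value); }

    void readFields(DataInput in) throws IOException { value = in.readUTF(); }

    @Override
    public String toString() { return value; }
}

// Sketch of the key with the fix applied: the no-arg constructor creates
// the field objects, so readFields() has something to fill in.
class AirlineMonthKeySketch implements Comparable<AirlineMonthKeySketch> {
    private TextStandIn airlineName;
    private TextStandIn month;

    AirlineMonthKeySketch() {
        this.airlineName = new TextStandIn();
        this.month = new TextStandIn();
    }

    AirlineMonthKeySketch(String airlineName, String month) {
        this.airlineName = new TextStandIn(airlineName);
        this.month = new TextStandIn(month);
    }

    void write(DataOutput out) throws IOException {
        airlineName.write(out);
        month.write(out);
    }

    void readFields(DataInput in) throws IOException {
        // Both calls would throw NullPointerException if the fields were null.
        airlineName.readFields(in);
        month.readFields(in);
    }

    @Override
    public int compareTo(AirlineMonthKeySketch other) {
        int diff = airlineName.toString().compareTo(other.airlineName.toString());
        if (diff != 0) {
            return diff;
        }
        int m1 = Integer.parseInt(month.toString());
        int m2 = Integer.parseInt(other.month.toString());
        // Descending by month, as in the question, but 0 when the months match.
        return Integer.compare(m2, m1);
    }

    @Override
    public String toString() { return airlineName + " " + month; }

    // Serializes a key, then rebuilds it from an empty instance via readFields().
    static AirlineMonthKeySketch roundTrip(AirlineMonthKeySketch key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            AirlineMonthKeySketch copy = new AirlineMonthKeySketch();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class KeyRoundTripDemo {
    public static void main(String[] args) {
        AirlineMonthKeySketch original = new AirlineMonthKeySketch("AA", "3");
        System.out.println(AirlineMonthKeySketch.roundTrip(original));
    }
}
```

In the real class the same idea would presumably amount to assigning `new Text()` to both fields inside the no-argument constructor of `AirlineMonthKey`.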

羊柏

2023-03-14

Try writing custom implementations of toString, equals, and hashCode in the AirlineMonthKey class.

See the link below.

http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/io/WritableComparable.html

Implementing hashCode() is important for key types.

Hope this helps.
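As a rough illustration of why hashCode() matters, here is a Hadoop-free sketch. `AirlineMonthKeyHashSketch` and `partitionFor` are hypothetical names, and plain `String` fields replace `Text` so the example runs with the JDK alone. `partitionFor` mirrors the formula used by Hadoop's default `HashPartitioner`, `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`. With the identity-based `Object.hashCode()`, two equal keys are not guaranteed to land in the same partition; with a value-based hashCode() they always do.

```java
import java.util.Objects;

// Hypothetical value-based variant of the key; String replaces Text here.
class AirlineMonthKeyHashSketch {
    private final String airlineName;
    private final String month;

    AirlineMonthKeyHashSketch(String airlineName, String month) {
        this.airlineName = airlineName;
        this.month = month;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof AirlineMonthKeyHashSketch)) {
            return false;
        }
        AirlineMonthKeyHashSketch other = (AirlineMonthKeyHashSketch) o;
        return airlineName.equals(other.airlineName) && month.equals(other.month);
    }

    @Override
    public int hashCode() {
        // Derived only from the field values, so equal keys hash identically.
        return Objects.hash(airlineName, month);
    }

    @Override
    public String toString() {
        return airlineName + " " + month;
    }

    // Mirrors the formula of Hadoop's default HashPartitioner.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}

class PartitionDemo {
    public static void main(String[] args) {
        AirlineMonthKeyHashSketch a = new AirlineMonthKeyHashSketch("AA", "3");
        AirlineMonthKeyHashSketch b = new AirlineMonthKeyHashSketch("AA", "3");
        // Two distinct but equal instances are routed to the same partition.
        System.out.println(AirlineMonthKeyHashSketch.partitionFor(a, 4)
                == AirlineMonthKeyHashSketch.partitionFor(b, 4));
    }
}
```

In a local run with a single reducer the partition is always 0, so a missing hashCode() would probably only show up once the job runs with several reducers.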
