The MapReduce "Eight-Legged Essay"

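The eight-legged essay (baguwen) is also known as zhiyi, zhiyi (制藝), shiwen, or babiwen. The "leg" (gu) in the name refers to paired, parallel passages. An eight-legged essay follows a relatively fixed format, and its topics are drawn from the Four Books and Five Classics, most often from the Four Books.
The same idea applies here: a MapReduce program also follows a general-purpose template. We fill that template with our own business code to meet the actual requirement. This article uses WordCount as the example.
Without further ado, here is the code.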


import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;
import java.util.StringTokenizer;

/**
 * MapReduce "eight-legged essay" template,
 * using WordCount as the example.
 *
 * @author Atrox
 * @date 2019/08/27
 */
public class MapReduceModel {


    /**
     * Business mapper: extends Mapper and overrides map().
     * Emits one (word, 1) pair for every token in the input line.
     */
    public static class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final Text mapOutputKey = new Text();
        private static final IntWritable mapOutputValue = new IntWritable(1);

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer stringTokenizer = new StringTokenizer(value.toString());
            while (stringTokenizer.hasMoreTokens()) {
                String s = stringTokenizer.nextToken();
                mapOutputKey.set(s);
                context.write(mapOutputKey, mapOutputValue);
            }
        }
    }

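    /**
     * Business reducer: extends Reducer and overrides reduce().
     * Sums the counts emitted for each word.
     */
    public static class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        // Instance field rather than static, so reducer instances never share mutable state
        private final IntWritable outputValue = new IntWritable();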

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            outputValue.set(sum);
            context.write(key, outputValue);
        }
    }

    /**
     * MapReduce driver
     */
    public int run(String[] args) throws Exception {
        // Build the Configuration: run with the local job runner on the local file system
        Configuration configuration = new Configuration();
        configuration.set("mapreduce.framework.name", "local");
        configuration.set("fs.defaultFS", "file:///");

        // Create the Job
        Job job = Job.getInstance(configuration, this.getClass().getSimpleName());

        // Set the class used to locate the job jar
        job.setJarByClass(this.getClass());

        // Set the input path
        Path inPath = new Path(args[0]);
        FileInputFormat.addInputPath(job, inPath);

        // Configure the map phase
        job.setMapperClass(WordCountMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        // Configure the reduce phase
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Set the output path
        Path outPath = new Path(args[1]);
        FileOutputFormat.setOutputPath(job, outPath);

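        // Submit the job and block until it finishes. waitForCompletion()
        // submits the job itself, so a separate submit() call is not needed.
        boolean isSuccess = job.waitForCompletion(true);
        return isSuccess ? 0 : 1;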
    }

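    public static void main(String[] args) throws Exception {
        // Fall back to the sample paths only when none are given on the command line
        if (args.length < 2) {
            args = new String[]{"/input/word.txt", "/output/result/"};
        }
        int status = new MapReduceModel().run(args);
        System.out.println(status);
    }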

}
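
To make the division of labor in the template easier to see, here is a minimal plain-Java sketch of the same WordCount data flow with no Hadoop dependency. It is only an illustration under simplifying assumptions. The class LocalWordCount and its methods map, shuffle, reduce and wordCount are hypothetical names made up for this sketch and are not part of Hadoop. Everything runs in memory in a single JVM, and the TreeMap grouping merely stands in for the shuffle and sort that the framework performs between the mapper and the reducer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical, Hadoop-free illustration of the WordCount data flow.
final class LocalWordCount {

    // Map phase: emit one (word, 1) pair per whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            pairs.add(Map.entry(tokenizer.nextToken(), 1));
        }
        return pairs;
    }

    // Stand-in for shuffle and sort: group all values by key, keys in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum every count that was grouped under one key.
    static int reduce(String key, Iterable<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: wire the three phases together, as the Job does in the template.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(pairs).entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {hadoop=1, hello=2, mapreduce=1}
        System.out.println(wordCount(List.of("hello hadoop", "hello mapreduce")));
    }
}
```

The sketch needs Java 9 or later because of Map.entry and List.of. In the real template you write only the map and reduce steps; the grouping step and the wiring done here by wordCount are handled by the framework and the Job configuration.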
