Word Count Example of Hadoop V1.0 – Implementing the Mapper

2019-06-17 22:06

This article continues the series by looking at the implementation of the Mapper.

Mapper

01 public static class Map
02         extends Mapper<LongWritable, Text, Text, IntWritable> {
03     private final static IntWritable one = new IntWritable(1);
04     private Text word = new Text();
05 
06     public void map(LongWritable key, Text value, Context context)
07             throws IOException, InterruptedException {
08         String line = value.toString();
09         StringTokenizer tokenizer = new StringTokenizer(line);
10         while (tokenizer.hasMoreTokens()) {
11             word.set(tokenizer.nextToken());
12             context.write(word, one);
13         }
14     }
15 }

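Here we implement the Map.class that was specified in the Driver. The class extends Mapper<LongWritable, Text, Text, IntWritable>; the four type parameters are, in order, the input key type, the input value type, the Mapper's output key type, and the Mapper's output value type.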

03     private final static IntWritable one = new IntWritable(1);
04     private Text word = new Text();

These two lines define the objects the mapper emits. The types make this clear: Text is the mapper's output key and IntWritable is the mapper's output value.

06     public void map(LongWritable key, Text value, Context context)

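A class that extends Mapper overrides the map function. LongWritable and Text are the mapper's input key and value, and Context is the tool through which the mapper interacts with the Hadoop framework. It gives access to configuration data and is also used to emit key-value pairs.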

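The getConfiguration() method returns a Configuration that holds the configuration data of the Hadoop job. A programmer can add arbitrary key-value pairs to it on the job before submission (for example, job.getConfiguration().set("Key", "Value")) and, of course, read them back inside the task (context.getConfiguration().get("Key")). Reading such values is usually done in the Mapper's setup() method.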

map(KeyInType, ValInType, Context) is called by the Mapper.run() method. After the map method has processed a record, it emits its results through Context.write(KeyOutType, ValOutType).

When the mapper has finished, the application can do any necessary wrap-up work by overriding the Mapper's cleanup() method.
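The calling order described above (setup once, map once per record, cleanup once) can be sketched without a Hadoop installation. This is only a plain-JDK illustration, not Hadoop code: SketchContext, SketchMapper, and the "sketch.lowercase" key are hypothetical names invented for this example, and a HashMap stands in for the real Configuration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-JDK stand-in for the Mapper lifecycle; none of these are Hadoop classes.
public class MapperLifecycleSketch {

    /** Hypothetical stand-in for Context: holds settings and collects output. */
    static class SketchContext {
        final Map<String, String> config = new HashMap<>();
        final List<String> output = new ArrayList<>();

        void write(String key, int value) {
            output.add(key + "=" + value);
        }
    }

    /** Hypothetical mapper exposing the same three hooks as a Hadoop Mapper. */
    static class SketchMapper {
        private boolean lowerCase;
        private int records;

        // Called once before any record: read job-level settings here.
        void setup(SketchContext context) {
            lowerCase = Boolean.parseBoolean(
                    context.config.getOrDefault("sketch.lowercase", "false"));
        }

        // Called once per input record: emit the line as the key with value 1.
        void map(long offset, String line, SketchContext context) {
            records++;
            context.write(lowerCase ? line.toLowerCase() : line, 1);
        }

        // Called once after the last record: wrap-up work goes here.
        void cleanup(SketchContext context) {
            context.write("#records", records);
        }

        // Same order as described for Mapper.run(): setup, map per record, cleanup.
        void run(List<String> lines, SketchContext context) {
            setup(context);
            long offset = 0;
            for (String line : lines) {
                map(offset, line, context);
                offset += line.length() + 1; // pretend each line ends with '\n'
            }
            cleanup(context);
        }
    }

    static List<String> demo() {
        SketchContext context = new SketchContext();
        context.config.put("sketch.lowercase", "true");
        new SketchMapper().run(Arrays.asList("Hello World", "hello"), context);
        return context.output;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [hello world=1, hello=1, #records=2]
    }
}
```

In a real job the framework drives this loop itself, so the application only overrides the hooks it needs.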

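The types of the Mapper's output pairs do not have to match the types of its input pairs, and a given input pair may produce zero or more output pairs.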
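To make the "zero or more output pairs" point concrete, here is a small plain-JDK sketch of the tokenizing loop from the map method. It assumes nothing about Hadoop: TokenEmitSketch and emit are hypothetical names, and a list of Map.Entry objects stands in for the calls to context.write.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

public class TokenEmitSketch {

    // Mirrors the loop in map(): one (word, 1) pair per whitespace-separated token.
    static List<Map.Entry<String, Integer>> emit(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            pairs.add(new SimpleEntry<String, Integer>(tokenizer.nextToken(), 1));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // One input line, six output pairs; repeated words are not merged here.
        System.out.println(emit("to be or not to be"));
        // A blank line yields no output pairs at all.
        System.out.println(emit("   "));
    }
}
```

Repeated words each produce their own pair; summing the counts is left to the reduce side of the word count job.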

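Another role of Context is to report progress: it can send arbitrary application-level status information to the Hadoop job, or simply signal that the task is still alive.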

Reposted from: https://www.cnblogs.com/licheng/archive/2011/11/08/2241725.html
