Inverted Index (Map Reduce)

Use map reduce to build inverted index for given documents.

思路:

/**
 * Definition of OutputCollector:
 * class OutputCollector<K, V> {
 *     public void collect(K key, V value);
 *         // Adds a key/value pair to the output buffer
 * }
 * Definition of Document:
 * class Document {
 *     public int id;
 *     public String content;
 * }
 */
public class InvertedIndex {

    public static class Map {
        public void map(String _, Document value,
                        OutputCollector<String, Integer> output) {
            // Write your code here
            // Output the results into output buffer.
            // Ps. output.collect(String key, int value);
            StringTokenizer tokenizer = new StringTokenizer(value.content);
            while(tokenizer.hasMoreTokens()){
                String key = tokenizer.nextToken();
                output.collect(key, value.id);
            }
        }
    }

    public static class Reduce {
        public void reduce(String key, Iterator<Integer> values,
                           OutputCollector<String, List<Integer>> output) {
            // Write your code here
            // Output the results into output buffer.
            // Ps. output.collect(String key, List<Integer> value);
            List<Integer> list = new ArrayList<Integer>();
            int pre = -1;
            while(values.hasNext()){
                int val = values.next();
                if(val != pre) {
                    list.add(val);
                }
                pre = val;
            }
            output.collect(key, list);
        }
    }
}

 

發佈了619 篇原創文章 · 獲贊 13 · 訪問量 17萬+
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章