A Hadoop/MapReduce Solution for Friend Recommendation

Goal: if users A and C are both friends with B but are not friends with each other, recommend C to A and A to C, and list the mutual friends that A and C share.

For example:

Given the following friend relationships:

1 2,3,4,5,6,7,8
2 1,3,4,5,7
3 1,2
4 1,2,6
5 1,2
6 1,4
7 1,2
8 1

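In each line, the element before the space is a user ID, and the comma-separated list after the space contains that user's friend IDs.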
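To make the input format concrete, here is a minimal sketch of splitting one line into the user ID and the friend IDs. The class name FriendLineParser and the method parse are hypothetical helpers for illustration only; they are not part of the job shown later.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class FriendLineParser {
    /**
     * Parses "userId friend1,friend2,..." into a list whose first
     * element is the user ID and whose remaining elements are friend IDs.
     */
    static List<Long> parse(String line) {
        String[] parts = line.trim().split(" ");
        List<Long> ids = new ArrayList<>();
        ids.add(Long.valueOf(parts[0]));
        // A user without friends has nothing after the space.
        if (parts.length > 1) {
            for (String friend : parts[1].split(",")) {
                ids.add(Long.valueOf(friend));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Long> ids = parse("4 1,2,6");
        // prints: user=4 friends=[1, 2, 6]
        System.out.println("user=" + ids.get(0) + " friends=" + ids.subList(1, ids.size()));
    }
}
```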

[Figure: the friendship graph corresponding to the list above]


Expected output, where each entry has the form recommendedUser(numberOfMutualFriends:[mutual friends]):

1
2 6(2:[4, 1]),8(1:[1]),
3 4(2:[1, 2]),5(2:[2, 1]),6(1:[1]),7(2:[1, 2]),8(1:[1]),
4 3(2:[2, 1]),5(2:[1, 2]),7(2:[1, 2]),8(1:[1]),
5 3(2:[2, 1]),4(2:[1, 2]),6(1:[1]),7(2:[1, 2]),8(1:[1]),
6 2(2:[1, 4]),3(1:[1]),5(1:[1]),7(1:[1]),8(1:[1]),
7 3(2:[1, 2]),4(2:[2, 1]),5(2:[2, 1]),6(1:[1]),8(1:[1]),
8 2(1:[1]),3(1:[1]),4(1:[1]),5(1:[1]),6(1:[1]),7(1:[1]),


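User 1 is already friends with 2, 3, 4, 5, 6, 7, and 8, so nobody is recommended to user 1.

For user 2: recommend 6, because 2 and 6 can meet through 4 or 1; recommend 8, because 2 and 8 can meet through 1.

For user 3: recommend 4, because 3 and 4 can meet through 1 or 2; recommend 5, because 3 and 5 can meet through 2 or 1; recommend 6, because 3 and 6 can meet through 1; recommend 7, because 3 and 7 can meet through 1 or 2; recommend 8, because 3 and 8 can meet through 1.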

...
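As a cross-check of the expected output, the same recommendations can be computed directly in memory by intersecting friend sets. This is a brute-force sketch, not the MapReduce approach described in this post; the class BruteForceCheck and its methods are hypothetical names, and it assumes the friendship list is symmetric and that every user appears at the start of some line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical brute-force checker, for illustration only.
class BruteForceCheck {
    /** Builds the sample friendship graph from the post's input lines. */
    static Map<Long, Set<Long>> sample() {
        String[] lines = {"1 2,3,4,5,6,7,8", "2 1,3,4,5,7", "3 1,2",
                          "4 1,2,6", "5 1,2", "6 1,4", "7 1,2", "8 1"};
        Map<Long, Set<Long>> graph = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split(" ");
            Set<Long> friends = new TreeSet<>();
            for (String f : parts[1].split(",")) {
                friends.add(Long.valueOf(f));
            }
            graph.put(Long.valueOf(parts[0]), friends);
        }
        return graph;
    }

    /** For each user: candidate -> mutual friends, for non-friends sharing at least one friend. */
    static Map<Long, Map<Long, List<Long>>> recommend(Map<Long, Set<Long>> graph) {
        Map<Long, Map<Long, List<Long>>> out = new TreeMap<>();
        for (Long u : graph.keySet()) {
            Map<Long, List<Long>> recs = new TreeMap<>();
            for (Long v : graph.keySet()) {
                // Skip the user itself and anyone who is already a friend.
                if (u.equals(v) || graph.get(u).contains(v)) {
                    continue;
                }
                List<Long> mutual = new ArrayList<>();
                for (Long f : graph.get(u)) {
                    if (graph.get(v).contains(f)) {
                        mutual.add(f);
                    }
                }
                if (!mutual.isEmpty()) {
                    recs.put(v, mutual);
                }
            }
            out.put(u, recs);
        }
        return out;
    }

    public static void main(String[] args) {
        recommend(sample()).forEach((u, recs) -> System.out.println(u + " " + recs));
    }
}
```

Because the sketch iterates over sorted sets, the mutual friends come out in ascending order (for example [1, 4] for user 2 and candidate 6), whereas the job output lists them in arrival order ([4, 1]). The sets of recommendations are the same.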


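Approach:

For each input line, for example 4 1,2,6:

Map step:

Emit the direct-friend key-value pairs (4,[1,-1]) (4,[2,-1]) (4,[6,-1]), where -1 marks an existing friendship.

Emit the indirect-friend key-value pairs (1,[2,4]) (2,[1,4]) (1,[6,4]) (6,[1,4]) (2,[6,4]) (6,[2,4]). The pair (1,[2,4]) reads as "recommend 2 to user 1, because they can meet through 4"; the others are analogous.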
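The map step can be sketched in plain Java, without any Hadoop types, to show exactly which pairs one input line produces. The class MapEmitSketch and the method emit are hypothetical names used only for this illustration; the pairs are rendered as strings in the (key,[user,via]) notation used above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch of the map step, for illustration only.
class MapEmitSketch {
    /** Returns the pairs the map step would emit, formatted as "(key,[user,via])". */
    static List<String> emit(long user, long[] friends) {
        List<String> out = new ArrayList<>();
        // Direct friends: the key is the user itself, and -1 marks "already friends".
        for (long f : friends) {
            out.add("(" + user + ",[" + f + ",-1])");
        }
        // Indirect friends: any two friends of `user` can meet through `user`,
        // so each is suggested to the other.
        for (int i = 0; i < friends.length; i++) {
            for (int j = i + 1; j < friends.length; j++) {
                out.add("(" + friends[i] + ",[" + friends[j] + "," + user + "])");
                out.add("(" + friends[j] + ",[" + friends[i] + "," + user + "])");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Input line "4 1,2,6" yields 3 direct pairs and 6 indirect pairs.
        emit(4, new long[] {1, 2, 6}).forEach(System.out::println);
    }
}
```

A user with n friends produces n direct pairs and n(n-1) indirect pairs, so the map output grows quadratically with the length of the friend list.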


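Reduce step:

All direct-friend and indirect-friend pairs for a given user arrive at the same reducer, because they share that user's ID as the key.

For example, for user 4:

key=4

The following values arrive at the same reduce call: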

t2= FriendPair [user1=7, user2=1]
t2= FriendPair [user1=3, user2=2]
t2= FriendPair [user1=2, user2=-1]
t2= FriendPair [user1=6, user2=-1]
t2= FriendPair [user1=1, user2=2]
t2= FriendPair [user1=8, user2=1]
t2= FriendPair [user1=6, user2=1]
t2= FriendPair [user1=5, user2=1]
t2= FriendPair [user1=3, user2=1]
t2= FriendPair [user1=1, user2=6]
t2= FriendPair [user1=2, user2=1]
t2= FriendPair [user1=1, user2=-1]
t2= FriendPair [user1=7, user2=2]
t2= FriendPair [user1=5, user2=2]

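For user 4, maintain a Map<Long,List<Long>> that maps each recommended user to the list of friends that user shares with user 4.

For user 4's direct friends (records whose user2 is -1), no recommendation should be made, so put <user1,null> into the Map. The null also suppresses any indirect record for the same user1, regardless of which record arrives first.

For user 4's indirect friends, accumulate the mutual friends of all records that have the same recommended ID. For example: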

t2= FriendPair [user1=3, user2=2]

t2= FriendPair [user1=3, user2=1]

Here user 3 is recommended to user 4, and the mutual friends 2 and 1 are accumulated into one list, so <3,[2,1]> is put into the Map.
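The aggregation rules above can be tried outside Hadoop with a small sketch that replays the records listed for key=4. The class ReduceSketch, the field SAMPLE, and the method aggregate are hypothetical names for illustration; a TreeMap is used here only to make the printed order deterministic, whereas the job itself uses a HashMap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone sketch of the reduce step, for illustration only.
class ReduceSketch {
    // The {user1, user2} records that reach the reducer for key=4, in the order listed above.
    static final long[][] SAMPLE = {
        {7, 1}, {3, 2}, {2, -1}, {6, -1}, {1, 2}, {8, 1}, {6, 1},
        {5, 1}, {3, 1}, {1, 6}, {2, 1}, {1, -1}, {7, 2}, {5, 2}
    };

    /** Maps each candidate to its mutual friends; a null value means "already a friend". */
    static Map<Long, List<Long>> aggregate(long[][] pairs) {
        Map<Long, List<Long>> result = new TreeMap<>();
        for (long[] p : pairs) {
            long candidate = p[0];
            long via = p[1];
            if (via == -1) {
                // Direct friend: overwrite anything collected so far.
                result.put(candidate, null);
            } else if (!result.containsKey(candidate)) {
                List<Long> list = new ArrayList<>();
                list.add(via);
                result.put(candidate, list);
            } else if (result.get(candidate) != null) {
                // Known candidate that is not a direct friend: accumulate.
                result.get(candidate).add(via);
            }
            // Otherwise the candidate is a direct friend; ignore the record.
        }
        return result;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Long, List<Long>> e : aggregate(SAMPLE).entrySet()) {
            if (e.getValue() != null) {
                sb.append(e.getKey()).append("(").append(e.getValue().size())
                  .append(":").append(e.getValue()).append("),");
            }
        }
        // prints: 3(2:[2, 1]),5(2:[1, 2]),7(2:[1, 2]),8(1:[1]),
        System.out.println(sb);
    }
}
```

Users 1, 2, and 6 end up with null values and are skipped, which matches the row for user 4 in the expected output.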


Implementation:

1. The custom friend pair

package FriendRecommendation;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;


public class FriendPair implements Writable, WritableComparable<FriendPair> {
    private LongWritable user1 = new LongWritable();
    private LongWritable user2 = new LongWritable();
    public FriendPair(){}
    public LongWritable getUser1() {
        return user1;
    }
    public void setUser1(LongWritable user1) {
        this.user1 = user1;
    }
    public LongWritable getUser2() {
        return user2;
    }
    public void setUser2(LongWritable user2) {
        this.user2 = user2;
    }
    public FriendPair(Long user1,Long user2)
    {
        /*if(user1 > user2)
        {
            this.user1.set(user1);
            this.user2.set(user2);
        }
        else
        {
            this.user1.set(user2);
            this.user2.set(user1);
        }*/
        this.user1.set(user1);
        this.user2.set(user2);
    }
    
    @Override
    public int compareTo(FriendPair pair) {
        int compareValue = this.user1.compareTo(pair.user1);
        if (compareValue == 0) { // user1 is equal, so compare user2
            compareValue = this.user2.compareTo(pair.user2);
        }
        //return compareValue;      // to sort ascending 
        return -1*compareValue;     // to sort descending 
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        user1.readFields(in);
        user2.readFields(in);
    }

    @Override
    public void write(DataOutput out) throws IOException {
        user1.write(out);
        user2.write(out);
    }

    @Override
    public String toString() {
        return "FriendPair [user1=" + user1.get() + ", user2=" + user2.get() + "]";
    }

}


2. Mapper, Reducer, and driver

package FriendRecommendation;

import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;


public class Main extends Configured implements Tool  {
    public static class FriendRecommendationMapper extends Mapper<LongWritable, Text, LongWritable, FriendPair>
    {
        LongWritable outputKey = new LongWritable();
        @Override
        protected void map(LongWritable key, Text value,Context context)
                throws IOException, InterruptedException {
            // Example input line: 1 2,3,4,5,6,7,8
            System.out.println("map key=" + key);
            System.out.println("map value=" + value);
            Long userID = Long.valueOf( value.toString().split(" ")[0]);
            String[] friends_str = value.toString().split(" ")[1].split(",");
            // Emit every direct friendship, marked with -1
            for(String friend : friends_str)
            {
                FriendPair directFriend = new FriendPair(Long.valueOf(friend), -1L);
                outputKey.set(userID);
                System.out.println("directFriend:" + userID + ","+ directFriend);
                context.write(outputKey, directFriend);
            }
            // Emit every possible friendship: each pair of this user's friends, in both directions
            for(int i=0;i<friends_str.length;i++)
            {
                for(int j=i+1;j<friends_str.length;j++)
                {
                    FriendPair possibleFriend1 = new FriendPair(Long.valueOf(friends_str[j]), userID);
                    outputKey.set(Long.valueOf(friends_str[i]));
                    System.out.println("possibleFriend1:" + outputKey.get() + ","+ possibleFriend1);
                    context.write(outputKey, possibleFriend1);
                    FriendPair possibleFriend2 = new FriendPair(Long.valueOf(friends_str[i]), userID);
                    outputKey.set(Long.valueOf(friends_str[j]));
                    System.out.println("possibleFriend2:" + outputKey.get() + ","+ possibleFriend2);
                    context.write(outputKey, possibleFriend2);
                }
            }
        }
    }
    
    public static class FriendRecommendationReducer extends Reducer<LongWritable, FriendPair, Text, Text>
    {
        @Override
        protected void reduce(
                LongWritable key,
                Iterable<FriendPair> values,
                Context context)
                throws IOException, InterruptedException {
            System.out.println("reduce key = " + key);
            
            Map<Long,List<Long>> mutualFriends = new HashMap<Long,List<Long>>();
            Iterator<FriendPair> iterator = values.iterator();
            while(iterator.hasNext())
            {
                FriendPair t2 = iterator.next();
                System.out.println("t2= " + t2);
                Long toUser = t2.getUser1().get();
                Long mutualFriend = t2.getUser2().get();
                boolean alreadyFriend = (mutualFriend == -1);
                if(mutualFriends.containsKey(toUser))
                {
                    if(alreadyFriend)
                    {
                        mutualFriends.put(toUser, null);
                    }
                    else if(mutualFriends.get(toUser) != null)
                    {
                        mutualFriends.get(toUser).add(mutualFriend);
                    }
                }
                else
                {
                    if(alreadyFriend)
                    {
                        mutualFriends.put(toUser, null);
                    }
                    else
                    {
                        List<Long> list = new ArrayList<Long>();
                        list.add(mutualFriend);
                        mutualFriends.put(toUser,list);
                    }
                }
            }
            String reducerOutput = buildOutput(mutualFriends);
            Text outputKey = new Text();
            Text outputValue = new Text();
            outputKey.set("" + key);
            outputValue.set(reducerOutput);
            context.write(outputKey, outputValue);
        }
    }
 
    public static String buildOutput(Map<Long,List<Long>> map)
    {
        String output = "";
        for(Map.Entry<Long, List<Long>> entry : map.entrySet())
        {
            Long K = entry.getKey();
            List<Long> V = entry.getValue();
            if(V!=null)
            output += K + "(" + V.size() + ":" + V + "),";
        }
        return output;
    }
    
    public static void main(String[] args) throws Exception {
        // Fall back to the sample paths only when none are given on the command line.
        if (args.length < 2) {
            args = new String[] { "input/friends2.txt", "output/friends2" };
        }
        int jobStatus = submitJob(args);
        System.exit(jobStatus);
    }

    public static int submitJob(String[] args) throws Exception {
        int jobStatus = ToolRunner.run(new Main(), args);
        return jobStatus;
    }
    
    @Override
    public int run(String[] args) throws Exception {
        // Job.getInstance replaces the deprecated Job constructor.
        Job job = Job.getInstance(getConf());
        job.setJarByClass(Main.class);
        job.setJobName("CommonFriendsDriver");


        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

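        // The map output types differ from the reducer output types,
        // so declare both explicitly.
        job.setMapOutputKeyClass(LongWritable.class);
        job.setMapOutputValueClass(FriendPair.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);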

        job.setMapperClass(FriendRecommendationMapper.class);
        job.setReducerClass(FriendRecommendationReducer.class);

        // args[0] = input directory
        // args[1] = output directory
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        boolean status = job.waitForCompletion(true);
        return status ? 0 : 1;
    }

}

Result:

1
2 6(2:[4, 1]),8(1:[1]),
3 4(2:[1, 2]),5(2:[2, 1]),6(1:[1]),7(2:[1, 2]),8(1:[1]),
4 3(2:[2, 1]),5(2:[1, 2]),7(2:[1, 2]),8(1:[1]),
5 3(2:[2, 1]),4(2:[1, 2]),6(1:[1]),7(2:[1, 2]),8(1:[1]),
6 2(2:[1, 4]),3(1:[1]),5(1:[1]),7(1:[1]),8(1:[1]),
7 3(2:[1, 2]),4(2:[2, 1]),5(2:[2, 1]),6(1:[1]),8(1:[1]),
8 2(1:[1]),3(1:[1]),4(1:[1]),5(1:[1]),6(1:[1]),7(1:[1]),


[Figure: detailed walkthrough of the execution for another, simpler input]



