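1. Variables declared in a Reducer class outside the reduce function are instance fields of the Reducer object. A reduce task creates one Reducer instance and calls reduce once per key, so every reduce call in that task can contribute to such a variable, and its value carries over from one key to the next. In the code below, sum and count are shared this way, so the average written for each key is really a running average over all values the task has seen so far: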
public static class AvgCalReducer extends Reducer<EntityEntityWritable, FloatWritable, EntityEntityWritable, FloatWritable>
{
    FloatWritable avg;
    // Instance fields: never reset, so they accumulate across every key this reduce task handles.
    float sum = 0;
    int count = 0;

    @Override
    public void reduce(EntityEntityWritable key, Iterable<FloatWritable> values, Context context) throws IOException, InterruptedException
    {
        System.out.println("reducer starting:");
        for (FloatWritable value : values)
        {
            sum = sum + value.get();
            count++;
            System.out.println(" key = " + key + " value = " + value.get());
        }
        // Running average over all values seen so far, not only this key's values.
        System.out.println("average:" + sum / count);
        System.out.println("this reducer ending...");
        avg = new FloatWritable(sum / count);
        context.write(key, avg);
    }
}
If sum and count should be changed only by the current reduce call, that is, computed only over the values belonging to a single key, declare them inside the reduce function:
public static class AvgCalReducer extends Reducer<EntityEntityWritable, FloatWritable, EntityEntityWritable, FloatWritable>
{
    FloatWritable avg;

    @Override
    public void reduce(EntityEntityWritable key, Iterable<FloatWritable> values, Context context) throws IOException, InterruptedException
    {
        // Local variables: start from zero on every call, so each key gets its own sum and count.
        float sum = 0;
        int count = 0;
        System.out.println("reducer starting:");
        for (FloatWritable value : values)
        {
            sum = sum + value.get();
            count++;
            System.out.println(" key = " + key + " value = " + value.get());
        }
        // Average of this key's values only.
        System.out.println("average:" + sum / count);
        System.out.println("this reducer ending...");
        avg = new FloatWritable(sum / count);
        context.write(key, avg);
    }
}
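The difference between the two versions can be reproduced without a cluster. The following is only a minimal sketch in plain Java, not Hadoop code: FieldStateReducer and LocalStateReducer are hypothetical stand-ins for the two AvgCalReducer variants, String keys and List<Float> values replace EntityEntityWritable and Iterable<FloatWritable>, and the two reduce calls on one object imitate one reduce task handling two keys.

```java
import java.util.Arrays;
import java.util.List;

public class ReducerStateDemo {

    // Stand-in for the first version: sum and count are instance fields,
    // so they keep their values between reduce() calls.
    static class FieldStateReducer {
        private float sum = 0;
        private int count = 0;

        float reduce(String key, List<Float> values) {
            for (float v : values) {
                sum += v;
                count++;
            }
            return sum / count;
        }
    }

    // Stand-in for the second version: sum and count are locals,
    // so every reduce() call starts again from zero.
    static class LocalStateReducer {
        float reduce(String key, List<Float> values) {
            float sum = 0;
            int count = 0;
            for (float v : values) {
                sum += v;
                count++;
            }
            return sum / count;
        }
    }

    public static void main(String[] args) {
        List<Float> a = Arrays.asList(1f, 3f);    // true average 2.0
        List<Float> b = Arrays.asList(10f, 20f);  // true average 15.0

        FieldStateReducer shared = new FieldStateReducer();
        System.out.println("fields, key a: " + shared.reduce("a", a)); // 2.0
        System.out.println("fields, key b: " + shared.reduce("b", b)); // 8.5 = (1+3+10+20)/4

        LocalStateReducer local = new LocalStateReducer();
        System.out.println("locals, key a: " + local.reduce("a", a));  // 2.0
        System.out.println("locals, key b: " + local.reduce("b", b));  // 15.0
    }
}
```

With fields, the result for key b is contaminated by the values of key a. If a task-wide total is what is actually wanted, fields are the right tool, but then the result is usually written once at the end of the task rather than once per key.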
2. Sequentially chained MapReduce jobs, using two jobs as an example:
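// Mapper, reducer and input/output path settings are omitted.
Configuration conf1 = new Configuration();
Job job1 = Job.getInstance(conf1, "Job1"); // new Job(conf1, "Job1") is deprecated in Hadoop 2.x and later
job1.waitForCompletion(true);              // blocks until Job1 has finished
Configuration conf2 = new Configuration();
Job job2 = Job.getInstance(conf2, "Job2");
job2.waitForCompletion(true);              // starts only after Job1 has returned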
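Note that the familiar System.exit(job.waitForCompletion(true) ? 0 : 1) cannot be used for the first job here. If job1.waitForCompletion(true) is replaced with System.exit(job1.waitForCompletion(true) ? 0 : 1), the JVM exits as soon as job1 finishes, whether it succeeded or failed, so job2 never gets a chance to run.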
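One way to keep a meaningful exit code while still running every job is to collect the results and decide on the exit code only once, after the last job. The sketch below is plain Java under stated assumptions: runChain is a hypothetical helper, and each BooleanSupplier is a stand-in for a call such as job1.waitForCompletion(true) (the real call throws checked exceptions, which a real driver would have to handle). Stopping at the first failed step is a design choice of this sketch, not something the original code does.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

public class JobChainDemo {

    // Runs the steps in order and stops at the first one that fails.
    // Returns 0 if every step succeeded, 1 otherwise.
    static int runChain(List<BooleanSupplier> steps) {
        for (BooleanSupplier step : steps) {
            if (!step.getAsBoolean()) {
                return 1; // later steps would read missing input, so skip them
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Stand-ins for job1.waitForCompletion(true) and job2.waitForCompletion(true).
        BooleanSupplier job1 = () -> {
            System.out.println("Job1 finished");
            return true;
        };
        BooleanSupplier job2 = () -> {
            System.out.println("Job2 finished");
            return true;
        };

        int exitCode = runChain(Arrays.asList(job1, job2));
        System.out.println("exit code: " + exitCode);

        // The exit decision is made exactly once, after the whole chain has run.
        if (exitCode != 0) {
            System.exit(exitCode);
        }
    }
}
```

In a real driver the same shape would be: call job1.waitForCompletion(true), go on to job2 only if it returned true, and pass the final result to System.exit at the very end.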