Hadoop Series (3): The Point of MapReduce -- What Does MapReduce Actually Do?

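A note up front: I'm "nicedays", a big data developer who enjoys making visual effects, listening to music, and sharing what I learn. The name comes from HAVE A NICE DAY, a song by the band World Order. The road here has had plenty of bumps and setbacks, and I have finally understood that a nice day is something you have to make for yourself.
Time flies, so cherish the present~~
I write this blog partly to summarize and record what I study, and partly in the hope of helping others who are interested in big data. If you are interested in big data and machine learning too, you can follow my updates at https://blog.csdn.net/qq_35050438 and we can dig into the value of data and AI together~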

I: MapReduce? Map and Reduce?

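Preface: Every time I go back over MapReduce I end up marveling at whoever came up with such a clever way of processing data. It is not used as widely as it once was, but its core idea is not going out of date.
It applies divide and conquer, and it also borrows the ideas of mapping and function application. When Google published it in 2004, it addressed the problem of processing very large datasets.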

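II: What Exactly Is MapReduce?

Divide and conquer

Yes, I think its essence is divide and conquer. MapReduce comes from a Google paper, and it leans heavily on the divide-and-conquer idea, splitting a data-processing job into two main steps: Map and Reduce.
Written as an expression, the process looks like this:
{Key1, Value1} -> {Key2, List<Value2>} -> {Key3, Value3}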
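
To make the three stages of that expression concrete, here is a minimal single-process sketch in plain Java, using word count as the example. It does not use the Hadoop API; the class name `MapReduceSketch` and its `map`, `shuffle`, `reduce`, and `run` methods are names I made up for illustration, and a real job would spread this work over many machines.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class MapReduceSketch {
    // Map: {Key1, Value1} -> list of {Key2, Value2}.
    // Here Key1 is the line offset and Value1 is the line text.
    static List<Map.Entry<String, Integer>> map(long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Shuffle: collect the intermediate pairs into {Key2, List<Value2>}, ordered by key.
    static SortedMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        SortedMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce: {Key2, List<Value2>} -> {Key3, Value3}. Here it sums the counts.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Runs the three stages in sequence on in-memory lines.
    static SortedMap<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> intermediate = new ArrayList<>();
        long offset = 0;
        for (String line : lines) {
            intermediate.addAll(map(offset, line));
            offset += line.length() + 1;
        }
        SortedMap<String, Integer> result = new TreeMap<>();
        shuffle(intermediate).forEach((key, values) -> result.put(key, reduce(key, values)));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a b a", "b c"))); // prints {a=2, b=2, c=1}
    }
}
```

The middle `shuffle` step is the part the framework does for you; as a developer you normally write only `map` and `reduce`.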

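HDFS vs. B+ tree mechanisms:

Now that we have the basic idea, it is clear we need a distributed storage system that fits it.
So why not use a traditional RDBMS cluster for batch analysis of big data?

Let's first look at the characteristics of MapReduce: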

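Natural characteristics of MapReduce:

MapReduce suits problems that need to analyze the whole dataset in batch fashion.
MapReduce suits applications that write data once and read it many times.
MapReduce also works well with unstructured data, because it interprets the structure only when the data is processed. That structure is not an intrinsic property of the data but an abstraction applied at read time, unlike a relational database, which normalizes relations and enforces integrity in order to store the data.
And because MapReduce is distributed, it has a high ceiling: as nodes are added to the cluster, processing speed goes up.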

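How HDFS fits MapReduce:

A natively distributed architecture
Can store unstructured data
Friendly to bulk storage, but not to continuous in-place updates and reads.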

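Limitations of a traditional RDBMS:

The trend in disk hardware is that seek time improves far more slowly than transfer rate. Seeking is the process of moving the disk head to a particular position on disk to read or write. It is the main source of latency in disk operations, whereas the transfer rate depends on the disk's bandwidth.

MySQL, as a representative relational database, is built on B+ trees, so working through a large dataset involves a lot of disk seeking (compared with streaming reads). And when the database has to update a large share of its records, a B+ tree is much less efficient than MapReduce, which rebuilds the data with sort/merge (when only a small fraction is updated, the B+ tree has the advantage).

An RDBMS is suited to point queries and updates: once the dataset is indexed, the database can offer low-latency retrieval and fast updates of small amounts of data,
so relational databases are the better fit for continuously updated datasets.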
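
A back-of-the-envelope calculation shows why seeks dominate. The sketch below compares updating records in place (one seek per record) with streaming the whole dataset through once to read and once to write. Every figure in it (1 TB of data, 100-byte records, 10 ms per seek, 100 MB/s transfer) is a round number I assumed for illustration, not a measurement, and `SeekVsStream` is just a made-up class name.

```java
public class SeekVsStream {
    // Seconds spent if every updated record costs one disk seek.
    static double seekSeconds(long updatedRecords, double seekMillis) {
        return updatedRecords * seekMillis / 1000.0;
    }

    // Seconds spent streaming the whole dataset: one pass to read, one pass to write.
    static double streamSeconds(long datasetBytes, double bytesPerSecond) {
        return 2.0 * datasetBytes / bytesPerSecond;
    }

    public static void main(String[] args) {
        // Assumed round numbers, not measurements.
        long datasetBytes = 1_000_000_000_000L;    // 1 TB
        long recordBytes = 100;                    // 100-byte records
        long records = datasetBytes / recordBytes; // 10^10 records
        double seekMillis = 10.0;                  // 10 ms per seek
        double bytesPerSecond = 100_000_000.0;     // 100 MB/s

        double rewrite = streamSeconds(datasetBytes, bytesPerSecond);
        for (long updated : new long[] {10_000L, 100_000_000L}) {
            double inPlace = seekSeconds(updated, seekMillis);
            System.out.printf("update %d of %d records: seeks %.0f s, full rewrite %.0f s%n",
                    updated, records, inPlace, rewrite);
        }
    }
}
```

Under these assumptions, touching ten thousand records in place costs about 100 seconds against 20,000 seconds for a full rewrite, while touching 1% of the records (a hundred million of them) costs about a million seconds of seeking. That is the crossover the paragraph above describes.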

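III: How Does MapReduce Work?

Hardware aside, it really does just two things:

Shape the data into a form that is convenient to aggregate
Aggregate it to get the result

1. A brief overview of the MapReduce flow:

First a rough outline of the MapReduce flow; we will break it down further afterwards.

Still a bit fuzzy? Let's look at what it is actually doing.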

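2. The complete MapReduce flow in detail:

Step 1: Suppose we have a 260 MB text file to process. The client first goes through the submit flow, which gathers the jar containing the processing code, the XML configuration files with the job's default settings, and the split information.

How the split information is obtained: getSplits() is called, which produces splits sized by the HDFS block size. These are logical splits, meaning nothing is physically cut. The method only reads metadata such as file size and path, works out how many splits are needed and which file path each one belongs to, and records the size information for each split. The resulting array is then sorted and written to the job's submit directory for YARN.
Splits are normally cut by block size, and line breaks are also taken into account so that a line straddling a split boundary is still read whole. The tail of a file is cut off as a separate split only while what remains is more than 1.1 times the split size; otherwise the remainder stays in the last split.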
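
Here is the 1.1 rule applied to the 260 MB file from step 1, as a minimal sketch in plain Java. It mirrors only the loop over the remaining bytes, assumes a 128 MB split size, and ignores block locations; `SplitSketch` and `splitLengths` are illustrative names, not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    static final double SPLIT_SLOP = 1.1; // keep cutting only while remaining/splitSize exceeds this
    static final long MB = 1024L * 1024L;

    // Returns the length in bytes of each logical split for a single file.
    static List<Long> splitLengths(long fileLength, long splitSize) {
        List<Long> lengths = new ArrayList<>();
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            lengths.add(splitSize);
            remaining -= splitSize;
        }
        if (remaining != 0) {
            lengths.add(remaining); // the tail, which may be up to 1.1x the split size
        }
        return lengths;
    }

    public static void main(String[] args) {
        for (long sizeMb : new long[] {260, 140, 141}) {
            List<Long> inMb = new ArrayList<>();
            for (long bytes : splitLengths(sizeMb * MB, 128 * MB)) {
                inMb.add(bytes / MB);
            }
            System.out.println(sizeMb + " MB file -> splits (MB): " + inMb);
        }
    }
}
```

With these inputs the 260 MB file yields two splits of 128 MB and 132 MB (so two map tasks), a 140 MB file stays as a single split, and a 141 MB file is cut into 128 MB and 13 MB.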

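Step 2: YARN's ResourceManager (RM) launches an MRAppMaster in a container on a NodeManager. The AppMaster reads the split information and works out the number of map tasks (one split produces one map task). Each map task then runs the code distributed to it by the AppMaster.

Step 3: The map task calls the InputFormat to create a RecordReader. The RecordReader turns the map task's split into (k, v) pairs and hands them to the map task's mapper.

Step 4: Once the mapper's logic has run, it calls context.write(k, v), which writes to the OutputCollector, which in turn passes the data to the circular buffer.

Step 5: The buffer will fill up sooner or later. When it reaches 80% of its capacity, it starts spilling. Before each spill, the records are partitioned in memory to decide which reduce task each one is headed for, then quick-sorted so they are in order, and then spilled to disk as a file.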
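
The partition-then-sort part of step 5 can be sketched as follows. I am assuming hash partitioning of the form `(hashCode & Integer.MAX_VALUE) % numReduceTasks`, which is how Hadoop's default hash partitioner is usually described. The sort below uses Java's library sort rather than a quicksort, and `SpillSketch`, `partition`, and `sortForSpill` are names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SpillSketch {
    // Decide which reduce task a key goes to. Masking with Integer.MAX_VALUE
    // clears the sign bit so the result is never negative.
    static int partition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Order the buffered {key, value} records by partition first, then by key,
    // which is the order they would have in a spill file.
    static List<String[]> sortForSpill(List<String[]> records, int numReduceTasks) {
        List<String[]> sorted = new ArrayList<>(records);
        sorted.sort(Comparator
                .comparingInt((String[] kv) -> partition(kv[0], numReduceTasks))
                .thenComparing(kv -> kv[0]));
        return sorted;
    }

    public static void main(String[] args) {
        List<String[]> buffer = Arrays.asList(
                new String[] {"c", "1"},
                new String[] {"a", "1"},
                new String[] {"b", "1"},
                new String[] {"a", "1"});
        for (String[] kv : sortForSpill(buffer, 2)) {
            System.out.println("partition " + partition(kv[0], 2) + ": " + kv[0] + " -> " + kv[1]);
        }
    }
}
```

Because the records are grouped by partition inside each spill file, a reduce task can later fetch just its own contiguous slice.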

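Step 6: Once all the map output has been produced, there is too much of it to sort in memory, so the spill files are merge-sorted by key on disk. The result is ordered by key, which also groups equal keys together as a side effect.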
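
The merge in step 6 is a k-way merge of runs that are already sorted. Below is a minimal in-memory sketch using a priority queue; the real thing streams from spill files on disk, which this example does not attempt, and `MergeSketch` and `mergeSortedRuns` are illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class MergeSketch {
    // Merge several individually sorted runs of keys into one sorted list.
    static List<String> mergeSortedRuns(List<List<String>> runs) {
        // Each heap entry is {run index, position in that run}, ordered by the key it points at.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
                Comparator.comparing((int[] e) -> runs.get(e[0]).get(e[1])));
        for (int i = 0; i < runs.size(); i++) {
            if (!runs.get(i).isEmpty()) {
                heap.add(new int[] {i, 0});
            }
        }
        List<String> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] entry = heap.poll();
            List<String> run = runs.get(entry[0]);
            merged.add(run.get(entry[1]));
            if (entry[1] + 1 < run.size()) {
                heap.add(new int[] {entry[0], entry[1] + 1}); // advance within the same run
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> spills = Arrays.asList(
                Arrays.asList("a", "c", "e"),
                Arrays.asList("b", "c"),
                Arrays.asList("a", "d"));
        System.out.println(mergeSortedRuns(spills)); // prints [a, a, b, c, c, d, e]
    }
}
```

Only one entry per run sits in the heap at a time, which is why this approach works even when the runs themselves are far too large for memory. Equal keys come out next to each other, which is the "grouping for free" mentioned in step 6.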

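Step 7: Each map task now has its own merged, sorted output, and each part of it is transferred to the corresponding reduce task. Because a reduce task may receive data from several map tasks, it runs another merge sort over what it receives. That yields, in key order, each key with all of its values, which is handed to the reducer, where the aggregation logic written by the developer is applied.

Step 8: The reducer calls context.write(k, v) to hand its output to the OutputFormat, which creates a RecordWriter that writes out the result.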

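3. What is YARN doing while MapReduce runs?

MapReduce reads its data from HDFS, and the computation also runs on the nodes that hold that HDFS data. We rely on YARN, the resource scheduler, so that MapReduce can get its work done on top of HDFS.
YARN was also split out of MapReduce in Hadoop 2.0, so understanding it helps in understanding MapReduce.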

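Step 1: The user submits a finished MapReduce program to YARN. The RM (ResourceManager) keeps a heartbeat with many NMs (NodeManagers): the NM on each DN (DataNode) periodically reports the current state of its node to the RM. The client then obtains ...

Step 2: The RM receives the client's job submission request, looks for an idle DN, and assigns the job to one of them as the primary DN. The primary DN starts the MRAppMaster program in a container to manage the computation.

Step 3: HDFS stores data in blocks that are scattered across nodes with no particular pattern. When tasks need blocks that live on other nodes, the MRAppMaster polls the RM over RPC, requesting resources for the corresponding tasks.

Step 4: Once the AM (ApplicationMaster) has been granted the resources, it puts each task's launch information, resources, and code into a ContainerLaunchContext, passes it to the ContainersLauncher of the corresponding NM, and tells that NM to start the task.

Step 5: While the tasks run, each one reports its state and progress to the AM over RPC, so the AM always knows how every task is doing and can restart a task if it fails. When the AM has collected the results from all nodes, it reports back to the RM.

At the reduce stage, a reduce task often needs map output that sits in other containers, so the AppMaster has to be notified and the data pulled over. This is the shuffle.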

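4. Going a step further by reading the source:

What job submission does:

After the job starts, the following methods run in this order. I have added my own comments to the important parts,
which should give a deeper sense of what it is actually doing.

waitForCompletion()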

public boolean waitForCompletion(boolean verbose
                                   ) throws IOException, InterruptedException,
                                            ClassNotFoundException {
    // A job that has just been defined is in the DEFINE state
    if (state == JobState.DEFINE) {
      // Submit the job
      submit();
    }
    if (verbose) {
      monitorAndPrintJob();
    } else {
      // get the completion poll interval from the client.
      int completionPollIntervalMillis = 
        Job.getCompletionPollInterval(cluster.getConf());
      while (!isComplete()) {
        try {
          Thread.sleep(completionPollIntervalMillis);
        } catch (InterruptedException ie) {
        }
      }
    }
    return isSuccessful();
  }

submit()

// The submit() method itself
public void submit() 
         throws IOException, InterruptedException, ClassNotFoundException {
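    // Check the job state once more
    ensureState(JobState.DEFINE);
    // Choose between the old and the new API, to stay compatible with old code
    setUseNewAPI();
    // Connect to the cluster
    connect();
    // Get a JobSubmitter and submit the job through it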
    final JobSubmitter submitter = 
        getJobSubmitter(cluster.getFileSystem(), cluster.getClient());
    status = ugi.doAs(new PrivilegedExceptionAction<JobStatus>() {
      public JobStatus run() throws IOException, InterruptedException, 
      ClassNotFoundException {
        
        return submitter.submitJobInternal(Job.this, cluster);
      }
    });
    state = JobState.RUNNING;
    LOG.info("The url to track the job: " + getTrackingURL());
   }


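// The connect() method itself
  private synchronized void connect()
          throws IOException, InterruptedException, ClassNotFoundException {
    // Create the cluster object if there is none yet
    if (cluster == null) {
      // Work out whether this is a local or a YARN cluster and create the matching one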
      cluster = 
        ugi.doAs(new PrivilegedExceptionAction<Cluster>() {
                   public Cluster run()
                          throws IOException, InterruptedException, 
                                 ClassNotFoundException {
                     return new Cluster(getConfiguration());
                   }
                 });
    }
  }

submitJobInternal()

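// The JobSubmitter submits the job internally and returns its status
JobStatus submitJobInternal(Job job, Cluster cluster)
  throws ClassNotFoundException, InterruptedException, IOException {
    // validate the job's output specs
    // (with file output, this fails if the output directory already exists)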
    checkSpecs(job);

    Configuration conf = job.getConfiguration();
    addMRFrameworkToDistributedCache(conf);

    Path jobStagingArea = JobSubmissionFiles.getStagingDir(cluster, conf);
    //configure the command line options correctly on the submitting dfs
    InetAddress ip = InetAddress.getLocalHost();
    if (ip != null) {
      submitHostAddress = ip.getHostAddress();
      submitHostName = ip.getHostName();
      conf.set(MRJobConfig.JOB_SUBMITHOST,submitHostName);
      conf.set(MRJobConfig.JOB_SUBMITHOSTADDR,submitHostAddress);
    }
    // Ask the YARN cluster for an ID for this job
    JobID jobId = submitClient.getNewJobID();
    job.setJobID(jobId);
    // With the job ID, a staging folder is prepared; the files needed to run the job are submitted under it
    Path submitJobDir = new Path(jobStagingArea, jobId.toString());
    JobStatus status = null;
    try {
      conf.set(MRJobConfig.USER_NAME,
          UserGroupInformation.getCurrentUser().getShortUserName());
      conf.set("hadoop.http.filter.initializers", 
          "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer");
      conf.set(MRJobConfig.MAPREDUCE_JOB_DIR, submitJobDir.toString());
      LOG.debug("Configuring job " + jobId + " with " + submitJobDir 
          + " as the submit dir");
      // get delegation token for the dir
      TokenCache.obtainTokensForNamenodes(job.getCredentials(),
          new Path[] { submitJobDir }, conf);
      
      populateTokenCache(conf, job.getCredentials());

      // generate a secret to authenticate shuffle transfers (a trusted shuffle token)
      if (TokenCache.getShuffleSecretKey(job.getCredentials()) == null) {
        KeyGenerator keyGen;
        try {
         
          int keyLen = CryptoUtils.isShuffleEncrypted(conf) 
              ? conf.getInt(MRJobConfig.MR_ENCRYPTED_INTERMEDIATE_DATA_KEY_SIZE_BITS, 
                  MRJobConfig.DEFAULT_MR_ENCRYPTED_INTERMEDIATE_DATA_KEY_SIZE_BITS)
              : SHUFFLE_KEY_LENGTH;
          keyGen = KeyGenerator.getInstance(SHUFFLE_KEYGEN_ALGORITHM);
          keyGen.init(keyLen);
        } catch (NoSuchAlgorithmException e) {
          throw new IOException("Error generating shuffle secret key", e);
        }
        SecretKey shuffleKey = keyGen.generateKey();
        TokenCache.setShuffleSecretKey(shuffleKey.getEncoded(),
            job.getCredentials());
      }
      copyAndConfigureFiles(job, submitJobDir); 
      Path submitJobFile = JobSubmissionFiles.getJobConfPath(submitJobDir);
      
      // Create the splits for the job
      LOG.debug("Creating splits at " + jtFs.makeQualified(submitJobDir));
      // Apply the splitting rules
      int maps = writeSplits(job, submitJobDir);
      // Set the number of map tasks to the number of splits
      conf.setInt(MRJobConfig.NUM_MAPS, maps);
      LOG.info("number of splits:" + maps);

      // write "queue admins of the queue to which job is being submitted"
      // to job file.
      String queue = conf.get(MRJobConfig.QUEUE_NAME,
          JobConf.DEFAULT_QUEUE_NAME);
      AccessControlList acl = submitClient.getQueueAdmins(queue);
      conf.set(toFullPropertyName(queue,
          QueueACL.ADMINISTER_JOBS.getAclName()), acl.getAclString());

      // removing jobtoken referrals before copying the jobconf to HDFS
      // as the tasks don't need this setting, actually they may break
      // because of it if present as the referral will point to a
      // different job.
      TokenCache.cleanUpTokenReferral(conf);

      if (conf.getBoolean(
          MRJobConfig.JOB_TOKEN_TRACKING_IDS_ENABLED,
          MRJobConfig.DEFAULT_JOB_TOKEN_TRACKING_IDS_ENABLED)) {
        // Add HDFS tracking ids
        ArrayList<String> trackingIds = new ArrayList<String>();
        for (Token<? extends TokenIdentifier> t :
            job.getCredentials().getAllTokens()) {
          trackingIds.add(t.decodeIdentifier().getTrackingId());
        }
        conf.setStrings(MRJobConfig.JOB_TOKEN_TRACKING_IDS,
            trackingIds.toArray(new String[trackingIds.size()]));
      }

      // Set reservation info if it exists
      ReservationId reservationId = job.getReservationId();
      if (reservationId != null) {
        conf.set(MRJobConfig.RESERVATION_ID, reservationId.toString());
      }

      // Write job file to submit dir    
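      // core-default, hdfs-default,
      // mapred-default, yarn-default: the four XML configurations
      // The splits, the split meta info, and the checksum files are also in this folder
      // Write the configuration file into the job's staging folder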
      writeConf(conf, submitJobFile);
      
      // 
      // Now, actually submit the job (using the submit name)
      //
      printTokens(jobId, job.getCredentials());
      status = submitClient.submitJob(
          jobId, submitJobDir.toString(), job.getCredentials());
      if (status != null) {
        return status;
      } else {
        throw new IOException("Could not launch job");
      }
    } finally {
      if (status == null) {
        LOG.info("Cleaning up the staging area " + submitJobDir);
        if (jtFs != null && submitJobDir != null)
          jtFs.delete(submitJobDir, true);

      }
    }
  }

writeSplits()

private int writeSplits(org.apache.hadoop.mapreduce.JobContext job,
      Path jobSubmitDir) throws IOException,
      InterruptedException, ClassNotFoundException {
    JobConf jConf = (JobConf)job.getConfiguration();
    int maps;
    if (jConf.getUseNewMapper()) {
      maps = writeNewSplits(job, jobSubmitDir);
    } else {
      maps = writeOldSplits(jConf, jobSubmitDir);
    }
    return maps;
  }

writeNewSplits()

 private <T extends InputSplit>
  int writeNewSplits(JobContext job, Path jobSubmitDir) throws IOException,
      InterruptedException, ClassNotFoundException {
    Configuration conf = job.getConfiguration();
    // An instance of the InputFormat
    InputFormat<?, ?> input =
      ReflectionUtils.newInstance(job.getInputFormatClass(), conf);
    // The InputFormat computes the splits
    List<InputSplit> splits = input.getSplits(job);
    T[] array = (T[]) splits.toArray(new InputSplit[splits.size()]);

    // sort the splits into order based on size, so that the biggest
    // go first
    Arrays.sort(array, new SplitComparator());
    JobSplitWriter.createSplitFiles(jobSubmitDir, conf, 
        jobSubmitDir.getFileSystem(conf), array);
    return array.length;
  }

getSplits()

 public List<InputSplit> getSplits(JobContext job) throws IOException {
    Stopwatch sw = new Stopwatch().start();
    // minSize is 1 by default
    long minSize = Math.max(getFormatMinSplitSize(), getMinSplitSize(job));
    // maxSize is Long.MAX_VALUE by default
    long maxSize = getMaxSplitSize(job);

    // generate splits
    List<InputSplit> splits = new ArrayList<InputSplit>();
    // Get the list of input files for the job
    List<FileStatus> files = listStatus(job);
    // Iterate over the files first
    for (FileStatus file: files) {
      Path path = file.getPath();
      long length = file.getLen();
      if (length != 0) {
        BlockLocation[] blkLocations;
        if (file instanceof LocatedFileStatus) {
          blkLocations = ((LocatedFileStatus) file).getBlockLocations();
        } else {
          FileSystem fs = path.getFileSystem(job.getConfiguration());
          blkLocations = fs.getFileBlockLocations(file, 0, length);
        }
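        // Check whether the file can be split; a non-splittable compressed file cannot
        if (isSplitable(job, path)) {
          // Get the file's block size (128 MB)
          long blockSize = file.getBlockSize();
          // This comes out as 128 MB almost every time.
          // To split at another size: set maxSize below 128 MB to get maxSize,
          // or set minSize above 128 MB to get minSize
          long splitSize = computeSplitSize(blockSize, minSize, maxSize);
          // Bytes of the current file that are still to be split
          long bytesRemaining = length;
          // Keep cutting only while what remains is more than 1.1 times the split size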
          while (((double) bytesRemaining)/splitSize > SPLIT_SLOP) {
            int blkIndex = getBlockIndex(blkLocations, length-bytesRemaining);
            splits.add(makeSplit(path, length-bytesRemaining, splitSize,
                        blkLocations[blkIndex].getHosts(),
                        blkLocations[blkIndex].getCachedHosts()));
            bytesRemaining -= splitSize;
          }

          if (bytesRemaining != 0) {
            int blkIndex = getBlockIndex(blkLocations, length-bytesRemaining);
            // Record this split in splits
            splits.add(makeSplit(path, length-bytesRemaining, bytesRemaining,
                       blkLocations[blkIndex].getHosts(),
                       blkLocations[blkIndex].getCachedHosts()));
          }
        } else { // not splitable
          splits.add(makeSplit(path, 0, length, blkLocations[0].getHosts(),
                      blkLocations[0].getCachedHosts()));
        }
      } else { 
        //Create empty hosts array for zero length files
        splits.add(makeSplit(path, 0, length, new String[0]));
      }
    }
    // Save the number of input files for metrics/loadgen
    job.getConfiguration().setLong(NUM_INPUT_FILES, files.size());
    sw.stop();
    if (LOG.isDebugEnabled()) {
      LOG.debug("Total # of splits generated by getSplits: " + splits.size()
          + ", TimeTaken: " + sw.elapsedMillis());
    }
    return splits;
  }
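
The comments above about steering the split size with minSize and maxSize follow from how the three values are combined. The sketch below restates that combination as a max of a min, which is how I understand `computeSplitSize` in FileInputFormat to work; check it against the Hadoop version you run. `SplitSizeSketch` is a made-up class name and the sizes are example values.

```java
public class SplitSizeSketch {
    static final long MB = 1024L * 1024L;

    // Clamp the block size between minSize and maxSize.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long blockSize = 128 * MB;

        // Defaults: minSize = 1, maxSize = Long.MAX_VALUE, so the block size wins.
        System.out.println(computeSplitSize(blockSize, 1L, Long.MAX_VALUE) / MB + " MB");

        // maxSize below the block size: the split shrinks to maxSize.
        System.out.println(computeSplitSize(blockSize, 1L, 64 * MB) / MB + " MB");

        // minSize above the block size: the split grows to minSize.
        System.out.println(computeSplitSize(blockSize, 256 * MB, Long.MAX_VALUE) / MB + " MB");
    }
}
```

So with default settings the split size equals the block size, and the only two ways to move it are lowering the maximum or raising the minimum.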

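The InputFormat turns files into splits, and each split is then turned into (k, v) pairs.

The default TextInputFormat
- Splitting: uses FileInputFormat's split method as is
- Producing (k, v) pairs: overridden by TextInputFormat itself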

getRecordReader()

public RecordReader<LongWritable, Text> getRecordReader(
                                          InputSplit genericSplit, JobConf job,
                                          Reporter reporter)
    throws IOException {
    
    reporter.setStatus(genericSplit.toString());
    // Record delimiter
    String delimiter = job.get("textinputformat.record.delimiter");
    byte[] recordDelimiterBytes = null;
    if (null != delimiter) {
      recordDelimiterBytes = delimiter.getBytes(Charsets.UTF_8);
    }
    return new LineRecordReader(job, (FileSplit) genericSplit,
        recordDelimiterBytes);
  }

LineRecordReader()

  public LineRecordReader(Configuration job, FileSplit split,
      byte[] recordDelimiter) throws IOException {
    this.maxLineLength = job.getInt(org.apache.hadoop.mapreduce.lib.input.
      LineRecordReader.MAX_LINE_LENGTH, Integer.MAX_VALUE);
    start = split.getStart();
    end = start + split.getLength();
    final Path file = split.getPath();
    compressionCodecs = new CompressionCodecFactory(job);
    codec = compressionCodecs.getCodec(file);

    // open the file and seek to the start of the split
    final FileSystem fs = file.getFileSystem(job);
    fileIn = fs.open(file);
    if (isCompressedInput()) {
      decompressor = CodecPool.getDecompressor(codec);
      if (codec instanceof SplittableCompressionCodec) {
        final SplitCompressionInputStream cIn =
          ((SplittableCompressionCodec)codec).createInputStream(
            fileIn, decompressor, start, end,
            SplittableCompressionCodec.READ_MODE.BYBLOCK);
        in = new CompressedSplitLineReader(cIn, job, recordDelimiter);
        start = cIn.getAdjustedStart();
        end = cIn.getAdjustedEnd();
        filePosition = cIn; // take pos from compressed stream
      } else {
        in = new SplitLineReader(codec.createInputStream(fileIn,
            decompressor), job, recordDelimiter);
        filePosition = fileIn;
      }
    } else {
      fileIn.seek(start);
      in = new SplitLineReader(fileIn, job, recordDelimiter);
      filePosition = fileIn;
    }
    // If this is not the first split, we always throw away first record
    // because we always (except the last split) read one extra line in
    // next() method.
    if (start != 0) {
      start += in.readLine(new Text(), 0, maxBytesToConsume(start));
    }
    this.pos = start;
  }
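
The last few lines of that constructor are what keep a line from being lost or read twice when a split boundary falls in the middle of it: every split except the first throws away its first (possibly partial) line, and every split keeps reading as long as a line starts at or before its end. The sketch below imitates that rule on an in-memory byte array with '\n' as the delimiter. It is a simplification I wrote for illustration, not the Hadoop implementation, and `LineSplitSketch` and `readSplit` are invented names.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class LineSplitSketch {
    // Returns the lines that the split [start, start + length) is responsible for.
    static List<String> readSplit(byte[] data, int start, int length) {
        int end = start + length;
        int pos = start;
        if (start != 0) {
            // Not the first split: skip up to and including the next newline,
            // because the previous split reads that line.
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++;
        }
        List<String> lines = new ArrayList<>();
        // Read every line that starts at or before the end of the split,
        // even if the line itself runs past the end.
        while (pos <= end && pos < data.length) {
            int lineStart = pos;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            lines.add(new String(data, lineStart, pos - lineStart, StandardCharsets.UTF_8));
            pos++; // step over the newline
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "aaa\nbbbb\ncc\ndddd\n".getBytes(StandardCharsets.UTF_8); // 17 bytes
        System.out.println(readSplit(data, 0, 6));  // prints [aaa, bbbb]
        System.out.println(readSplit(data, 6, 6));  // prints [cc, dddd]
        System.out.println(readSplit(data, 12, 5)); // prints []
    }
}
```

Taken together, the three splits return every line exactly once, even though the boundaries at bytes 6 and 12 do not line up with line starts.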

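In short: you implement a Map function that maps a set of key-value pairs to a new set of key-value pairs, and those are sent to a Reduce function, which merges all of the mapped values that share the same key.

IV: Some Small Details and Optimizations in MapReduce:

I will add the remaining material in later updates.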
