Hadoop JobTracker Source Code Analysis



Preface

JobTracker is an important daemon in Hadoop: it is responsible for scheduling and assigning tasks, and it is also involved in job submission from the client side. This article focuses on two things: the JobTracker-side implementation of the JobTracker/TaskTracker heartbeat mechanism, and how a job submitted by a client is processed and then handed out to TaskTrackers.

JobTracker Startup

JobTracker.java lives in the org.apache.hadoop.mapred package under src/mapred in hadoop-1.1.2. As for startup: the JobTracker runs as a standalone process, so JobTracker.java contains a main() method. Entering main():

  public static void main(String argv[]) throws IOException, InterruptedException {
    // log a startup message
    StringUtils.startupShutdownMessage(JobTracker.class, argv, LOG);
    try {
      if(argv.length == 0) {
        JobTracker tracker = startTracker(new JobConf()); // the core startup call; several key objects are instantiated inside
        tracker.offerService(); // starts the key objects: the RPC server and the HTTP server
      }
      else {
        if ("-dumpConfiguration".equals(argv[0]) && argv.length == 1) {
          dumpConfiguration(new PrintWriter(System.out));
        }
        else {
          System.out.println("usage: JobTracker [-dumpConfiguration]");
          System.exit(-1);
        }
      }
    } catch (Throwable e) {
      LOG.fatal(StringUtils.stringifyException(e));
      System.exit(-1);
    }
  }

Stepping into startTracker: after a couple of overloads that fill in default arguments, we end up in the following startTracker:

  public static JobTracker startTracker(JobConf conf, String identifier, boolean initialize)
      throws IOException, InterruptedException {
    DefaultMetricsSystem.initialize("JobTracker");
    JobTracker result = null;
    while (true) {
      try {
        result = new JobTracker(conf, identifier); // the JobTracker constructor; identifier is derived from the current date
        result.taskScheduler.setTaskTrackerManager(result); // set the scheduler's taskTrackerManager to the
                                                            // JobTracker itself; why this matters is
                                                            // explained in the code further below
        break;
      } catch (VersionMismatch e) {
        throw e;
      } catch (BindException e) {
        throw e;
      } catch (UnknownHostException e) {
        throw e;
      } catch (AccessControlException ace) {
        // in case of jobtracker not having right access
        // bail out
        throw ace;
      } catch (IOException e) {
        LOG.warn("Error starting tracker: " +
            StringUtils.stringifyException(e));
      }
      Thread.sleep(1000);
    }
    if (result != null) {
      JobEndNotifier.startNotifier(); // starts the job-end notifier: internally, a thread watching a
                                      // BlockingQueue<JobEndStatusInfo>. When a TaskTracker tells the
                                      // JobTracker over RPC that a job has finished, the job's end status
                                      // is enqueued somewhere along that path, and this watcher thread
                                      // then sends an HTTP notification to the URL configured for the job
      MBeans.register("JobTracker", "JobTrackerInfo", result);
      if (initialize == true) {
        result.setSafeModeInternal(SafeModeAction.SAFEMODE_ENTER);
        result.initializeFilesystem(); // initializes the filesystem: essentially `return FileSystem.get(conf);`.
                                       // FileSystem keeps an internal cache keyed by (uri, conf); on a hit the
                                       // cached instance is returned, otherwise one is created via reflection.
                                       // The uri comes from the fs.default.name property; if unset, file:/// is used
        result.setSafeModeInternal(SafeModeAction.SAFEMODE_LEAVE);
        result.initialize(); // initialization: mainly sets up JobHistory, configures the HttpServer's
                             // attributes and starts an HDFS monitor thread. Not the focus here, so we skip it
      }
    }
    return result;
  }

The most important call in the snippet above is the JobTracker constructor: key objects such as taskScheduler, interTrackerServer and the HttpServer are all instantiated inside it.

  Class<? extends TaskScheduler> schedulerClass
    = conf.getClass("mapred.jobtracker.taskScheduler",
        JobQueueTaskScheduler.class, TaskScheduler.class); // checks whether the conf specifies a different
                                                           // TaskScheduler; the default is JobQueueTaskScheduler,
                                                           // which is also the one this article uses as the example
  taskScheduler = (TaskScheduler) ReflectionUtils.newInstance(schedulerClass, conf); // instantiate the scheduler via reflection
  int handlerCount = conf.getInt("mapred.job.tracker.handler.count", 10); // number of Handlers, 10 by default;
                                                                          // Handlers are covered later
  this.interTrackerServer =
    RPC.getServer(this, addr.getHostName(), addr.getPort(), handlerCount,
        false, conf, secretManager); // interTrackerServer is an RPC server; internally it holds a listener,
                                     // readers, a responder and handlers, which together accept, process
                                     // and answer RPC requests
  infoServer = new HttpServer("job", infoBindAddress, tmpInfoPort,
      tmpInfoPort == 0, conf, aclsManager.getAdminsAcl()); // the HttpServer serving the JobTracker web UI
                                                           // (port 50030 by default); outside the scope of this
                                                           // article -- knowing it is an HTTP server is enough
  infoServer.setAttribute("job.tracker", this);
  infoServer.addServlet("reducegraph", "/taskgraph", TaskGraphServlet.class);
  infoServer.start();
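
As an aside, the conf.getClass + ReflectionUtils.newInstance pair seen above is Hadoop's generic pattern for pluggable implementations. Here is a minimal sketch of the same pattern, using a hypothetical Plugin interface and config key rather than the real TaskScheduler classes:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.util.ReflectionUtils;

  public class PluggableSketch {
    // Hypothetical plugin interface, standing in for TaskScheduler.
    public interface Plugin { String name(); }

    public static class DefaultPlugin implements Plugin {
      public String name() { return "default"; }
    }

    public static Plugin load(Configuration conf) {
      // Reads a class name from the config, falling back to DefaultPlugin,
      // and rejects classes that do not implement Plugin.
      Class<? extends Plugin> cls =
          conf.getClass("sketch.plugin.class", DefaultPlugin.class, Plugin.class);
      // newInstance also injects conf if the class implements Configurable.
      return ReflectionUtils.newInstance(cls, conf);
    }

    public static void main(String[] args) {
      Configuration conf = new Configuration();
      // conf.set("sketch.plugin.class", "com.example.MyPlugin"); // hypothetical swap-in
      System.out.println(load(conf).name());
    }
  }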

Key objects inside JobTracker

taskScheduler: the task scheduler. The default JobQueueTaskScheduler serves as the example here. Internally, JobQueueTaskScheduler sets up a JobQueueJobInProgressListener and an EagerTaskInitializationListener. The EagerTaskInitializationListener starts an internal init thread that watches the jobInitQueue; whenever it finds a job waiting to be initialized, it takes the job off the queue and calls initJob on its ttm (TaskTrackerManager) -- and since that ttm is in fact the JobTracker itself, this ends up invoking JobTracker.initJob. The JobQueueJobInProgressListener, on the other hand, has its jobAdded method called when the JobTracker receives a submitJob request from a client, adding the job to its jobQueue; later, when tasks are handed out during the TaskTracker/JobTracker heartbeat, jobs are taken from that jobQueue.
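
To make the init-thread idea concrete, here is a stripped-down sketch of the EagerTaskInitializationListener pattern. The names are simplified, and the real listener hands each dequeued job to a small thread pool instead of calling initJob inline, but the queue-plus-daemon-thread structure is the same:

  import java.util.concurrent.BlockingQueue;
  import java.util.concurrent.LinkedBlockingQueue;

  public class JobInitSketch {
    interface TaskTrackerManager { void initJob(String job); } // stands in for the JobTracker

    private final BlockingQueue<String> jobInitQueue = new LinkedBlockingQueue<String>();
    private final TaskTrackerManager ttm;

    JobInitSketch(TaskTrackerManager ttm) { this.ttm = ttm; }

    // Called from jobAdded() when a client submits a job.
    void jobAdded(String job) { jobInitQueue.add(job); }

    void start() {
      Thread initThread = new Thread() {
        public void run() {
          try {
            while (true) {
              // Blocks until a submitted job shows up, then hands it to the
              // manager -- which, in the real code, is the JobTracker itself.
              ttm.initJob(jobInitQueue.take());
            }
          } catch (InterruptedException ie) { /* shutting down */ }
        }
      };
      initThread.setDaemon(true);
      initThread.start();
    }
  }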

interTrackerServer: essentially an RPC server, but one that implements non-blocking, asynchronous IO on top of NIO through its listener, reader, handler and responder objects. The listener handles accept events, the readers handle read events, the handlers process the content of each RPC request, and the responder sends each call's result back to the client.

HttpServer: the server behind the JobTracker web UI; it is not analyzed in detail here.

interTrackerServer analysis

interTrackerServer is started in the offerService function via interTrackerServer.start(), which simply brings the RPC server up. To understand what interTrackerServer does, and how this RPC server accepts and processes requests, start from interTrackerServer = RPC.getServer:

  public static Server getServer(final Object instance, final String bindAddress, final int port,
                                 final int numHandlers,
                                 final boolean verbose, Configuration conf)
    throws IOException {
    return getServer(instance, bindAddress, port, numHandlers, verbose, conf, null);
  }

  /** Construct a server for a protocol implementation instance listening on a
   * port and address, with a secret manager. */
  public static Server getServer(final Object instance, final String bindAddress, final int port,
                                 final int numHandlers,
                                 final boolean verbose, Configuration conf,
                                 SecretManager<? extends TokenIdentifier> secretManager)
    throws IOException {
    return new Server(instance, conf, bindAddress, port, numHandlers, verbose, secretManager);
  }

  // This constructor belongs to RPC's inner class Server; its superclass is ipc.Server
  public Server(Object instance, Configuration conf, String bindAddress, int port,
                int numHandlers, boolean verbose,
                SecretManager<? extends TokenIdentifier> secretManager)
    throws IOException {
    super(bindAddress, port, Invocation.class, numHandlers, conf,
        classNameBase(instance.getClass().getName()), secretManager); // note the third argument, Invocation.class:
                                                                      // it is the class that carries an RPC call's
                                                                      // method name and parameters. This super call
                                                                      // instantiates the listener and the responder
    this.instance = instance;
    this.verbose = verbose;
  }

As you can see, getServer eventually calls new Server(instance, conf, bindAddress, port, numHandlers, verbose, secretManager) to build the RPC server. Note the first parameter, instance: it is the JobTracker instance itself, and RPC.Server.call will use it later.

JobTracker's offerService function then calls interTrackerServer.start(). Since RPC.Server extends ipc.Server, this is really ipc.Server.start():

  /** Starts the service. Must be called before any calls will be handled. */
  public synchronized void start() {
    responder.start(); // start the response thread
    listener.start();  // start the listening/accepting thread
    handlers = new Handler[handlerCount]; // the actual processing threads
    for (int i = 0; i < handlerCount; i++) { // handlerCount comes from the conf
      handlers[i] = new Handler(i);
      handlers[i].start();
    }
  }

The listener, readers, handlers and responder cooperate through the selector mechanism plus BlockingQueues that pass Call objects between them.
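
Before looking at each class, a toy example may help. The sketch below compresses the whole arrangement into a single IO thread (the real server splits the accepting listener from a pool of readers) plus handler threads draining a shared BlockingQueue; the port number and the byte handling are placeholders:

  import java.io.IOException;
  import java.net.InetSocketAddress;
  import java.nio.ByteBuffer;
  import java.nio.channels.SelectionKey;
  import java.nio.channels.Selector;
  import java.nio.channels.ServerSocketChannel;
  import java.nio.channels.SocketChannel;
  import java.util.Iterator;
  import java.util.concurrent.BlockingQueue;
  import java.util.concurrent.LinkedBlockingQueue;

  public class MiniNioServer {
    static final BlockingQueue<byte[]> callQueue = new LinkedBlockingQueue<byte[]>();

    public static void main(String[] args) throws IOException {
      // Handler threads: block on the queue and process "calls" off the IO path.
      for (int i = 0; i < 2; i++) {
        Thread h = new Thread() {
          public void run() {
            try {
              while (true) {
                byte[] call = callQueue.take();
                System.out.println("handled " + call.length + " bytes");
              }
            } catch (InterruptedException e) { }
          }
        };
        h.setDaemon(true);
        h.start();
      }

      Selector selector = Selector.open();
      ServerSocketChannel acceptChannel = ServerSocketChannel.open();
      acceptChannel.configureBlocking(false);
      acceptChannel.socket().bind(new InetSocketAddress(9000)); // placeholder port
      acceptChannel.register(selector, SelectionKey.OP_ACCEPT);

      ByteBuffer buf = ByteBuffer.allocate(4096);
      while (true) {
        selector.select();
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
          SelectionKey key = it.next();
          it.remove();
          if (key.isAcceptable()) {                 // the doAccept part
            SocketChannel ch = acceptChannel.accept();
            ch.configureBlocking(false);
            ch.register(selector, SelectionKey.OP_READ);
          } else if (key.isReadable()) {            // the doRead part
            SocketChannel ch = (SocketChannel) key.channel();
            buf.clear();
            int n = ch.read(buf);
            if (n < 0) { key.cancel(); ch.close(); continue; }
            byte[] call = new byte[n];
            buf.flip();
            buf.get(call);
            callQueue.add(call);                    // hand off to the handlers
          }
        }
      }
    }
  }
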
Listener:

  public Listener() throws IOException {
    address = new InetSocketAddress(bindAddress, port);
    // Create a new server socket and set to non blocking mode
    acceptChannel = ServerSocketChannel.open(); // open the ServerSocketChannel
    acceptChannel.configureBlocking(false); // non-blocking mode, so connect/read/write will not block
    // Bind the server socket to the local host and port
    bind(acceptChannel.socket(), address, backlogLength); // bind to the address
    port = acceptChannel.socket().getLocalPort(); // Could be an ephemeral port
    // create a selector;
    selector = Selector.open(); // initialize the selector
    readers = new Reader[readThreads]; // the readers handle OP_READ events
    readPool = Executors.newFixedThreadPool(readThreads); // reads are handled by a thread pool
    for (int i = 0; i < readThreads; i++) {
      Selector readSelector = Selector.open();
      Reader reader = new Reader(readSelector);
      readers[i] = reader;
      readPool.execute(reader);
    }
    // Register accepts on the server socket with the selector.
    acceptChannel.register(selector, SelectionKey.OP_ACCEPT); // register interest in ACCEPT events
    this.setName("IPC Server listener on " + port);
    this.setDaemon(true); // daemon thread: exits when the main thread exits
  }

  // The listener thread handles accept events; doAccept is its handler for them
  void doAccept(SelectionKey key) throws IOException, OutOfMemoryError {
    Connection c = null;
    ServerSocketChannel server = (ServerSocketChannel) key.channel();
    SocketChannel channel;
    while ((channel = server.accept()) != null) {
      channel.configureBlocking(false);
      channel.socket().setTcpNoDelay(tcpNoDelay);
      Reader reader = getReader(); // pick the next reader
      try {
        reader.startAdd(); // parks the reader; once the channel's interest set and attachment are
                           // configured, finishAdd() below wakes it up so it can get on with reading
        SelectionKey readKey = reader.registerChannel(channel); // register OP_READ for this channel
        c = new Connection(readKey, channel, System.currentTimeMillis()); // create the Connection object
        readKey.attach(c); // attach the connection to the readKey
        synchronized (connectionList) {
          connectionList.add(numConnections, c);
          numConnections++;
        }
        if (LOG.isDebugEnabled())
          LOG.debug("Server connection from " + c.toString() +
              "; # active connections: " + numConnections +
              "; # queued calls: " + callQueue.size());
      } finally {
        reader.finishAdd(); // wake the reader and let it return to its event loop
      }
    }
  }

Reader:

  Reader(Selector readSelector) {
    this.readSelector = readSelector;
  }

  public void run() {
    LOG.info("Starting SocketReader");
    synchronized (this) {
      while (running) { // loop forever
        SelectionKey key = null;
        try {
          readSelector.select(); // blocks until readSelector.wakeup() is called or a channel
                                 // has an event of interest
          while (adding) { // the listener's startAdd() sets adding to true
            this.wait(1000);
          }
          Iterator<SelectionKey> iter = readSelector.selectedKeys().iterator();
          while (iter.hasNext()) {
            key = iter.next();
            iter.remove();
            if (key.isValid()) {
              if (key.isReadable()) {
                doRead(key); // reads the RPC header, performs checks and validation, and
                             // deserializes the ConnectionHeader; then puts a Call(id, param,
                             // connection) into callQueue, where param is the deserialized
                             // Invocation object. The handler threads watch callQueue, take
                             // calls off it, and invoke them
              }
            }
            key = null;
          }
        } catch (InterruptedException e) {
          if (running) { // unexpected -- log it
            LOG.info(getName() + " caught: " +
                StringUtils.stringifyException(e));
          }
        } catch (IOException ex) {
          LOG.error("Error in Reader", ex);
        }
      }
    }
  }
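
One subtlety in run() deserves a note: SelectableChannel.register can block while another thread is sitting in select() on the same selector, which is why the listener first wakes the selector and parks the reader before registering. A minimal sketch of that handshake (simplified: the real code spreads it across startAdd/registerChannel/finishAdd):

  import java.io.IOException;
  import java.nio.channels.SelectionKey;
  import java.nio.channels.Selector;
  import java.nio.channels.SocketChannel;

  public class ReaderSketch implements Runnable {
    private final Selector sel;
    private volatile boolean adding = false;

    ReaderSketch(Selector sel) { this.sel = sel; }

    // Called by the listener thread to hand a new channel to this reader.
    public synchronized SelectionKey addChannel(SocketChannel ch) throws IOException {
      adding = true;
      sel.wakeup();                    // kick the reader out of select()
      try {
        return ch.register(sel, SelectionKey.OP_READ); // safe: the reader is parked
      } finally {
        adding = false;
        notify();                      // let the reader resume select()ing
      }
    }

    public void run() {
      while (true) {
        try {
          sel.select();
          synchronized (this) {
            while (adding) wait(1000); // parked while addChannel() registers
          }
          // ... iterate selectedKeys() and doRead(key), as in the real Reader
        } catch (IOException e) {
          break;
        } catch (InterruptedException e) {
          break;
        }
      }
    }
  }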

Handler:

  @Override
  public void run() {
    LOG.info(getName() + ": starting");
    SERVER.set(Server.this);
    ByteArrayOutputStream buf =
        new ByteArrayOutputStream(INITIAL_RESP_BUF_SIZE);
    while (running) {
      try {
        final Call call = callQueue.take(); // pop the queue; maybe blocked here
                                            // (takes the next Call object off the queue)
        if (LOG.isDebugEnabled())
          LOG.debug(getName() + ": has #" + call.id + " from " +
              call.connection);
        String errorClass = null;
        String error = null;
        Writable value = null;
        CurCall.set(call);
        try {
          // Make the call as the user via Subject.doAs, thus associating
          // the call with the Subject
          if (call.connection.user == null) {
            value = call(call.connection.protocol, call.param,
                call.timestamp); // call.connection.protocol comes from the ConnectionHeader; for
                                 // TaskTracker-to-JobTracker calls it is InterTrackerProtocol, and
                                 // for a client submitting a job it is JobSubmissionProtocol
          } else {
            value =
                call.connection.user.doAs
                    (new PrivilegedExceptionAction<Writable>() {
                      @Override
                      public Writable run() throws Exception {
                        // make the call
                        return call(call.connection.protocol,
                            call.param, call.timestamp);
                      }
                    }
                    );
          }
        } catch (Throwable e) {
          LOG.info(getName()+", call "+call+": error: " + e, e);
          errorClass = e.getClass().getName();
          error = StringUtils.stringifyException(e);
        }
        CurCall.set(null);
        synchronized (call.connection.responseQueue) {
          // setupResponse() needs to be sync'ed together with
          // responder.doResponse() since setupResponse may use
          // SASL to encrypt response data and SASL enforces
          // its own message ordering.
          setupResponse(buf, call,
              (error == null) ? Status.SUCCESS : Status.ERROR,
              value, errorClass, error); // serializes value + call.id + status into a ByteBuffer
                                         // and sets it as the call's response object
          // Discard the large buf and reset it back to
          // smaller size to freeup heap
          if (buf.size() > maxRespSize) {
            LOG.warn("Large response size " + buf.size() + " for call " +
                call.toString());
            buf = new ByteArrayOutputStream(INITIAL_RESP_BUF_SIZE);
          }
          responder.doRespond(call); // hand the result back to the client
        }
      } catch (InterruptedException e) {
        if (running) { // unexpected -- log it
          LOG.info(getName() + " caught: " +
              StringUtils.stringifyException(e));
        }
      } catch (Exception e) {
        LOG.info(getName() + " caught: " +
            StringUtils.stringifyException(e));
      }
    }
    LOG.info(getName() + ": exiting");
  }

  // the call() method invoked above
  public Writable call(Class<?> protocol, Writable param, long receivedTime)
      throws IOException {
    try {
      Invocation call = (Invocation)param; // downcast the Invocation object deserialized from the RPC stream
      if (verbose) log("Call: " + call);
      Method method =
          protocol.getMethod(call.getMethodName(),
              call.getParameterClasses()); // look up the Method object via reflection
      method.setAccessible(true); // make it accessible; for background on setAccessible, see my
                                  // other post on the Java security manager
      long startTime = System.currentTimeMillis();
      Object value = method.invoke(instance, call.getParameters()); // invoke the JobTracker's RPC method
                                                                    // through Java reflection
      int processingTime = (int) (System.currentTimeMillis() - startTime);
      int qTime = (int) (startTime-receivedTime);
      if (LOG.isDebugEnabled()) {
        LOG.debug("Served: " + call.getMethodName() +
            " queueTime= " + qTime +
            " procesingTime= " + processingTime);
      }
      rpcMetrics.addRpcQueueTime(qTime);
      rpcMetrics.addRpcProcessingTime(processingTime);
      rpcMetrics.addRpcProcessingTime(call.getMethodName(), processingTime);
      if (verbose) log("Return: "+value);
      return new ObjectWritable(method.getReturnType(), value); // the result handed back after the invocation
    } catch (InvocationTargetException e) {
      Throwable target = e.getTargetException();
      if (target instanceof IOException) {
        throw (IOException)target;
      } else {
        IOException ioe = new IOException(target.toString());
        ioe.setStackTrace(target.getStackTrace());
        throw ioe;
      }
    } catch (Throwable e) {
      if (!(e instanceof IOException)) {
        LOG.error("Unexpected throwable object ", e);
      }
      IOException ioe = new IOException(e.toString());
      ioe.setStackTrace(e.getStackTrace());
      throw ioe;
    }
  }
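
Stripped of metrics and serialization, the dispatch at the heart of call() is plain Java reflection on the instance handed to getServer. A self-contained sketch, using a hypothetical Greeter protocol in place of a real Hadoop one:

  import java.lang.reflect.InvocationTargetException;
  import java.lang.reflect.Method;

  public class ReflectiveDispatch {
    interface Greeter { String greet(String who); }   // stands in for a protocol interface
    static class GreeterImpl implements Greeter {     // stands in for the JobTracker instance
      public String greet(String who) { return "hello " + who; }
    }

    // The deserialized Invocation supplies methodName/paramTypes/params in the real server.
    static Object call(Class<?> protocol, Object instance,
                       String methodName, Class<?>[] paramTypes, Object[] params)
        throws Exception {
      Method m = protocol.getMethod(methodName, paramTypes);
      m.setAccessible(true);
      try {
        return m.invoke(instance, params);
      } catch (InvocationTargetException e) {
        Throwable t = e.getTargetException();
        if (t instanceof Exception) throw (Exception) t; // unwrap, as the real code does
        throw e;
      }
    }

    public static void main(String[] args) throws Exception {
      Object result = call(Greeter.class, new GreeterImpl(), "greet",
          new Class<?>[] { String.class }, new Object[] { "tasktracker" });
      System.out.println(result); // -> hello tasktracker
    }
  }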

Responder:
Handler.run, shown earlier, calls responder.doRespond(call). Internally it looks like this; note the synchronized block:

  void doRespond(Call call) throws IOException {
    synchronized (call.connection.responseQueue) { // synchronized because multiple handlers may respond on
                                                   // the same connection: the call is appended to the
                                                   // connection's responseQueue and processResponse is
                                                   // invoked. The queue exists for partial writes: if a
                                                   // response cannot be written to the channel in one go, it
                                                   // stays queued, and the responder thread later takes it
                                                   // from the queue and finishes writing it asynchronously
      call.connection.responseQueue.addLast(call);
      if (call.connection.responseQueue.size() == 1) {
        processResponse(call.connection.responseQueue, true);
      }
    }
  }

  // The responder's run() mostly watches channels for writability and calls processResponse,
  // which simply writes call.response out to the corresponding channel. If a write does not
  // complete in one pass, the responder is notified to finish it asynchronously; oversized
  // responses go out as multiple chunks.
  @Override
  public void run() {
    LOG.info(getName() + ": starting");
    SERVER.set(Server.this);
    long lastPurgeTime = 0; // last check for old calls. A purge interval is used: on each cycle,
                            // if an async write has been pending too long, or a channel has not
                            // turned writable for a long time, those connections are closed
    while (running) {
      try {
        waitPending(); // If a channel is being registered, wait.
        writeSelector.select(PURGE_INTERVAL); // wait for writable channels, with PURGE_INTERVAL as the timeout
        Iterator<SelectionKey> iter = writeSelector.selectedKeys().iterator();
        while (iter.hasNext()) {
          SelectionKey key = iter.next();
          iter.remove();
          try {
            if (key.isValid() && key.isWritable()) {
              doAsyncWrite(key); // the async write path; internally just calls processResponse
            }
          } catch (IOException e) {
            LOG.info(getName() + ": doAsyncWrite threw exception " + e);
          }
        }
        long now = System.currentTimeMillis();
        if (now < lastPurgeTime + PURGE_INTERVAL) {
          continue;
        }
        lastPurgeTime = now;
        //
        // If there were some calls that have not been sent out for a
        // long time, discard them.
        //
        LOG.debug("Checking for old call responses.");
        ArrayList<Call> calls;
        // get the list of channels from list of keys.
        synchronized (writeSelector.keys()) {
          calls = new ArrayList<Call>(writeSelector.keys().size());
          iter = writeSelector.keys().iterator();
          while (iter.hasNext()) {
            SelectionKey key = iter.next();
            Call call = (Call)key.attachment();
            if (call != null && key.channel() == call.connection.channel) {
              calls.add(call);
            }
          }
        }
        for (Call call : calls) {
          try {
            doPurge(call, now); // closes timed-out connections, including the channel inside each connection
          } catch (IOException e) {
            LOG.warn("Error in purging old calls " + e);
          }
        }
      } catch (OutOfMemoryError e) {
        //
        // we can run out of memory if we have too many threads
        // log the event and sleep for a minute and give
        // some thread(s) a chance to finish
        //
        LOG.warn("Out of Memory in server select", e);
        try { Thread.sleep(60000); } catch (Exception ie) {}
      } catch (Exception e) {
        LOG.warn("Exception in Responder " +
            StringUtils.stringifyException(e));
      }
    }
    LOG.info("Stopping " + this.getName());
  }
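
The per-connection responseQueue logic can be boiled down to a few lines. The sketch below (a hypothetical class with no selector wiring) shows the key invariant: a handler writes inline only when its call is first in line, and whatever a non-blocking write leaves behind waits for the responder's OP_WRITE path:

  import java.io.IOException;
  import java.nio.ByteBuffer;
  import java.nio.channels.SocketChannel;
  import java.util.ArrayDeque;
  import java.util.Deque;

  public class ResponseQueueSketch {
    private final Deque<ByteBuffer> responseQueue = new ArrayDeque<ByteBuffer>();

    // Called by a handler thread with a fully serialized response.
    synchronized boolean doRespond(SocketChannel ch, ByteBuffer response)
        throws IOException {
      responseQueue.addLast(response);
      if (responseQueue.size() == 1) { // nobody else is mid-write: try inline
        return processResponse(ch);
      }
      return false;                    // left for the async writer
    }

    // Also called from the responder thread when OP_WRITE fires.
    synchronized boolean processResponse(SocketChannel ch) throws IOException {
      while (!responseQueue.isEmpty()) {
        ByteBuffer buf = responseQueue.peekFirst();
        ch.write(buf);                 // non-blocking: may write only part of buf
        if (buf.hasRemaining()) {
          return false;                // socket buffer full; the responder retries later
        }
        responseQueue.removeFirst();   // this response is fully sent
      }
      return true;                     // queue drained
    }
  }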

TaskScheduler analysis

Let's look directly at JobQueueTaskScheduler:

  public JobQueueTaskScheduler() {
    this.jobQueueJobInProgressListener = new JobQueueJobInProgressListener(); // the listener for job-submission
                                                                              // events: it maintains a jobQueue,
                                                                              // and jobs are added to it when a
                                                                              // client submits one. It also has
                                                                              // jobUpdated/jobRemoved methods
  }

  @Override
  public synchronized void start() throws IOException {
    super.start();
    taskTrackerManager.addJobInProgressListener(jobQueueJobInProgressListener);
    eagerTaskInitializationListener.setTaskTrackerManager(taskTrackerManager); // the taskTrackerManager is the
                                                                               // JobTracker itself
    eagerTaskInitializationListener.start(); // eagerTaskInitializationListener initializes jobs; it eventually
                                             // calls JobTracker.initJob (code not shown here)
    taskTrackerManager.addJobInProgressListener(
        eagerTaskInitializationListener);
  }

  // JobTracker.initJob: essentially a direct call to job.initTasks, plus job status updates
  public void initJob(JobInProgress job) {
    if (null == job) {
      LOG.info("Init on null job is not valid");
      return;
    }
    try {
      JobStatus prevStatus = (JobStatus)job.getStatus().clone();
      LOG.info("Initializing " + job.getJobID());
      job.initTasks(); // the core initialization routine: it reads the split meta info and, based on
                       // the map and reduce counts, builds structures such as nonRunningMapCache and
                       // nonRunningReduces
      // Inform the listeners if the job state has changed
      // Note : that the job will be in PREP state.
      JobStatus newStatus = (JobStatus)job.getStatus().clone();
      if (prevStatus.getRunState() != newStatus.getRunState()) {
        JobStatusChangeEvent event =
            new JobStatusChangeEvent(job, EventType.RUN_STATE_CHANGED, prevStatus,
                newStatus);
        synchronized (JobTracker.this) {
          updateJobInProgressListeners(event);
        }
      }
    } catch (KillInterruptedException kie) {
      // If job was killed during initialization, job state will be KILLED
      LOG.error("Job initialization interrupted:\n" +
          StringUtils.stringifyException(kie));
      killJob(job);
    } catch (Throwable t) {
      String failureInfo =
          "Job initialization failed:\n" + StringUtils.stringifyException(t);
      // If the job initialization is failed, job state will be FAILED
      LOG.error(failureInfo);
      job.getStatus().setFailureInfo(failureInfo);
      failJob(job);
    }
  }

job.initTasks is the core initialization routine. It mainly instantiates nonRunningMapCache (a Map) and nonRunningReduces (a Set). Later, when tasks are assigned during the TaskTracker/JobTracker heartbeat and the JobTracker calls scheduler.assignTasks, these structures are consulted internally: a Task is generated from a TaskInProgress and a task id, wrapped into a TaskAction, and sent back to the TaskTracker.
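
For orientation, the shape of the assignment loop is roughly the sketch below. It is heavily reduced: the real JobQueueTaskScheduler.assignTasks also computes a cluster-wide load factor to spread tasks, handles map and reduce slots separately, and walks locality levels (node-local, rack-local, off-switch):

  import java.util.ArrayList;
  import java.util.List;

  public class AssignSketch {
    interface Job { String obtainNewMapTask(String tracker); } // returns null if it has nothing to run

    static List<String> assignTasks(String tracker, int freeMapSlots, List<Job> jobQueue) {
      List<String> assigned = new ArrayList<String>();
      for (int slot = 0; slot < freeMapSlots; slot++) {
        String task = null;
        for (Job job : jobQueue) {     // FIFO scan over the listener's jobQueue
          task = job.obtainNewMapTask(tracker);
          if (task != null) break;     // first job with a runnable TIP wins the slot
        }
        if (task == null) break;       // nothing left for this tracker this heartbeat
        assigned.add(task);            // the heartbeat wraps these in LaunchTaskActions
      }
      return assigned;
    }
  }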

  public synchronized void initTasks()
      throws IOException, KillInterruptedException, UnknownHostException {
    if (tasksInited || isComplete()) {
      return;
    }
    synchronized(jobInitKillStatus){
      if(jobInitKillStatus.killed || jobInitKillStatus.initStarted) {
        return;
      }
      jobInitKillStatus.initStarted = true;
    }
    LOG.info("Initializing " + jobId);
    final long startTimeFinal = this.startTime;
    // log job info as the user running the job
    try {
      userUGI.doAs(new PrivilegedExceptionAction<Object>() {
        @Override
        public Object run() throws Exception {
          JobHistory.JobInfo.logSubmitted(getJobID(), conf, jobFile,
              startTimeFinal, hasRestarted());
          return null;
        }
      });
    } catch(InterruptedException ie) {
      throw new IOException(ie);
    }
    // log the job priority
    setPriority(this.priority);
    //
    // generate security keys needed by Tasks
    //
    generateAndStoreTokens();
    //
    // read input splits and create a map per a split
    //
    TaskSplitMetaInfo[] splits = createSplits(jobId);
    if (numMapTasks != splits.length) {
      throw new IOException("Number of maps in JobConf doesn't match number of " +
          "recieved splits for job " + jobId + "! " +
          "numMapTasks=" + numMapTasks + ", #splits=" + splits.length);
    }
    numMapTasks = splits.length;
    // Sanity check the locations so we don't create/initialize unnecessary tasks
    for (TaskSplitMetaInfo split : splits) {
      NetUtils.verifyHostnames(split.getLocations());
    }
    jobtracker.getInstrumentation().addWaitingMaps(getJobID(), numMapTasks);
    jobtracker.getInstrumentation().addWaitingReduces(getJobID(), numReduceTasks);
    this.queueMetrics.addWaitingMaps(getJobID(), numMapTasks);
    this.queueMetrics.addWaitingReduces(getJobID(), numReduceTasks);
    maps = new TaskInProgress[numMapTasks];
    for(int i=0; i < numMapTasks; ++i) {
      inputLength += splits[i].getInputDataLength();
      maps[i] = new TaskInProgress(jobId, jobFile,
          splits[i],
          jobtracker, conf, this, i, numSlotsPerMap);
    }
    LOG.info("Input size for job " + jobId + " = " + inputLength
        + ". Number of splits = " + splits.length);
    // Set localityWaitFactor before creating cache
    localityWaitFactor =
        conf.getFloat(LOCALITY_WAIT_FACTOR, DEFAULT_LOCALITY_WAIT_FACTOR);
    if (numMapTasks > 0) {
      nonRunningMapCache = createCache(splits, maxLevel);
    }
    // set the launch time
    this.launchTime = jobtracker.getClock().getTime();
    //
    // Create reduce tasks
    //
    this.reduces = new TaskInProgress[numReduceTasks];
    for (int i = 0; i < numReduceTasks; i++) {
      reduces[i] = new TaskInProgress(jobId, jobFile,
          numMapTasks, i,
          jobtracker, conf, this, numSlotsPerReduce);
      nonRunningReduces.add(reduces[i]);
    }
    // Calculate the minimum number of maps to be complete before
    // we should start scheduling reduces
    completedMapsForReduceSlowstart =
        (int)Math.ceil(
            (conf.getFloat("mapred.reduce.slowstart.completed.maps",
                DEFAULT_COMPLETED_MAPS_PERCENT_FOR_REDUCE_SLOWSTART) *
                numMapTasks));
    // ... use the same for estimating the total output of all maps
    resourceEstimator.setThreshhold(completedMapsForReduceSlowstart);
    // create cleanup two cleanup tips, one map and one reduce.
    cleanup = new TaskInProgress[2];
    // cleanup map tip. This map doesn't use any splits. Just assign an empty
    // split.
    TaskSplitMetaInfo emptySplit = JobSplit.EMPTY_TASK_SPLIT;
    cleanup[0] = new TaskInProgress(jobId, jobFile, emptySplit,
        jobtracker, conf, this, numMapTasks, 1);
    cleanup[0].setJobCleanupTask();
    // cleanup reduce tip.
    cleanup[1] = new TaskInProgress(jobId, jobFile, numMapTasks,
        numReduceTasks, jobtracker, conf, this, 1);
    cleanup[1].setJobCleanupTask();
    // create two setup tips, one map and one reduce.
    setup = new TaskInProgress[2];
    // setup map tip. This map doesn't use any split. Just assign an empty
    // split.
    setup[0] = new TaskInProgress(jobId, jobFile, emptySplit,
        jobtracker, conf, this, numMapTasks + 1, 1);
    setup[0].setJobSetupTask();
    // setup reduce tip.
    setup[1] = new TaskInProgress(jobId, jobFile, numMapTasks,
        numReduceTasks + 1, jobtracker, conf, this, 1);
    setup[1].setJobSetupTask();
    synchronized(jobInitKillStatus){
      jobInitKillStatus.initDone = true;
      // set this before the throw to make sure cleanup works properly
      tasksInited = true;
      if(jobInitKillStatus.killed) {
        throw new KillInterruptedException("Job " + jobId + " killed in init");
      }
    }
    JobHistory.JobInfo.logInited(profile.getJobID(), this.launchTime,
        numMapTasks, numReduceTasks);
    // Log the number of map and reduce tasks
    LOG.info("Job " + jobId + " initialized successfully with " + numMapTasks
        + " map tasks and " + numReduceTasks + " reduce tasks.");
  }
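
As a quick worked example of the slowstart threshold computed above (assuming the 1.x default of 0.05 for mapred.reduce.slowstart.completed.maps): with 200 map tasks, reduces become schedulable once ceil(0.05 * 200) = 10 maps have completed.

  public class SlowstartExample {
    public static void main(String[] args) {
      int numMapTasks = 200;
      float slowstart = 0.05f; // DEFAULT_COMPLETED_MAPS_PERCENT_FOR_REDUCE_SLOWSTART
      System.out.println((int) Math.ceil(slowstart * numMapTasks)); // -> 10
    }
  }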

Summary

In my view, the better way to read the Hadoop source is to first understand its architecture and communication mechanisms and keep a grip on the big picture, then pay close attention to the details in the places that matter: job submission, the exact flow of task assignment, the asynchronous machinery in the code, and so on. Every layer of wrapping and every non-essential class does not need a careful read, unless you actually plan to do secondary development on top of Hadoop.
