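The Hadoop 2.0 source tree ships two applications built on YARN: MapReduce, and DistributedShell, a sample program that shows how to write a YARN application. You can think of it as YARN's wordcount example.
As its name suggests, DistributedShell runs shell commands in a distributed fashion: a string of shell commands or a shell script submitted by the user is distributed, under the control of the ApplicationMaster, to different containers for execution.
The DistributedShell source code is in "hadoop-yarn-project\hadoop-yarn\hadoop-yarn-applications\hadoop-yarn-applications-distributedshell".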
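It covers the three interactions required to implement an application:
Client and RM (Client.java)
The client submits the application
AM and RM (ApplicationMaster.java)
Register the AM and request containers
AM and NM (ApplicationMaster.java)
Launch the containers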
Run the command:
hadoop jar hadoop-yarn-applications-distributedshell-2.0.5-alpha.jar org.apache.hadoop.yarn.applications.distributedshell.Client -jar hadoop-yarn-applications-distributedshell-2.0.5-alpha.jar -shell_command '/bin/date' -num_containers 10
This starts 10 containers, each of which runs the `date` command.
Execution flow:
1. The client submits the application to the RM through org.apache.hadoop.yarn.applications.distributedshell.Client; it must supply an ApplicationSubmissionContext
2. org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster requests containers and runs the user's command in them, passed as ContainerLaunchContext.commands
Client (Client.java):
1. YarnClient.getNewApplication
2. Fill in the ApplicationSubmissionContext and the ContainerLaunchContext (for the container that launches the AM)
3. YarnClient.submitApplication
4. Call YarnClient.getApplicationReport at regular intervals to get the application status
// Create the launch context for the AM container
ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
// Set the local resources: AppMaster.jar and log4j.properties
amContainer.setLocalResources(localResources);
// Environment variables: HDFS location of the shell script, CLASSPATH
amContainer.setEnvironment(env);
// Build the command and arguments that launch the AM
Vector<CharSequence> vargs = new Vector<CharSequence>(30);
vargs.add("${JAVA_HOME}" + "/bin/java");
vargs.add("-Xmx" + amMemory + "m");
// AM main class
vargs.add("org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster");
vargs.add("--container_memory " + String.valueOf(containerMemory));
vargs.add("--num_containers " + String.valueOf(numContainers));
vargs.add("--priority " + String.valueOf(shellCmdPriority));
if (!shellCommand.isEmpty()) {
vargs.add("--shell_command " + shellCommand + "");
}
if (!shellArgs.isEmpty()) {
vargs.add("--shell_args " + shellArgs + "");
}
for (Map.Entry<String, String> entry : shellEnv.entrySet()) {
vargs.add("--shell_env " + entry.getKey() + "=" + entry.getValue());
}
vargs.add("1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/AppMaster.stdout");
vargs.add("2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/AppMaster.stderr");
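// Join the arguments into a single command line for the AM container
StringBuilder command = new StringBuilder();
for (CharSequence str : vargs) {
command.append(str).append(" ");
}
List<String> commands = new ArrayList<String>();
commands.add(command.toString());
amContainer.setCommands(commands);
// Set the resource requirement; only memory is set for now
Resource capability = Records.newRecord(Resource.class);
capability.setMemory(amMemory);
amContainer.setResource(capability);
appContext.setAMContainerSpec(amContainer);
// Submit the application to the RM
super.submitApplication(appContext);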
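The excerpt above stops at submission. Step 4 of the client flow, polling for the application status, can be sketched roughly as follows. This is a minimal stand-alone illustration and not the code in Client.java: the `AppState` enum, the `Supplier` that stands in for YarnClient.getApplicationReport, and the `maxPolls` limit are all assumptions made so that the example runs without a cluster.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

final class StatusPollSketch {
    // Hypothetical, simplified set of application states (not the YARN enum).
    enum AppState { ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    // Polls reportSource until a terminal state is seen or maxPolls is used up.
    // Returns the last state observed (null if maxPolls is 0).
    static AppState waitForCompletion(Supplier<AppState> reportSource,
                                      long intervalMillis, int maxPolls) {
        AppState last = null;
        for (int i = 0; i < maxPolls; i++) {
            last = reportSource.get(); // stand-in for one getApplicationReport call
            if (last == AppState.FINISHED || last == AppState.FAILED
                    || last == AppState.KILLED) {
                return last;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set
                return last;
            }
        }
        return last; // not terminal yet; the caller decides what to do next
    }

    public static void main(String[] args) {
        // Canned sequence of reports in place of a real ResourceManager.
        Iterator<AppState> canned = Arrays.asList(
                AppState.ACCEPTED, AppState.RUNNING, AppState.FINISHED).iterator();
        System.out.println(waitForCompletion(canned::next, 10L, 10));
    }
}
```

Bounding the number of polls is a design choice of this sketch; it keeps a stuck application from blocking the client forever.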
ApplicationMaster (ApplicationMaster.java):
1. AMRMClient.registerApplicationMaster
2. Pass a ContainerRequest to AMRMClient.addContainerRequest
3. Obtain containers through AMRMClient.allocate
4. Hand each container to a newly created LaunchContainerRunnable thread
5. Create a ContainerLaunchContext and set the container launch information: localResource, shellCommand, shellArgs, etc.
6. ContainerManager.startContainer(startReq)
7. Read the completed containers from the response to the next RPC call: AMResponse.getCompletedContainersStatuses
8. AMRMClient.unregisterApplicationMaster
// Create the AMRMClient. 2.1-beta adds an asynchronous AMRMClient; this version still uses the synchronous one
resourceManager = new AMRMClientImpl(appAttemptID);
resourceManager.init(conf);
resourceManager.start();
// Register with the RM
RegisterApplicationMasterResponse response = resourceManager
.registerApplicationMaster(appMasterHostname, appMasterRpcPort,
appMasterTrackingUrl);
while (numCompletedContainers.get() < numTotalContainers && !appDone) {
// Build the container request and set its resource requirement; only memory is set here
ContainerRequest containerAsk = setupContainerAskForRM(askCount);
resourceManager.addContainerRequest(containerAsk);
// Send the request to RM
LOG.info("Asking RM for containers" + ", askCount=" + askCount);
AMResponse amResp = sendContainerAskToRM();
// Retrieve list of allocated containers from the response
List<Container> allocatedContainers = amResp.getAllocatedContainers();
for (Container allocatedContainer : allocatedContainers) {
// Start a new thread to submit the container launch request so the main thread is not blocked
LaunchContainerRunnable runnableLaunchContainer = new LaunchContainerRunnable(
allocatedContainer);
Thread launchThread = new Thread(runnableLaunchContainer);
launchThreads.add(launchThread);
launchThread.start();
}
List<ContainerStatus> completedContainers = amResp.getCompletedContainersStatuses();
}
// Unregister from the RM
resourceManager.unregisterApplicationMaster(appStatus, appMessage, null);
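To make the shape of this loop easier to follow, here is a small simulation that runs without Hadoop. It is only a sketch under stated assumptions: `FakeRm` is a hypothetical in-memory stand-in for the RM that grants at most `perCall` containers per allocate call and reports the containers granted in the previous call as completed. Container launching is left out entirely; the point is that the AM asks once and then keeps calling allocate until every container has been reported complete.

```java
final class AmLoopSketch {
    // Hypothetical stand-in for the ResourceManager (not a YARN class).
    static final class FakeRm {
        private final int perCall; // max containers granted per allocate call
        private int pendingAsk;    // requested but not yet granted
        private int running;       // granted in the previous allocate call

        FakeRm(int perCall) {
            if (perCall <= 0) {
                throw new IllegalArgumentException("perCall must be positive");
            }
            this.perCall = perCall;
        }

        void addContainerRequest(int count) {
            pendingAsk += count;
        }

        // Returns {allocatedCount, completedCount} for this call.
        int[] allocate() {
            int completed = running; // everything granted last time has finished
            int granted = Math.min(perCall, pendingAsk);
            pendingAsk -= granted;
            running = granted;
            return new int[] {granted, completed};
        }
    }

    // Runs the request/allocate loop and returns how many iterations it took.
    static int runLoop(FakeRm rm, int numTotalContainers) {
        int requested = 0;
        int completed = 0;
        int loops = 0;
        while (completed < numTotalContainers) {
            int askCount = numTotalContainers - requested; // 0 after the first ask
            if (askCount > 0) {
                rm.addContainerRequest(askCount);
                requested += askCount;
            }
            int[] response = rm.allocate();
            completed += response[1];
            loops++;
        }
        return loops;
    }

    public static void main(String[] args) {
        System.out.println("loops=" + runLoop(new FakeRm(4), 10));
    }
}
```

In the simulation, as in the log that follows, later iterations send an ask of 0 and exist only to pick up completion statuses.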
AM log output:
containerNode=dev81.hadoop:56100, containerNodeURI=dev81.hadoop:8042, containerStateNEW, containerResourceMemory1024
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Current available resources in the cluster <memory:26624, vCores:-5>
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, completedCnt=5
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Got container status for containerID=container_1376966186147_0006_01_000007, state=COMPLETE, exitStatus=0, diagnostics=
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Container completed successfully., containerId=container_1376966186147_0006_01_000007
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Got container status for containerID=container_1376966186147_0006_01_000008, state=COMPLETE, exitStatus=0, diagnostics=
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Connecting to ContainerManager at dev81.hadoop:56100
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Container completed successfully., containerId=container_1376966186147_0006_01_000008
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Got container status for containerID=container_1376966186147_0006_01_000009, state=COMPLETE, exitStatus=0, diagnostics=
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Container completed successfully., containerId=container_1376966186147_0006_01_000009
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Got container status for containerID=container_1376966186147_0006_01_000006, state=COMPLETE, exitStatus=0, diagnostics=
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Setting up container launch container for containerid=container_1376966186147_0006_01_000011
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Container completed successfully., containerId=container_1376966186147_0006_01_000006
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Setting user in ContainerLaunchContext to: hadoop
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Got container status for containerID=container_1376966186147_0006_01_000010, state=COMPLETE, exitStatus=0, diagnostics=
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Container completed successfully., containerId=container_1376966186147_0006_01_000010
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Current application state: loop=3, appDone=false, total=10, requested=10, completed=9, failed=0, currentAllocated=10
13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: Current application state: loop=4, appDone=false, total=10, requested=10, completed=9, failed=0, currentAllocated=10
13/08/26 17:15:10 INFO distributedshell.ApplicationMaster: Asking RM for containers, askCount=0
13/08/26 17:15:10 INFO distributedshell.ApplicationMaster: Sending request to RM for containers, progress=0.9
13/08/26 17:15:10 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, allocatedCnt=0
13/08/26 17:15:10 INFO distributedshell.ApplicationMaster: Current available resources in the cluster <memory:26624, vCores:-5>
13/08/26 17:15:10 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, completedCnt=1
13/08/26 17:15:10 INFO distributedshell.ApplicationMaster: Got container status for containerID=container_1376966186147_0006_01_000011, state=COMPLETE, exitStatus=0, diagnostics=
13/08/26 17:15:10 INFO distributedshell.ApplicationMaster: Container completed successfully., containerId=container_1376966186147_0006_01_000011
13/08/26 17:15:10 INFO distributedshell.ApplicationMaster: Current application state: loop=4, appDone=true, total=10, requested=10, completed=10, failed=0, currentAllocated=10
13/08/26 17:15:10 INFO distributedshell.ApplicationMaster: Application completed. Signalling finish to RM
13/08/26 17:15:10 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.AMRMClientImpl is stopped.
13/08/26 17:15:10 INFO distributedshell.ApplicationMaster: Application Master completed successfully. exiting
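The "Current application state" lines in this log carry the counters that the reported progress appears to follow (completed=9 of total=10 alongside progress=0.9). As a rough illustration, the sketch below pulls those counters out of such a line and computes completed/total. The class name, the regular expression, and the assumption that progress is simply completed divided by total are mine, not something taken from the DistributedShell source.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AmLogSketch {
    // Matches the counters of a "Current application state" log line.
    private static final Pattern STATE = Pattern.compile(
            "total=(\\d+), requested=(\\d+), completed=(\\d+), failed=(\\d+)");

    // Returns completed/total for a matching line, or -1.0 if the line does not match.
    static double progress(String line) {
        Matcher m = STATE.matcher(line);
        if (!m.find()) {
            return -1.0;
        }
        int total = Integer.parseInt(m.group(1));
        int completed = Integer.parseInt(m.group(3));
        return total == 0 ? 0.0 : (double) completed / total;
    }

    public static void main(String[] args) {
        String line = "13/08/26 17:15:09 INFO distributedshell.ApplicationMaster: "
                + "Current application state: loop=3, appDone=false, total=10, "
                + "requested=10, completed=9, failed=0, currentAllocated=10";
        System.out.println(progress(line));
    }
}
```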
Reference example: