Flink cluster error: Current usage: 106.2 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory used

Error report

org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:423)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:607)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$2(FlinkYarnSessionCli.java:810)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
	at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:810)
Caused by: org.apache.flink.yarn.AbstractYarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment. 
Diagnostics from YARN: Application application_1585641509397_0006 failed 1 times due to AM Container for appattempt_1585641509397_0006_000001 exited with  exitCode: -103
For more detailed output, check application tracking page:http://master:8088/proxy/application_1585641509397_0006/Then, click on links to logs of each attempt.
Diagnostics: Container [pid=1925,containerID=container_1585641509397_0006_01_000001] is running beyond virtual memory limits. Current usage: 106.2 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1585641509397_0006_01_000001 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 1939 1925 1925 1925 (java) 269 223 2187988992 26862 /app/jdk1.8.0_191/bin/java -Xmx424m -Dlog.file=/app/hadoop-2.6.4/logs/userlogs/application_1585641509397_0006/container_1585641509397_0006_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint 
	|- 1925 1923 1925 1925 (bash) 0 0 108609536 332 /bin/bash -c /app/jdk1.8.0_191/bin/java -Xmx424m  -Dlog.file=/app/hadoop-2.6.4/logs/userlogs/application_1585641509397_0006/container_1585641509397_0006_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint  1> /app/hadoop-2.6.4/logs/userlogs/application_1585641509397_0006/container_1585641509397_0006_01_000001/jobmanager.out 2> /app/hadoop-2.6.4/logs/userlogs/application_1585641509397_0006/container_1585641509397_0006_01_000001/jobmanager.err 

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt. Failing the application.
If log aggregation is enabled on your cluster, use this command to further investigate the issue:
yarn logs -applicationId application_1585641509397_0006
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.startAppMaster(AbstractYarnClusterDescriptor.java:1065)
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deployInternal(AbstractYarnClusterDescriptor.java:545)
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:416)
	... 7 more

2020-03-31 15:17:07,311 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Cancelling deployment from Deployment Failure Hook
2020-03-31 15:17:07,311 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Killing YARN application
2020-03-31 15:17:07,324 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Deleting files in hdfs://master:9000/user/root/.flink/application_1585641509397_0006.

Where it went wrong

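Don't panic when you hit a problem; read the error carefully. The "Diagnostics: …" line states the cause plainly.
The reason is: Current usage: 106.2 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory used. Killing container.
Read literally, the container ran out of memory, but physical usage was only 106.2 MB of 1 GB. The container was actually killed by the virtual memory check that YARN applies when Flink on YARN starts: the 2.1 GB virtual memory limit was reached.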
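The numbers in the diagnostics can be reproduced from the process-tree dump, which helps confirm that only the virtual memory limit was hit. The following is a minimal sketch rather than YARN's actual monitoring code. It assumes the virtual limit is the physical limit multiplied by the default `yarn.nodemanager.vmem-pmem-ratio` of 2.1, and that memory pages are 4096 bytes; neither value is stated in the log. The function name `summarize` is hypothetical.

```python
# Figures copied from the process-tree dump in the YARN diagnostics.
PROCESSES = [
    # (pid, command, vmem_bytes, rss_pages)
    (1939, "java", 2187988992, 26862),
    (1925, "bash", 108609536, 332),
]

PHYSICAL_LIMIT_BYTES = 1 * 1024**3  # "1 GB physical memory" from the log
VMEM_PMEM_RATIO = 2.1               # assumption: default yarn.nodemanager.vmem-pmem-ratio
PAGE_SIZE_BYTES = 4096              # assumption: typical Linux page size


def summarize(processes, physical_limit, ratio, page_size):
    """Total the process tree's memory usage and compare it with both limits."""
    vmem_used = sum(vmem for _, _, vmem, _ in processes)
    rss_used = sum(pages for _, _, _, pages in processes) * page_size
    vmem_limit = physical_limit * ratio
    return {
        "vmem_used": vmem_used,
        "vmem_limit": vmem_limit,
        "rss_used": rss_used,
        "over_vmem": vmem_used > vmem_limit,
        "over_pmem": rss_used > physical_limit,
    }


report = summarize(PROCESSES, PHYSICAL_LIMIT_BYTES, VMEM_PMEM_RATIO, PAGE_SIZE_BYTES)

if __name__ == "__main__":
    print(f"physical: {report['rss_used'] / 1024**2:.1f} MB, "
          f"over limit: {report['over_pmem']}")
    print(f"virtual:  {report['vmem_used'] / 1024**3:.2f} GB of "
          f"{report['vmem_limit'] / 1024**3:.1f} GB, "
          f"over limit: {report['over_vmem']}")
```

Under these assumptions the resident memory comes to about 106.2 MB, matching the log, while the combined virtual size of the `java` and `bash` processes is slightly above 2.1 GB. Most of that virtual size is address space the JVM reserved but never touched, which is why the physical figure is so small.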
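Solution:
Change the configuration so that the check is skipped.
Edit etc/hadoop/yarn-site.xml (the property goes inside <configuration>), then restart YARN so the NodeManagers pick up the change: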

<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
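After editing, it can be worth confirming that the property really is present and well-formed before restarting YARN. The sketch below is one possible way to do that with Python's standard library. The helper name `get_yarn_property` is hypothetical, and the inline `SAMPLE` string stands in for the real file so the example runs on its own; in practice you would read the contents of etc/hadoop/yarn-site.xml instead.

```python
import xml.etree.ElementTree as ET


def get_yarn_property(xml_text, name):
    """Return the <value> of the named <property>, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Stand-in for the contents of yarn-site.xml (assumption: the real file
# wraps its properties in a <configuration> root element like this).
SAMPLE = """
<configuration>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
</configuration>
"""

vmem_check = get_yarn_property(SAMPLE, "yarn.nodemanager.vmem-check-enabled")
missing = get_yarn_property(SAMPLE, "yarn.nodemanager.vmem-pmem-ratio")

if __name__ == "__main__":
    print("vmem check enabled:", vmem_check)
    print("vmem-pmem ratio override:", missing)
```

A return value of `None` means the property is not set in the file, so YARN falls back to its built-in default for it.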
