Hive on Tez job fails with "did not succeed due to VERTEX_FAILURE"

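A job running on Hive on Tez failed with the error below. This usually means that a container was preempted or that there were not enough resources for it. Note that a task is allowed at most 4 failed attempts by default (tez.am.task.max.failed.attempts), after which the whole vertex fails.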

Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1589254309642_0001_4_00, diagnostics=[Task failed, taskId=task_1589254309642_0001_4_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Container container_e136_1589254309642_0001_02_000012 finished with diagnostics set to [Container failed, exitCode=255. [2020-05-12 03:41:07.573]Exception from container-launch.
Container id: container_e136_1589254309642_0001_02_000012
Exit code: 255

Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1589254309642_0001_4_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1589254309642_0001_4_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1, Vertex vertex_1589254309642_0001_4_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1

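Solution:
1. Raise the maximum number of attempts for the AM (ApplicationMaster) itself, which defaults to 2. A failed attempt here does not necessarily mean the AM crashed; it may only have lost contact because of some system-level problem. The value can be set directly on the command line: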

set tez.am.max.app.attempts=5

If this fixes the problem, you can make the setting permanent by adding it to the configuration file (tez-site.xml).
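For reference, the permanent form of that setting might look like the fragment below. This is a minimal sketch, assuming the usual Hadoop-style property layout and that the entry sits inside the existing `<configuration>` element of tez-site.xml; the value 5 is simply the one tried on the command line above.

```xml
<!-- tez-site.xml: goes inside the existing <configuration> element -->
<property>
    <!-- Maximum number of AM attempts; the default is 2 -->
    <name>tez.am.max.app.attempts</name>
    <value>5</value>
</property>
```

The new value is normally picked up only by Tez sessions started after the change.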
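2. Another cause, I believe, was that the container memory was set too low. Mine was 1 GB; after raising it to 4 GB the problem went away.
Add the following to hive-site.xml (the value is in MB):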

<property>
    <name>hive.tez.container.size</name>
    <value>4096</value>
</property>
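As with the first fix, the container size can be tried for a single session before hive-site.xml is touched. The lines below are a minimal sketch: 4096 is the value from this post, while the commented-out hive.tez.java.opts line is an assumption based on the common rule of thumb of keeping the JVM heap at roughly 80% of the container size, and was not part of the original fix.

```sql
-- Session-level trial; run in the Hive CLI or Beeline before the failing query
set hive.tez.container.size=4096;

-- Assumption, not from the original fix: heap at about 80% of the container
-- set hive.tez.java.opts=-Xmx3276m;
```

The requested size has to fit within what YARN allows for a single container (yarn.scheduler.maximum-allocation-mb), otherwise the request cannot be satisfied.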