Optimizing a slow INSERT OVERWRITE ... PARTITION under Hive on Spark

    Hive version: 2.1.1, Spark version: 1.6.0

    Over the past few days I noticed that an insert overwrite partition job was running very slowly. It was using the Hive on Spark engine, which is normally much faster than MapReduce, yet this time it felt several times slower than MapReduce: after more than an hour it still had not finished.
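
    The statement itself is not reproduced here, but it was a dynamic-partition INSERT OVERWRITE, roughly of the following shape (table, column, and partition names below are placeholders, not the real ones):

INSERT OVERWRITE TABLE dw.fact_orders PARTITION (dt)   -- hypothetical target table, partitioned by dt
SELECT order_id, user_id, amount, dt                   -- the partition column goes last in the SELECT
FROM ods.orders_raw                                     -- hypothetical source table
WHERE dt >= '2020-07-01';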

    I took the SQL and ran it manually with hive -f file.sql, and saw that the Spark stage stayed stuck with 0 succeeded tasks, making almost no progress, as shown in List-1.

    List-1

[xx@xxxx xx]# hive -f sql.sql 
...
Query ID = root_20200807155008_80726145-e8f2-4f4e-8222-94083907a70c
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Spark Job = d5e51d11-0254-49e3-93c7-f1380a89b3d5
Running with YARN Application = application_1593752968338_0506
Kill Command = /usr/local/hadoop/bin/yarn application -kill application_1593752968338_0506

Query Hive on Spark job[0] stages:
0

Status: Running (Hive on Spark job[0])
Job Progress Format
CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount [StageCost]
2020-08-07 15:50:47,501	Stage-0_0: 0(+2)/3	
2020-08-07 15:50:50,530	Stage-0_0: 0(+2)/3	
2020-08-07 15:50:53,555	Stage-0_0: 0(+2)/3	
2020-08-07 15:50:56,582	Stage-0_0: 0(+2)/3	
2020-08-07 15:50:57,590	Stage-0_0: 0(+3)/3	
2020-08-07 15:51:00,620	Stage-0_0: 0(+3)/3	
2020-08-07 15:51:03,641	Stage-0_0: 0(+3)/3	
2020-08-07 15:51:06,662	Stage-0_0: 0(+3)/3	
2020-08-07 15:51:09,680	Stage-0_0: 0(+3)/3	
2020-08-07 15:51:12,700	Stage-0_0: 0(+3)/3	
...

    After running for more than an hour it was still in that same state. Something was clearly wrong, so I searched around; other people had hit the same problem, but no good solution turned up.

    As a temporary workaround, I switched the execution engine for this job from Spark to MapReduce with set hive.execution.engine=mr, which got rid of the permanent hang.
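
    For a one-off job, the switch can live at the top of the script that hive -f runs, so that only this session falls back to MapReduce; a minimal sketch (sql.sql is the same file as above):

-- first line of sql.sql: override the execution engine for this session only
set hive.execution.engine=mr;

    The same property can also be passed per invocation with hive --hiveconf hive.execution.engine=mr -f sql.sql, without editing the script.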

    Hive then failed with a new error, complaining that the job exceeded the maximum number of dynamic partitions per node, as shown in List-2.

    List-2

...
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:499)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
	... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveFatalException: [Error 20004]: Fatal error occurred when node tried to create too many dynamic partitions. The maximum number of dynamic partitions is controlled by hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. Maximum was set to 100 partitions per node, number of dynamic partitions on this node: 101
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:933)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:704)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:149)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489)
	... 9 more


FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 3   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
...
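
    Before raising the limits, the values the session is actually running with can be confirmed in the Hive CLI: issuing set with a property name and no value prints the current setting.

-- prints e.g. hive.exec.max.dynamic.partitions.pernode=100 (the limit hit in List-2)
set hive.exec.max.dynamic.partitions.pernode;
set hive.exec.max.dynamic.partitions;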

    Then I raised hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode, as shown in List-3.

    List-3

-- fall back to MapReduce as the execution engine for this session
set hive.execution.engine=mr;
-- allow fully dynamic partition inserts (no static partition column required)
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
-- raise the dynamic partition limits well above the 100-per-node default hit in List-2
set hive.exec.max.dynamic.partitions.pernode=100000;
set hive.exec.max.dynamic.partitions=100000;
...
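
    Raising the limits this far is a blunt fix. A commonly used alternative, which is not what this job did, is to add DISTRIBUTE BY on the partition column so rows for the same partition land on the same reducer, keeping the number of dynamic partitions each node writes low (same placeholder names as the earlier sketch):

INSERT OVERWRITE TABLE dw.fact_orders PARTITION (dt)
SELECT order_id, user_id, amount, dt
FROM ods.orders_raw
WHERE dt >= '2020-07-01'
DISTRIBUTE BY dt;      -- route all rows of a given dt to a single reducer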

    I googled this problem and found a matching issue in Spark's JIRA; it was reported as a bug and fixed in a later release.

    That got the job through, but MapReduce is still slow. The proper fix is either to upgrade the Hive/Spark versions or to patch the Spark source ourselves; for now, MapReduce serves as a stopgap.
