Spark UI vs. Spark History Server UI

Is the Job Running?

1. If you have Spark applications running, you should be using the Spark UI. This UI is usually hosted on the Spark driver:

– In YARN cluster mode, the driver runs inside the YARN Application Master, which is placed on a random core node.

– In YARN client mode, the driver runs on the master node itself.

To access the Spark UI, go to the YARN ResourceManager UI first. Then navigate to the corresponding Spark application and use the “Application Master” link to open the Spark UI. If you look at the link, it takes you to the application master’s web UI on port 20888. This is basically a proxy running on the master node, listening on port 20888, which makes the Spark UI available (the UI itself runs on either a core node or the master node).


2. You can also access the Spark UI by going directly to the driver hostname and port where it is hosted.

For example, when I ran spark-submit in cluster mode, it spun up application_1569345960040_0007. In my driver logs I see the messages below:

19/09/24 22:29:15 INFO Utils: Successfully started service 'SparkUI' on port 35395.

19/09/24 22:29:15 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://ip-10-0-0-69.myermdomain.com:35395

Here ip-10-0-0-69.myermdomain.com is one of my core nodes.

So I can go to

http://ip-10-0-0-69.myermdomain.com:35395

This automatically redirects me to the proxy server on the master node, listening on port 20888:

 http://ip-10-0-0-113.ec2.internal:20888/proxy/application_1569345960040_0007/

Please note that these links are temporary and will only show the UI while the Spark application is running.
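The two ways of reaching the UI described above can be sketched in a few lines of Python. This is only an illustration: the function names find_spark_ui_url and yarn_proxy_url are hypothetical helpers, the regular expression is an assumption based on the sample log lines shown earlier, and the proxy URL layout simply mirrors the example link on port 20888.

```python
import re
from typing import Optional

# Pattern for the driver log line that reports where the Spark UI was bound.
# Assumption: the log format matches the sample lines shown in this post.
_UI_URL = re.compile(r"SparkUI: Bound SparkUI to \S+, and started at\s+(http://\S+)")


def find_spark_ui_url(driver_log: str) -> Optional[str]:
    """Return the first Spark UI URL reported in the driver log, or None."""
    match = _UI_URL.search(driver_log)
    return match.group(1) if match else None


def yarn_proxy_url(master_host: str, application_id: str, port: int = 20888) -> str:
    """Build the master-node proxy URL for a running application.

    The default port 20888 is the proxy port seen in the example above.
    """
    return f"http://{master_host}:{port}/proxy/{application_id}/"


log = (
    "19/09/24 22:29:15 INFO Utils: Successfully started service 'SparkUI' on port 35395.\n"
    "19/09/24 22:29:15 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at "
    "http://ip-10-0-0-69.myermdomain.com:35395\n"
)

print(find_spark_ui_url(log))
print(yarn_proxy_url("ip-10-0-0-113.ec2.internal", "application_1569345960040_0007"))
```

Either URL only works while the application is still running, so a script like this is mainly useful for jumping to the UI of a live job.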


Is the Job Completed?

If you want to see the UI even after the Spark job has completed, you should use the Spark History Server UI directly at http://master-public-dns-name:18080/.

The Spark History Server can also be used for running jobs via the “Show Incomplete Applications” button. It does this by reading the Spark event logs, which are enabled on EMR by default.
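The same information is also exposed through Spark's monitoring REST API, which the History Server serves under /api/v1. Below is a minimal sketch that lists incomplete applications, assuming the API is reachable on the same host and port as the History Server UI (18080 here). The helper names are hypothetical, and the code relies only on the "id" and "attempts[].completed" fields of the applications listing.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen


def applications_url(history_server: str, status: str = "") -> str:
    """Build the applications endpoint URL; status may be 'running' or 'completed'."""
    url = f"{history_server.rstrip('/')}/api/v1/applications"
    return f"{url}?status={status}" if status else url


def incomplete_app_ids(apps: List[Dict[str, Any]]) -> List[str]:
    """Return ids of applications that have at least one attempt not yet completed."""
    return [
        app["id"]
        for app in apps
        if any(not attempt.get("completed", False) for attempt in app.get("attempts", []))
    ]


def fetch_applications(history_server: str, status: str = "") -> List[Dict[str, Any]]:
    """Fetch the application list from the History Server (requires network access)."""
    with urlopen(applications_url(history_server, status), timeout=10) as response:
        return json.load(response)
```

For example, incomplete_app_ids(fetch_applications("http://master-public-dns-name:18080")) should return roughly the applications that the “Show Incomplete Applications” view lists, subject to the same event-log lag as the UI.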


Differences between Spark UI and Spark History Server UI

However, the Spark History Server UI has some differences compared to the Spark UI (for running applications, of course). Some of them that I observed are:

– Spark UI has a “Kill” button so you can kill Spark stages, while the Spark History Server doesn’t.

– Spark UI has a “SQL” tab that shows more information about spark-sql jobs, while the Spark History Server doesn’t.

– Spark UI can pull up live thread dumps for executors, while the Spark History Server can’t.

– Spark UI gives the most up-to-date info (like “Total Uptime”) on tasks, while the Spark History Server UI can lag a bit.
