Launching Spark Applications with spark-submit

Original

2020-02-20 18:27

    The bin/spark-submit script sets up the classpath containing Spark and its dependencies. It works with the different cluster managers and deploy modes that Spark supports.

    ./bin/spark-submit \
    --class <main-class> \
    --master <master-url> \
    --deploy-mode <deploy-mode> \
    --conf <key>=<value> \
    ... # other options
    <application-jar> \
    [application-arguments]

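    Some commonly used options are:
        --class: the entry point of your application (e.g. org.apache.spark.examples.SparkPi)
        --master: the master URL of the cluster (e.g. spark://23.195.26.187:7077)
        --deploy-mode: whether to deploy your driver on the worker nodes (cluster) or locally as an external client (client). The default is client.
        --conf: an arbitrary Spark configuration property, in key=value format.
        application-jar: the path to a jar containing your application and its dependencies. The URL must be globally visible inside the cluster, for example an hdfs:// path, or a file:// path that exists on all nodes.
        application-arguments: the arguments passed to the main method of your main class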
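The option order described above can be sketched as a small Python helper that assembles the argument list. This is only an illustration: the function name `build_spark_submit_command` and its parameters are hypothetical and not part of Spark, and the `spark.executor.memory` value is a placeholder.

```python
import shlex


def build_spark_submit_command(
    application,
    main_class=None,
    master=None,
    deploy_mode=None,
    conf=None,
    application_arguments=(),
    spark_submit="./bin/spark-submit",
):
    """Assemble a spark-submit argument list (hypothetical helper, not a Spark API)."""
    command = [spark_submit]
    if main_class:
        command += ["--class", main_class]
    if master:
        command += ["--master", master]
    if deploy_mode:
        command += ["--deploy-mode", deploy_mode]
    # Each configuration property becomes its own --conf key=value pair.
    for key, value in (conf or {}).items():
        command += ["--conf", f"{key}={value}"]
    # The application jar (or .py file) comes after all options,
    # followed by the arguments handed to the main method.
    command.append(application)
    command += [str(argument) for argument in application_arguments]
    return command


example_command = build_spark_submit_command(
    "/path/to/examples.jar",
    main_class="org.apache.spark.examples.SparkPi",
    master="local[8]",
    conf={"spark.executor.memory": "2g"},  # placeholder value
    application_arguments=[100],
)
# shlex.join (Python 3.8+) quotes each argument so the line can be pasted into a shell.
print(shlex.join(example_command))
```

Keeping the command as a list rather than a single string means it can be passed directly to `subprocess.run` without going through a shell, so characters such as the brackets in `local[8]` need no escaping.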


    Some examples of spark-submit with common options (run ./bin/spark-submit --help to list all available options):

        # Run application locally on 8 cores
        ./bin/spark-submit \
        --class org.apache.spark.examples.SparkPi \
        --master local[8] \
        /path/to/examples.jar \
        100

        # Run on a Spark Standalone cluster in client deploy mode
        ./bin/spark-submit \
        --class org.apache.spark.examples.SparkPi \
        --master spark://207.184.161.138:7077 \
        --executor-memory 20G \
        --total-executor-cores 100 \
        /path/to/examples.jar \
        1000

        # Run on a Spark Standalone cluster in cluster deploy mode with supervise
        ./bin/spark-submit \
        --class org.apache.spark.examples.SparkPi \
        --master spark://207.184.161.138:7077 \
        --deploy-mode cluster \
        --supervise \
        --executor-memory 20G \
        --total-executor-cores 100 \
        /path/to/examples.jar \
        1000

        # Run on a YARN cluster (use `yarn-client` instead of `yarn-cluster` for client mode).
        # On Spark 2.0 and later, prefer `--master yarn --deploy-mode cluster`.
        export HADOOP_CONF_DIR=XXX
        ./bin/spark-submit \
        --class org.apache.spark.examples.SparkPi \
        --master yarn-cluster \
        --executor-memory 20G \
        --num-executors 50 \
        /path/to/examples.jar \
        1000

        # Run a Python application on a Spark Standalone cluster
        ./bin/spark-submit \
        --master spark://207.184.161.138:7077 \
        examples/src/main/python/pi.py \
        1000
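The examples above use three forms of master URL: `local[8]`, `spark://host:port`, and the YARN names. As a rough sketch, the following Python function tells them apart. It is not Spark's own parser, the name `describe_master` is hypothetical, and it covers only the forms that appear in this post; Spark accepts other master URLs as well.

```python
import re

# Patterns for the master URL forms used in the examples above.
_LOCAL = re.compile(r"^local(?:\[(\d+|\*)\])?$")
_STANDALONE = re.compile(r"^spark://([^:/\s]+):(\d+)$")
_YARN_NAMES = ("yarn", "yarn-client", "yarn-cluster")


def describe_master(master):
    """Classify a master URL string (hypothetical helper, not a Spark API)."""
    match = _LOCAL.match(master)
    if match:
        # Plain "local" runs with one worker thread; "local[*]" uses all logical cores.
        return {"kind": "local", "threads": match.group(1) or "1"}
    match = _STANDALONE.match(master)
    if match:
        return {
            "kind": "standalone",
            "host": match.group(1),
            "port": int(match.group(2)),
        }
    if master in _YARN_NAMES:
        # For YARN the cluster location comes from HADOOP_CONF_DIR, not from the URL.
        return {"kind": "yarn"}
    raise ValueError(f"master URL form not covered by this sketch: {master!r}")


for url in ("local[8]", "spark://207.184.161.138:7077", "yarn-cluster"):
    print(url, "->", describe_master(url))
```

A check like this could run before launching spark-submit to catch a mistyped master URL early, for instance a missing port in a `spark://` address.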

淚痕殘
