Configuring Spark Executors in a Kubernetes Environment

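When Spark runs a job, many ports on the Executor have to be reachable. By default these ports are random, and they are addressed by hostname, so a Kubernetes environment and a big data environment cannot easily reach each other directly. The configuration below lets a big data cluster reach a Spark Executor running inside Kubernetes.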

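1. A Spark Executor uses many random ports at run time. When it runs in K8S these ports must be fixed, and they must fall within the port range the K8S cluster allocates to NodePort: 30000-32767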

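# Port the driver listens on; used to communicate with the executors and the standalone master (random by default)
spark_driver_port: 30920
# Port the driver's file server listens on (random by default)
spark_fileserver_port: 30921
# Port the driver's HTTP broadcast server listens on (random by default)
spark_broadcast_port: 30922
# Port the driver's HTTP class server listens on (random by default)
spark_replClassServer_port: 30923
# Port the block manager listens on; one exists on both the driver and the executors (random by default)
spark_blockManager_port: 30924
# Port the executor listens on; used to communicate with the driver (random by default)
spark_executor_port: 30925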
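A minimal sketch of how the fixed ports from step 1 could be validated and turned into `--conf` arguments. The dotted property names are an assumption about how the `spark_*` variables above map onto Spark configuration keys, and several of them are only documented for older Spark releases, so check them against the configuration reference for your version. The helper names `check_ports` and `to_conf_args` are hypothetical.

```python
# NodePort range quoted in step 1 (the Kubernetes default).
NODE_PORT_MIN, NODE_PORT_MAX = 30000, 32767

# Fixed ports from the settings above. The dotted keys are an assumed
# mapping from the spark_* variables to Spark configuration properties.
SPARK_PORTS = {
    "spark.driver.port": 30920,
    "spark.fileserver.port": 30921,
    "spark.broadcast.port": 30922,
    "spark.replClassServer.port": 30923,
    "spark.blockManager.port": 30924,
    "spark.executor.port": 30925,
}


def check_ports(ports):
    """Raise ValueError if a port is outside the NodePort range or reused."""
    seen = {}
    for key, port in ports.items():
        if not NODE_PORT_MIN <= port <= NODE_PORT_MAX:
            raise ValueError(
                f"{key}={port} is outside {NODE_PORT_MIN}-{NODE_PORT_MAX}"
            )
        if port in seen:
            raise ValueError(f"{key} and {seen[port]} both use port {port}")
        seen[port] = key


def to_conf_args(ports):
    """Validate the mapping, then render it as a flat list of --conf arguments."""
    check_ports(ports)
    args = []
    for key, port in ports.items():
        args += ["--conf", f"{key}={port}"]
    return args


print(" ".join(to_conf_args(SPARK_PORTS)))
```

Validating before rendering means a port that cannot be exposed through a NodePort Service is rejected before the job is submitted.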

2. Create a StatefulSet for the Spark Executor. Its pod then gets a stable DNS name: $(podname).(headless service name).(namespace).svc.cluster.local

apiVersion: apps/v1
kind: StatefulSet
metadata:
    name: my-executor-statefulset
    namespace: [namespace]
    labels:
      app: my-executor-statefulset
spec:
    serviceName: my-executor
    replicas: 1
    selector:
        matchLabels:
            app: my-executor-pod
            version: [version]
    template:
        metadata:
            labels:
                app: my-executor-pod
                version: [version]
        spec:
            containers:
            - name: my-executor-pod
              image: 192.168.0.12:9090/eyes/my-executor-[namespace]:[version]-[ru]
              imagePullPolicy: Always
              ports:
                - containerPort: 5011
            hostAliases:
              - hostnames:
                  - hadoop-master01
                ip: 192.168.0.10
              - hostnames:
                  - hadoop-slave02
                ip: 192.168.0.11
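The DNS name pattern from step 2 can be sketched as a small helper. A StatefulSet names its pods `<statefulset name>-<ordinal>`, and the `serviceName` field supplies the second label. The function name `statefulset_pod_fqdn` is hypothetical, and `cluster.local` is assumed to be the cluster domain, as in the pattern above.

```python
def statefulset_pod_fqdn(statefulset, ordinal, service, namespace,
                         cluster_domain="cluster.local"):
    """Build the stable DNS name of one StatefulSet pod."""
    pod_name = f"{statefulset}-{ordinal}"  # pods are numbered from 0
    return f"{pod_name}.{service}.{namespace}.svc.{cluster_domain}"


# Names taken from the manifest above; "test2" is the namespace used in step 4.
print(statefulset_pod_fqdn("my-executor-statefulset", 0, "my-executor", "test2"))
```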

3. Create a NodePort Service for the Spark Executor that exposes the fixed ports configured in step 1

apiVersion: v1
kind: Service
metadata:
    name: my-executor-svc
    namespace: [namespace]
    labels:
      app: my-executor-pod
spec:
    ports:
    - port: 5011
      name: tcp-port
      protocol: TCP
    - port: 4040
      name: spark-http-port
      protocol: TCP
      nodePort: 30028
    - port: 30920
      name: spark-driver-port
      protocol: TCP
      nodePort: 30920
    - port: 30921
      name: spark-fileserver-port
      protocol: TCP
      nodePort: 30921
    - port: 30922
      name: spark-broadcast-port
      protocol: TCP
      nodePort: 30922
    - port: 30923
      name: spark-replclassserver-port
      protocol: TCP
      nodePort: 30923
    - port: 30924
      name: spark-blockmanager-port
      protocol: TCP
      nodePort: 30924
    - port: 30925
      name: spark-executor-port
      protocol: TCP
      nodePort: 30925
    selector:
       app: my-executor-pod
    type: NodePort
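Once the Service exists, each fixed port should accept TCP connections on a node IP. The following is a minimal sketch of such a check using only the Python standard library. The helper names `port_is_open` and `closed_ports` are hypothetical, and a successful connection only shows that something is listening behind the NodePort, not that Spark itself is healthy.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def closed_ports(host, ports, timeout=2.0):
    """Return the ports on host that could not be connected to."""
    return [p for p in ports if not port_is_open(host, p, timeout)]
```

For example, `closed_ports("192.168.0.12", range(30920, 30926))` run from a big data machine would list any of the six Spark ports that are not reachable through that node.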

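4. On every machine in the big data environment, add a hosts entry for the StatefulSet's DNS name, $(podname).(headless service name).(namespace).svc.cluster.local. The IP address can be that of any node in the K8S cluster, because a NodePort Service is reachable on every node.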

192.168.0.12 my-executor-statefulset-0.my-executor.test2.svc.cluster.local
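Because the entry has to be present on every big data machine, it can help to verify it mechanically. Below is a minimal sketch that parses hosts-file text and checks that the executor's DNS name maps to the expected node IP. The helper names `parse_hosts` and `has_entry` are hypothetical, and keeping the first entry for a repeated name is an assumption about resolver behaviour.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to its IP address."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry for a name
    return mapping


def has_entry(text, hostname, ip):
    """Return True if hostname resolves to ip according to the hosts text."""
    return parse_hosts(text).get(hostname) == ip


sample = """
127.0.0.1 localhost
192.168.0.12 my-executor-statefulset-0.my-executor.test2.svc.cluster.local
"""
print(has_entry(
    sample,
    "my-executor-statefulset-0.my-executor.test2.svc.cluster.local",
    "192.168.0.12",
))
```

On a real machine the text would come from reading `/etc/hosts`; the sample string here just mirrors the entry shown above.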


 

 
