Understanding How Flink Moves Computation in One Article

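With distributed frameworks, one phrase comes up again and again: move the computation, not the data. So how does Flink move computation? Let's take a close look at the ExecutionGraph.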

Basic Concepts

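ExecutionJobVertex: corresponds to one computation vertex (JobVertex) of the JobGraph. Each ExecutionJobVertex may contain many parallel ExecutionVertex instances.
ExecutionVertex: represents one parallel subtask.
Execution: represents one execution attempt of an ExecutionVertex.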
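
The relationship between the three concepts can be sketched with a small model. This is only an illustration under stated assumptions: JobLevelVertex, SubtaskVertex and Attempt are hypothetical stand-ins invented here, not Flink classes, and the retry logic is a guess at the general idea rather than Flink's restart mechanism.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for the three concepts; not Flink's API.
class ExecutionModelSketch {

    /** Stand-in for Execution: one attempt to run a subtask. */
    static final class Attempt {
        final int attemptNumber;

        Attempt(int attemptNumber) {
            this.attemptNumber = attemptNumber;
        }
    }

    /** Stand-in for ExecutionVertex: one parallel subtask and all attempts made to run it. */
    static final class SubtaskVertex {
        final int subtaskIndex;
        final List<Attempt> attempts = new ArrayList<>();

        SubtaskVertex(int subtaskIndex) {
            this.subtaskIndex = subtaskIndex;
            attempts.add(new Attempt(0)); // every subtask starts with a first attempt
        }

        /** Records a further attempt on the same subtask, e.g. after a failure. */
        Attempt retry() {
            Attempt next = new Attempt(attempts.size());
            attempts.add(next);
            return next;
        }
    }

    /** Stand-in for ExecutionJobVertex: one job-level vertex expanded into parallel subtasks. */
    static final class JobLevelVertex {
        final String name;
        final SubtaskVertex[] subtasks;

        JobLevelVertex(String name, int parallelism) {
            this.name = name;
            this.subtasks = new SubtaskVertex[parallelism];
            for (int i = 0; i < parallelism; i++) {
                subtasks[i] = new SubtaskVertex(i);
            }
        }
    }

    public static void main(String[] args) {
        JobLevelVertex flatMap = new JobLevelVertex("flatMap", 3);
        flatMap.subtasks[1].retry(); // subtask 1 is attempted a second time
        for (SubtaskVertex vertex : flatMap.subtasks) {
            System.out.println(flatMap.name + " subtask " + vertex.subtaskIndex
                + " attempts=" + vertex.attempts.size());
        }
    }
}
```

The point of the nesting is that parallelism multiplies vertices, while retries multiply attempts on a single vertex.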

Graph Transformation

Source Code

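As covered in the earlier article on the full Flink job submission flow, the ExecutionGraph is created at the same time as the JobMaster. Tracing that call chain leads to the ExecutionGraphBuilder.buildGraph method: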

......
// topologically sort the job vertices and attach the graph to the existing one
        // the sorted topology: source->flatMap  Filter->sink
        // each operator chain becomes one JobVertex; a single operator counts as a special operator chain
        List<JobVertex> sortedTopology = jobGraph.getVerticesSortedTopologicallyFromSources();
        if (log.isDebugEnabled()) {
            log.debug("Adding {} vertices from job graph {} ({}).", sortedTopology.size(), jobName, jobId);
        }
        executionGraph.attachJobGraph(sortedTopology);
        ......
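
To make "sorted topologically from the sources" concrete, here is a minimal sketch of one common way to produce such an order (Kahn's algorithm). It is not Flink's implementation of getVerticesSortedTopologicallyFromSources; the class TopoSortSketch, the method sortFromSources, and the use of plain strings as vertices are assumptions made for illustration only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not Flink's implementation.
class TopoSortSketch {

    /** Orders vertex names so that every vertex appears after all of its upstream vertices. */
    static List<String> sortFromSources(Map<String, List<String>> downstream) {
        // Count incoming edges for every vertex that appears as a key or as a target.
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String vertex : downstream.keySet()) {
            inDegree.putIfAbsent(vertex, 0);
        }
        for (List<String> targets : downstream.values()) {
            for (String target : targets) {
                inDegree.merge(target, 1, Integer::sum);
            }
        }

        // Sources are the vertices without incoming edges.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : inDegree.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        // Emit a vertex once all of its inputs have been emitted.
        List<String> sorted = new ArrayList<>();
        while (!ready.isEmpty()) {
            String vertex = ready.poll();
            sorted.add(vertex);
            for (String target : downstream.getOrDefault(vertex, Collections.emptyList())) {
                if (inDegree.merge(target, -1, Integer::sum) == 0) {
                    ready.add(target);
                }
            }
        }
        if (sorted.size() != inDegree.size()) {
            throw new IllegalStateException("The graph contains a cycle");
        }
        return sorted;
    }

    public static void main(String[] args) {
        // Vertices are registered in reverse order on purpose.
        Map<String, List<String>> graph = new LinkedHashMap<>();
        graph.put("sink", Collections.emptyList());
        graph.put("flatMap", Collections.singletonList("sink"));
        graph.put("source", Collections.singletonList("flatMap"));
        System.out.println(sortFromSources(graph)); // prints [source, flatMap, sink]
    }
}
```

Processing vertices in this order is what lets attachJobGraph assume that the upstream results a vertex consumes have already been registered by the time that vertex is attached.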

Next, step into attachJobGraph:

public void attachJobGraph(List<JobVertex> topologiallySorted) throws JobException {

        assertRunningInJobMasterMainThread();

        LOG.debug("Attaching {} topologically sorted vertices to existing job graph with {} " +
                "vertices and {} intermediate results.",
            topologiallySorted.size(),
            tasks.size(),
            intermediateResults.size());

        final ArrayList<ExecutionJobVertex> newExecJobVertices = new ArrayList<>(topologiallySorted.size());
        final long createTimestamp = System.currentTimeMillis();
        // start from the source operator chain
        for (JobVertex jobVertex : topologiallySorted) {

            if (jobVertex.isInputVertex() && !jobVertex.isStoppable()) {
                this.isStoppable = false;
            }

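            /*
            // Each node of the ExecutionGraph is created here.
            // First, a series of assignments hands the task information to the new graph node and sets the parallelism, etc.
            // Then the node's IntermediateResults are created; how many depends on the number of downstream nodes (one per produced data set).
            // Finally, the ExecutionVertex instances that run the task are created according to the configured parallelism.
            // If the job defines input splits, they are also set up here.
             */
            // create the execution job vertex and attach it to the graph
            // parallelization already starts here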
            ExecutionJobVertex ejv = new ExecutionJobVertex(
                this,
                jobVertex,
                1,
                rpcTimeout,
                globalModVersion,
                createTimestamp);

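            /*
            // All JobEdges are handled here.
            // For each edge, look up the corresponding IntermediateResult and record it as an input of this node.
            // Finally, connect each ExecutionVertex to the corresponding IntermediateResult.
             */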
            ejv.connectToPredecessors(this.intermediateResults);

            ExecutionJobVertex previousTask = this.tasks.putIfAbsent(jobVertex.getID(), ejv);
            if (previousTask != null) {
                throw new JobException(String.format("Encountered two job vertices with ID %s : previous=[%s] / new=[%s]",
                    jobVertex.getID(), ejv, previousTask));
            }

            for (IntermediateResult res : ejv.getProducedDataSets()) {
                IntermediateResult previousDataSet = this.intermediateResults.putIfAbsent(res.getId(), res);
                if (previousDataSet != null) {
                    throw new JobException(String.format("Encountered two intermediate data set with ID %s : previous=[%s] / new=[%s]",
                        res.getId(), res, previousDataSet));
                }
            }

            this.verticesInCreationOrder.add(ejv);
            this.numVerticesTotal += ejv.getParallelism();
            newExecJobVertices.add(ejv);
        }

        terminationFuture = new CompletableFuture<>();
        failoverStrategy.notifyNewVertices(newExecJobVertices);
    }
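
The bookkeeping in attachJobGraph relies on Map.putIfAbsent returning the previous mapping, which is how duplicate IDs are detected. The sketch below isolates that pattern. AttachSketch and its fields are hypothetical names chosen for illustration, and string IDs stand in for Flink's ID types.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry for illustration; the names are not Flink's API.
class AttachSketch {
    private final Map<String, Integer> parallelismByVertexId = new HashMap<>();
    private int numVerticesTotal;

    /** Registers a vertex once; a second registration under the same ID is rejected. */
    void attach(String vertexId, int parallelism) {
        // putIfAbsent returns the existing value, or null if the key was still free.
        Integer previous = parallelismByVertexId.putIfAbsent(vertexId, parallelism);
        if (previous != null) {
            throw new IllegalStateException("Encountered two vertices with ID " + vertexId);
        }
        // The total counts parallel subtasks, not job-level vertices.
        numVerticesTotal += parallelism;
    }

    int getNumVerticesTotal() {
        return numVerticesTotal;
    }

    public static void main(String[] args) {
        AttachSketch graph = new AttachSketch();
        graph.attach("source", 2);
        graph.attach("sink", 3);
        System.out.println(graph.getNumVerticesTotal()); // prints 5
    }
}
```

Because the check and the insert happen in one call, there is no window in which a duplicate could overwrite the first registration.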

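The key call is new ExecutionJobVertex. Besides some basic assignments, it parallelizes the IntermediateResult and the ExecutionVertex.
Put plainly, it creates one IntermediateResult per produced data set, each sized for numTaskVertices partitions, plus numTaskVertices nearly identical ExecutionVertex objects, as follows: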

// parallelization already starts here
    public ExecutionJobVertex(
            ExecutionGraph graph,
            JobVertex jobVertex,
            int defaultParallelism,
            Time timeout,
            long initialGlobalModVersion,
            long createTimestamp) throws JobException {

        if (graph == null || jobVertex == null) {
            throw new NullPointerException();
        }

        this.graph = graph;
        this.jobVertex = jobVertex;

        int vertexParallelism = jobVertex.getParallelism();
        // the effective parallelism
        int numTaskVertices = vertexParallelism > 0 ? vertexParallelism : defaultParallelism;

        final int configuredMaxParallelism = jobVertex.getMaxParallelism();

        this.maxParallelismConfigured = (VALUE_NOT_SET != configuredMaxParallelism);

        // if no max parallelism was configured by the user, we calculate and set a default
        setMaxParallelismInternal(maxParallelismConfigured ?
                configuredMaxParallelism : KeyGroupRangeAssignment.computeDefaultMaxParallelism(numTaskVertices));

        // verify that our parallelism is not higher than the maximum parallelism
        if (numTaskVertices > maxParallelism) {
            throw new JobException(
                String.format("Vertex %s's parallelism (%s) is higher than the max parallelism (%s). Please lower the parallelism or increase the max parallelism.",
                    jobVertex.getName(),
                    numTaskVertices,
                    maxParallelism));
        }

        this.parallelism = numTaskVertices;

        this.taskVertices = new ExecutionVertex[numTaskVertices];
        this.operatorIDs = Collections.unmodifiableList(jobVertex.getOperatorIDs());
        this.userDefinedOperatorIds = Collections.unmodifiableList(jobVertex.getUserDefinedOperatorIDs());

        this.inputs = new ArrayList<>(jobVertex.getInputs().size());

        // take the sharing group
        this.slotSharingGroup = jobVertex.getSlotSharingGroup();
        this.coLocationGroup = jobVertex.getCoLocationGroup();

        // setup the coLocation group
        if (coLocationGroup != null && slotSharingGroup == null) {
            throw new JobException("Vertex uses a co-location constraint without using slot sharing");
        }

        // create the intermediate results
        this.producedDataSets = new IntermediateResult[jobVertex.getNumberOfProducedIntermediateDataSets()];

        // IntermediateResult parallelization starts here: each result gets numTaskVertices partitions
        for (int i = 0; i < jobVertex.getProducedDataSets().size(); i++) {
            final IntermediateDataSet result = jobVertex.getProducedDataSets().get(i);

            this.producedDataSets[i] = new IntermediateResult(
                    result.getId(),
                    this,
                    numTaskVertices,
                    result.getResultType());
        }

        Configuration jobConfiguration = graph.getJobConfiguration();
        int maxPriorAttemptsHistoryLength = jobConfiguration != null ?
                jobConfiguration.getInteger(JobManagerOptions.MAX_ATTEMPTS_HISTORY_SIZE) :
                JobManagerOptions.MAX_ATTEMPTS_HISTORY_SIZE.defaultValue();

        // create all task vertices
        // moving the computation
        // ExecutionVertex parallelization starts here
        for (int i = 0; i < numTaskVertices; i++) {
            ExecutionVertex vertex = new ExecutionVertex(
                    this,
                    i,
                    producedDataSets,
                    timeout,
                    initialGlobalModVersion,
                    createTimestamp,
                    maxPriorAttemptsHistoryLength);

            this.taskVertices[i] = vertex;
        }

        // sanity check for the double referencing between intermediate result partitions and execution vertices
        for (IntermediateResult ir : this.producedDataSets) {
            if (ir.getNumberOfAssignedPartitions() != parallelism) {
                throw new RuntimeException("The intermediate result's partitions were not correctly assigned.");
            }
        }

        // set up the input splits, if the vertex has any
        try {
            @SuppressWarnings("unchecked")
            InputSplitSource<InputSplit> splitSource = (InputSplitSource<InputSplit>) jobVertex.getInputSplitSource();

            if (splitSource != null) {
                Thread currentThread = Thread.currentThread();
                ClassLoader oldContextClassLoader = currentThread.getContextClassLoader();
                currentThread.setContextClassLoader(graph.getUserClassLoader());
                try {
                    inputSplits = splitSource.createInputSplits(numTaskVertices);

                    if (inputSplits != null) {
                        splitAssigner = splitSource.getInputSplitAssigner(inputSplits);
                    }
                } finally {
                    currentThread.setContextClassLoader(oldContextClassLoader);
                }
            }
            else {
                inputSplits = null;
            }
        }
        catch (Throwable t) {
            throw new JobException("Creating the input splits caused an error: " + t.getMessage(), t);
        }
    }
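
Stripped of configuration and error handling, the constructor does three things: resolve the effective parallelism, create one result per produced data set with that many partitions, and create that many subtasks. The following is a minimal sketch of that shape. ResultSketch and ExpandedVertex are hypothetical names, and the rule that subtask i claims partition i of every result is an assumption inferred from the sanity check in the constructor, not code quoted from Flink.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model; class and method names are not Flink's API.
class ParallelExpansionSketch {

    /** A produced data set with one partition slot per parallel subtask. */
    static final class ResultSketch {
        final String id;
        final int[] producerOfPartition; // partition index -> producing subtask, -1 while unassigned
        int assignedPartitions;

        ResultSketch(String id, int numPartitions) {
            this.id = id;
            this.producerOfPartition = new int[numPartitions];
            Arrays.fill(producerOfPartition, -1);
        }

        void assign(int partition, int subtaskIndex) {
            if (producerOfPartition[partition] != -1) {
                throw new IllegalStateException("Partition " + partition + " is already assigned");
            }
            producerOfPartition[partition] = subtaskIndex;
            assignedPartitions++;
        }
    }

    /** One job-level vertex expanded into parallel subtasks and the results they produce. */
    static final class ExpandedVertex {
        final int parallelism;
        final List<ResultSketch> results = new ArrayList<>();

        ExpandedVertex(int vertexParallelism, int defaultParallelism, List<String> resultIds) {
            // Fall back to the default when the vertex has no positive parallelism of its own.
            this.parallelism = vertexParallelism > 0 ? vertexParallelism : defaultParallelism;

            // One result per produced data set, each with `parallelism` partitions.
            for (String id : resultIds) {
                results.add(new ResultSketch(id, parallelism));
            }

            // Assumed rule: subtask i claims partition i of every result.
            for (int i = 0; i < parallelism; i++) {
                for (ResultSketch result : results) {
                    result.assign(i, i);
                }
            }

            // Sanity check: every partition of every result has a producer.
            for (ResultSketch result : results) {
                if (result.assignedPartitions != parallelism) {
                    throw new IllegalStateException("Partitions were not correctly assigned");
                }
            }
        }
    }

    public static void main(String[] args) {
        ExpandedVertex vertex = new ExpandedVertex(4, 1, Arrays.asList("r0", "r1"));
        System.out.println(vertex.parallelism + " subtasks, " + vertex.results.size() + " results");
    }
}
```

Nothing in this expansion touches the data itself: only lightweight descriptions of the work are multiplied, which is the sense in which the computation, rather than the data, is what moves.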

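With that, it should be clear how Flink moves computation: each JobVertex is expanded into parallel ExecutionVertex subtasks.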
