For distributed frameworks, a phrase we hear all the time is: move the computation, not the data. So how does Flink move computation? Let's take a close look at the ExecutionGraph.
Basic concepts
ExecutionJobVertex: represents one computation vertex of the JobGraph (i.e. one operator chain); each ExecutionJobVertex may fan out into many parallel ExecutionVertex instances (a minimal sketch of this hierarchy follows this list)
ExecutionVertex: represents one parallel subtask
Execution: represents a single attempt at executing an ExecutionVertex
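Before diving into the source, here is a minimal sketch of that one-to-many hierarchy. The Mini* classes below are simplified stand-ins invented for illustration, not Flink's actual types:

import java.util.ArrayList;
import java.util.List;

// one attempt at running a subtask (stands in for Execution)
class MiniExecution {
    final int attemptNumber;
    MiniExecution(int attemptNumber) { this.attemptNumber = attemptNumber; }
}

// one parallel subtask (stands in for ExecutionVertex)
class MiniExecutionVertex {
    final int subtaskIndex;
    MiniExecution currentAttempt;
    MiniExecutionVertex(int subtaskIndex) {
        this.subtaskIndex = subtaskIndex;
        this.currentAttempt = new MiniExecution(0);
    }
}

// one JobGraph vertex, i.e. an operator chain (stands in for ExecutionJobVertex)
class MiniExecutionJobVertex {
    final String name;
    final List<MiniExecutionVertex> taskVertices = new ArrayList<>();
    MiniExecutionJobVertex(String name, int parallelism) {
        this.name = name;
        for (int i = 0; i < parallelism; i++) {
            // fan out: one ExecutionVertex per parallel subtask
            taskVertices.add(new MiniExecutionVertex(i));
        }
    }
}

public class HierarchyDemo {
    public static void main(String[] args) {
        // an assumed "source -> flatMap" chain with parallelism 2 becomes 2 subtasks
        MiniExecutionJobVertex ejv = new MiniExecutionJobVertex("source -> flatMap", 2);
        System.out.println(ejv.name + " has " + ejv.taskVertices.size() + " ExecutionVertex instances");
    }
}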
Graph evolution
Source code
From 一文搞定 Flink Job 提交全流程 (the earlier post covering the full Flink job submission flow) we know that the ExecutionGraph is created at the same time as the JobMaster. Tracing the call chain leads to the ExecutionGraphBuilder.buildGraph method:
......
// topologically sort the job vertices and attach the graph to the existing one
// the sorted topology, e.g. source->flatMap, filter->sink
// each operator chain becomes one JobVertex; a single operator is treated as a special operator chain
List<JobVertex> sortedTopology = jobGraph.getVerticesSortedTopologicallyFromSources();

if (log.isDebugEnabled()) {
    log.debug("Adding {} vertices from job graph {} ({}).", sortedTopology.size(), jobName, jobId);
}

executionGraph.attachJobGraph(sortedTopology);
......
Stepping into attachJobGraph:
public void attachJobGraph(List<JobVertex> topologiallySorted) throws JobException {

    assertRunningInJobMasterMainThread();

    LOG.debug("Attaching {} topologically sorted vertices to existing job graph with {} " +
            "vertices and {} intermediate results.",
        topologiallySorted.size(),
        tasks.size(),
        intermediateResults.size());

    final ArrayList<ExecutionJobVertex> newExecJobVertices = new ArrayList<>(topologiallySorted.size());
    final long createTimestamp = System.currentTimeMillis();

    // iterate from the source operator chains onwards
    for (JobVertex jobVertex : topologiallySorted) {

        if (jobVertex.isInputVertex() && !jobVertex.isStoppable()) {
            this.isStoppable = false;
        }

        /*
         * This is where each node of the ExecutionGraph is built:
         * first a series of assignments hands the task information to the new graph node
         * and sets its parallelism, etc.;
         * then this node's IntermediateResults are created, one per produced data set;
         * finally the ExecutionVertex instances that will run the task are created
         * according to the configured parallelism.
         * If the job defines input splits, they are set up here as well.
         */
        // create the execution job vertex and attach it to the graph
        // parallelization starts here
        ExecutionJobVertex ejv = new ExecutionJobVertex(
            this,
            jobVertex,
            1,
            rpcTimeout,
            globalModVersion,
            createTimestamp);

        /*
         * Handle all the JobEdges:
         * for each edge, look up the corresponding IntermediateResult and record it
         * as an input of this node;
         * finally, associate each ExecutionVertex with its IntermediateResult partitions.
         */
        ejv.connectToPredecessors(this.intermediateResults);

        ExecutionJobVertex previousTask = this.tasks.putIfAbsent(jobVertex.getID(), ejv);
        if (previousTask != null) {
            throw new JobException(String.format("Encountered two job vertices with ID %s : previous=[%s] / new=[%s]",
                jobVertex.getID(), ejv, previousTask));
        }

        for (IntermediateResult res : ejv.getProducedDataSets()) {
            IntermediateResult previousDataSet = this.intermediateResults.putIfAbsent(res.getId(), res);
            if (previousDataSet != null) {
                throw new JobException(String.format("Encountered two intermediate data set with ID %s : previous=[%s] / new=[%s]",
                    res.getId(), res, previousDataSet));
            }
        }

        this.verticesInCreationOrder.add(ejv);
        this.numVerticesTotal += ejv.getParallelism();
        newExecJobVertices.add(ejv);
    }

    terminationFuture = new CompletableFuture<>();
    failoverStrategy.notifyNewVertices(newExecJobVertices);
}
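To make the bookkeeping concrete, here is a small, self-contained trace of what the loop accumulates. The two-chain topology and the parallelism of 2 are assumptions for illustration, and the map stands in for the real tasks map of ExecutionJobVertex instances:

import java.util.LinkedHashMap;
import java.util.Map;

public class AttachTrace {
    public static void main(String[] args) {
        // assumed sorted topology: two operator chains, each with parallelism 2
        Map<String, Integer> sortedTopology = new LinkedHashMap<>();
        sortedTopology.put("source -> flatMap", 2);
        sortedTopology.put("filter -> sink", 2);

        Map<String, Integer> tasks = new LinkedHashMap<>(); // stands in for tasks.putIfAbsent(...)
        int numVerticesTotal = 0;

        for (Map.Entry<String, Integer> jobVertex : sortedTopology.entrySet()) {
            tasks.put(jobVertex.getKey(), jobVertex.getValue());
            numVerticesTotal += jobVertex.getValue();       // numVerticesTotal += ejv.getParallelism()
        }

        // prints: 2 ExecutionJobVertex, numVerticesTotal=4
        System.out.println(tasks.size() + " ExecutionJobVertex, numVerticesTotal=" + numVerticesTotal);
    }
}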
The key method is new ExecutionJobVertex. Besides a series of basic assignments, it parallelizes the IntermediateResult and the ExecutionVertex.
Put plainly, it creates several near-identical IntermediateResult objects and ExecutionVertex objects, as follows:
// parallelization starts here
public ExecutionJobVertex(
        ExecutionGraph graph,
        JobVertex jobVertex,
        int defaultParallelism,
        Time timeout,
        long initialGlobalModVersion,
        long createTimestamp) throws JobException {

    if (graph == null || jobVertex == null) {
        throw new NullPointerException();
    }

    this.graph = graph;
    this.jobVertex = jobVertex;

    int vertexParallelism = jobVertex.getParallelism();
    // the effective parallelism
    int numTaskVertices = vertexParallelism > 0 ? vertexParallelism : defaultParallelism;

    final int configuredMaxParallelism = jobVertex.getMaxParallelism();

    this.maxParallelismConfigured = (VALUE_NOT_SET != configuredMaxParallelism);

    // if no max parallelism was configured by the user, we calculate and set a default
    setMaxParallelismInternal(maxParallelismConfigured ?
        configuredMaxParallelism : KeyGroupRangeAssignment.computeDefaultMaxParallelism(numTaskVertices));

    // verify that our parallelism is not higher than the maximum parallelism
    if (numTaskVertices > maxParallelism) {
        throw new JobException(
            String.format("Vertex %s's parallelism (%s) is higher than the max parallelism (%s). Please lower the parallelism or increase the max parallelism.",
                jobVertex.getName(),
                numTaskVertices,
                maxParallelism));
    }

    this.parallelism = numTaskVertices;
    this.taskVertices = new ExecutionVertex[numTaskVertices];
    this.operatorIDs = Collections.unmodifiableList(jobVertex.getOperatorIDs());
    this.userDefinedOperatorIds = Collections.unmodifiableList(jobVertex.getUserDefinedOperatorIDs());

    this.inputs = new ArrayList<>(jobVertex.getInputs().size());

    // take the sharing group
    this.slotSharingGroup = jobVertex.getSlotSharingGroup();
    this.coLocationGroup = jobVertex.getCoLocationGroup();

    // setup the coLocation group
    if (coLocationGroup != null && slotSharingGroup == null) {
        throw new JobException("Vertex uses a co-location constraint without using slot sharing");
    }

    // create the intermediate results
    this.producedDataSets = new IntermediateResult[jobVertex.getNumberOfProducedIntermediateDataSets()];

    // parallelize the IntermediateResults: each gets numTaskVertices partitions
    for (int i = 0; i < jobVertex.getProducedDataSets().size(); i++) {
        final IntermediateDataSet result = jobVertex.getProducedDataSets().get(i);

        this.producedDataSets[i] = new IntermediateResult(
            result.getId(),
            this,
            numTaskVertices,
            result.getResultType());
    }

    Configuration jobConfiguration = graph.getJobConfiguration();
    int maxPriorAttemptsHistoryLength = jobConfiguration != null ?
        jobConfiguration.getInteger(JobManagerOptions.MAX_ATTEMPTS_HISTORY_SIZE) :
        JobManagerOptions.MAX_ATTEMPTS_HISTORY_SIZE.defaultValue();

    // create all task vertices
    // "moving the computation": parallelize into ExecutionVertex instances
    for (int i = 0; i < numTaskVertices; i++) {
        ExecutionVertex vertex = new ExecutionVertex(
            this,
            i,
            producedDataSets,
            timeout,
            initialGlobalModVersion,
            createTimestamp,
            maxPriorAttemptsHistoryLength);

        this.taskVertices[i] = vertex;
    }

    // sanity check for the double referencing between intermediate result partitions and execution vertices
    for (IntermediateResult ir : this.producedDataSets) {
        if (ir.getNumberOfAssignedPartitions() != parallelism) {
            throw new RuntimeException("The intermediate result's partitions were not correctly assigned.");
        }
    }

    // set up the input splits, if the vertex has any
    try {
        @SuppressWarnings("unchecked")
        InputSplitSource<InputSplit> splitSource = (InputSplitSource<InputSplit>) jobVertex.getInputSplitSource();

        if (splitSource != null) {
            Thread currentThread = Thread.currentThread();
            ClassLoader oldContextClassLoader = currentThread.getContextClassLoader();
            currentThread.setContextClassLoader(graph.getUserClassLoader());
            try {
                inputSplits = splitSource.createInputSplits(numTaskVertices);

                if (inputSplits != null) {
                    splitAssigner = splitSource.getInputSplitAssigner(inputSplits);
                }
            } finally {
                currentThread.setContextClassLoader(oldContextClassLoader);
            }
        }
        else {
            inputSplits = null;
        }
    }
    catch (Throwable t) {
        throw new JobException("Creating the input splits caused an error: " + t.getMessage(), t);
    }
}
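To tie the numbers together, here is a minimal, self-contained sketch of the fan-out the constructor performs. The class and the parallelism values are assumptions for illustration, standing in for ExecutionJobVertex, IntermediateResult, and ExecutionVertex:

public class FanOutDemo {

    static int resolveParallelism(int vertexParallelism, int defaultParallelism) {
        // same rule as the constructor: an explicit setting wins, otherwise the default
        return vertexParallelism > 0 ? vertexParallelism : defaultParallelism;
    }

    public static void main(String[] args) {
        int numTaskVertices = resolveParallelism(3, 1);   // assumed: user set parallelism 3

        // one "IntermediateResult" per produced data set, each with one partition per subtask
        int producedDataSets = 1;                          // assumed: a single downstream result
        int[][] partitions = new int[producedDataSets][numTaskVertices];

        // one "ExecutionVertex" per subtask
        String[] taskVertices = new String[numTaskVertices];
        for (int i = 0; i < numTaskVertices; i++) {
            taskVertices[i] = "subtask-" + i;
        }

        // prints: parallelism=3, partitions per result=3, execution vertices=3
        System.out.println("parallelism=" + numTaskVertices
            + ", partitions per result=" + partitions[0].length
            + ", execution vertices=" + taskVertices.length);
    }
}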
That, in a nutshell, is how Flink "moves the computation": each JobVertex fans out into as many parallel ExecutionVertex subtasks as its parallelism dictates, and it is these subtasks that ultimately get scheduled and executed.