A Detailed Look at Argo, the Kubernetes-Native CI/CD Build Framework

{"type":"doc","content":[{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Author: FogDong (","attrs":{}},{"type":"link","attrs":{"href":"https://www.volcengine.cn?utm_source=infoQ&utm_medium=Media&utm_term=free&utm_campaign=20210204&utm_content=argo","title":""},"content":[{"type":"text","text":"ByteDance Volcano Engine","attrs":{}}]},{"type":"text","text":")","attrs":{}}]}],"attrs":{}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"What Is a Pipeline?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In computing, a pipeline is a technique that breaks a repeated process into several sub-processes so that each sub-process can run in parallel with the others. Because this way of working closely resembles a factory assembly line, it is known as pipelining. In essence, pipelining is a form of temporal parallelism.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Take the familiar process of building an image as an example. As the figure below shows, every image build first pulls the code from the repository, builds it, packages the result into an image, and pushes that image to an image registry. The process is the same after every code change. A pipeline tool makes it far more efficient: with a little configuration, the repetitive work is done automatically. This process is also known as CI.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/3f/3f07c068781c7feb39ad4f3749c6f7e7.webp","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The flow in the figure above uses Jenkins, a long-established and widely known pipeline framework. In the cloud-native era, Jenkins introduced Jenkins X as a new generation of Kubernetes-based pipelines, and two other major pipeline frameworks emerged: Argo and Tekton. This article covers Argo in detail.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Argo","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Argo Workflows is an open-source, container-native workflow engine for orchestrating parallel jobs on Kubernetes. Argo 
Workflows is implemented as a Kubernetes CRD.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Quick Start","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"kubectl create ns argo\nkubectl apply -n argo -f https://raw.githubusercontent.com/argoproj/argo/stable/manifests/quick-start-postgres.yaml","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Argo is built on Kubernetes and can be installed directly with kubectl. The installed components are mainly a set of CRDs, the corresponding controller, and a server.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Note that this installation only runs Workflows in the same namespace. For a cluster-wide install, see the documentation.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Three Levels of Definition","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To understand the CRDs that Argo defines, start with its three levels of definition. From the broadest concept to the narrowest, they are WorkflowTemplate, Workflow, and template. The names are similar, which can be a little confusing.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Template","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Start with the simplest one, the template. A template can be one of several types: container, script, dag, steps, resource, and suspend. A template can be thought of simply as a Pod: templates of type container, script, or resource each actually control one Pod, whereas templates of type dag or steps are composed of several templates of the basic types (container, script, resource).","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"container","attrs":{}},{"type":"text","text":": the most common template type. It uses the same schema as the Kubernetes container 
spec.","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"script","attrs":{}},{"type":"text","text":": built on the container type, it lets users define a script in the template. An additional source field holds the script body, while the image and command fields define the environment that runs it.","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"resource","attrs":{}},{"type":"text","text":": lets a template operate on Kubernetes resources. An action field specifies the operation, such as create, apply, or delete, and success and failure conditions can be set to decide whether the template succeeded or failed.","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"suspend","attrs":{}},{"type":"text","text":": a suspend template pauses execution either for a set duration or until it is resumed manually. Execution can be resumed from the CLI (with argo resume), the API, or the UI.","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"steps","attrs":{}},{"type":"text","text":": a steps template lets users define tasks as a series of steps. In steps, a double dash [- -] starts a new group that runs sequentially after the previous one, and a single dash [-] adds a step that runs in parallel with the others in its group.","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"dag","attrs":{}},{"type":"text","text":": a DAG template lets users define tasks as a directed acyclic graph with dependencies. In a DAG, the dependencies field lists the other tasks that must finish before a given task starts. Tasks with no dependencies run immediately. For the detailed DAG 
logic, see the ","attrs":{}},{"type":"link","attrs":{"href":"https://github.com/argoproj/argo/blob/master/workflow/controller/dag.go#L204","title":""},"content":[{"type":"text","text":"source code","attrs":{}}]},{"type":"text","text":".","attrs":{}}]}],"attrs":{}}],"attrs":{}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Workflow","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The spec of a Workflow has a field named templates, which must contain at least one template as the task that makes up the Workflow.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The simplest hello world example looks like this:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"apiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: hello-world-\n  labels:\n    workflows.argoproj.io/archive-strategy: \"false\"\nspec:\n  entrypoint: whalesay\n  templates:\n  - name: whalesay\n    container:\n      image: docker/whalesay:latest\n      command: [cowsay]\n      args: [\"hello world\"]","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this example, the templates field of the Workflow specifies one template of type container, which uses the whalesay image.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next, a slightly more complex workflow:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"apiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: steps-\nspec:\n  entrypoint: hello-hello-hello\n\n  # templates contains two templates: hello-hello-hello and whalesay\n  templates:\n  - name: hello-hello-hello\n    # Instead of just running a container\n    # This template 
has a sequence of steps\n    steps:  # this template is of type steps\n    - - name: hello1  # in steps, a double dash [- -] runs sequentially and a single dash [-] runs in parallel\n        template: whalesay  # references the template defined below\n        arguments:\n          parameters:\n          - name: message\n            value: \"hello1\"\n    - - name: hello2a  # double dash [- -] => runs after the previous step\n        template: whalesay\n        arguments:\n          parameters:\n          - name: message\n            value: \"hello2a\"\n      - name: hello2b  # single dash [-] => runs in parallel with the previous step\n        template: whalesay\n        arguments:\n          parameters:\n          - name: message\n            value: \"hello2b\"\n\n  # The second template\n  - name: whalesay\n    inputs:\n      parameters:\n      - name: message\n    container:\n      image: docker/whalesay\n      command: [cowsay]\n      args: [\"{{inputs.parameters.message}}\"]","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"WorkflowTemplate","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A WorkflowTemplate acts as a template library for Workflows. Like a Workflow, it is made up of templates. Once a WorkflowTemplate has been created, users can run a Workflow by submitting it directly.","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"apiVersion: argoproj.io/v1alpha1\nkind: WorkflowTemplate\nmetadata:\n  name: workflow-template-submittable\nspec:\n  entrypoint: whalesay-template\n  arguments:\n    parameters:\n    - name: message\n      value: hello world\n  templates:\n  - name: whalesay-template\n    inputs:\n      parameters:\n      - name: message\n    container:\n      image: docker/whalesay\n      command: [cowsay]\n      args: [\"{{inputs.parameters.message}}\"]","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Workflow 
Overview","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/62/62d5659cf5dfba26a96ca9b140a41edd.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With the three levels of definition covered, let us look more closely at the most important one, the Workflow. The Workflow is the most important resource in Argo and serves two important functions:","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"It defines the workflow to be executed.","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"It stores the state of the workflow.","attrs":{}}]}],"attrs":{}}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because of this dual role, a Workflow should be treated as a live object. It is not only a static definition but also an instance of that definition.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The definition of a WorkflowTemplate is almost identical to that of a Workflow, except for the kind. Precisely because a Workflow can be both a definition and an instance, a WorkflowTemplate is needed as a template for Workflows: once defined, a WorkflowTemplate can be submitted to create a Workflow.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A Workflow consists of an entrypoint and a set of templates. The entrypoint defines where execution of the workflow starts, and each template actually runs a Pod, in which the user-defined content appears as the Main Container. In addition, two Sidecars 
assist with the run.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Sidecar","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In Argo, these Sidecars all use the argoexec image. Argo uses this executor to carry out some of the flow control.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Init","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When a template uses an artifact from its inputs, or is of type script (a script template needs its script injected), Argo adds an Init Container to the pod. Its image is argoexec and its command is argoexec init.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The main job of this Init Container is to load artifacts:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"func loadArtifacts() error {\n wfExecutor := initExecutor()\n defer wfExecutor.HandleError()\n defer stats.LogStats()\n\n // Download input artifacts\n err := wfExecutor.StageFiles()\n if err != nil {\n wfExecutor.AddError(err)\n return err\n }\n err = wfExecutor.LoadArtifacts()\n if err != nil {\n wfExecutor.AddError(err)\n return err\n }\n return nil\n}","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Wait","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For every template except those of type resource, Argo injects a Wait Container, which waits for the Main Container to finish and then terminates all Sidecars. The Wait Container also uses the argoexec image, and its command is argoexec wait. (A resource template does not need one because it runs argoexec directly as its Main Container.)","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"func waitContainer() error {\n wfExecutor := initExecutor()\n defer 
wfExecutor.HandleError()\n defer stats.LogStats()\n stats.StartStatsTicker(5 * time.Minute)\n\n defer func() {\n // Killing sidecar containers\n err := wfExecutor.KillSidecars()\n if err != nil {\n log.Errorf(\"Failed to kill sidecars: %s\", err.Error())\n }\n }()\n\n // Wait for main container to complete\n waitErr := wfExecutor.Wait()\n if waitErr != nil {\n wfExecutor.AddError(waitErr)\n // do not return here so we can still try to kill sidecars & save outputs\n }\n\n // Capture output script result\n err := wfExecutor.CaptureScriptResult()\n if err != nil {\n wfExecutor.AddError(err)\n return err\n }\n // Capture output script exit code\n err = wfExecutor.CaptureScriptExitCode()\n if err != nil {\n wfExecutor.AddError(err)\n return err\n }\n // Saving logs\n logArt, err := wfExecutor.SaveLogs()\n if err != nil {\n wfExecutor.AddError(err)\n return err\n }\n // Saving output parameters\n err = wfExecutor.SaveParameters()\n if err != nil {\n wfExecutor.AddError(err)\n return err\n }\n // Saving output artifacts\n err = wfExecutor.SaveArtifacts()\n if err != nil {\n wfExecutor.AddError(err)\n return err\n }\n err = wfExecutor.AnnotateOutputs(logArt)\n if err != nil {\n wfExecutor.AddError(err)\n return err\n }\n\n // To prevent the workflow step from completing successfully, return the error occurred during wait.\n if waitErr != nil {\n return waitErr\n }\n\n return nil\n}","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Inputs and Outputs","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When running Workflows, a very common scenario is passing outputs along. Typically, the output of one step is used as the input of a later step.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In Argo, outputs can be passed as Artifacts or as 
Parameters.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Artifact","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To use Argo Artifacts, an artifact repository must first be configured. This can be done either by editing the default ConfigMap that holds the artifact repository settings or by specifying the repository explicitly in the Workflow. See the configuration documentation for details, which are not repeated here. The table below lists the repository types that Argo supports.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/46/4687b7ba687bfc5c74d5e830377eac6c.webp","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A simple example that uses an Artifact:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"apiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: artifact-passing-\nspec:\n  entrypoint: artifact-example\n  templates:\n  - name: artifact-example\n    steps:\n    - - name: generate-artifact\n        template: whalesay\n    - - name: consume-artifact\n        template: print-message\n        arguments:\n          artifacts:\n          # bind message to the hello-art artifact\n          # generated by the generate-artifact step\n          - name: message\n            from: \"{{steps.generate-artifact.outputs.artifacts.hello-art}}\"\n\n  - name: whalesay\n    container:\n      image: docker/whalesay:latest\n      command: [sh, -c]\n      args: [\"cowsay hello world | tee /tmp/hello_world.txt\"]\n    outputs:\n      artifacts:\n      # generate hello-art artifact from /tmp/hello_world.txt\n      # artifacts can be directories as well as files\n      - name: hello-art\n        path: /tmp/hello_world.txt\n\n  - name: print-message\n    inputs:\n      artifacts:\n      # unpack the message input artifact\n      # and put it at /tmp/message\n      - name: message\n        path: /tmp/message\n    container:\n      image: alpine:latest\n      command: [sh, -c]\n      args: [\"cat 
/tmp/message\"]","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"By default, Artifacts are packaged as gzipped tarballs. The archive field can be used to choose a different archive strategy.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the example above, the template named whalesay uses the cowsay command to generate a file named /tmp/hello_world.txt and then outputs that file as an Artifact named hello-art. The template named print-message accepts an input Artifact named message, unpacks it at the path /tmp/message, and prints the contents of /tmp/message with the cat command.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As mentioned in the Sidecar section, the Init Container is mainly used to pull Artifacts. These Sidecars are the key to passing outputs. Next, another way of passing outputs shows how this works in practice.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Scripts","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, a simple example:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"apiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: scripts-bash-\nspec:\n  entrypoint: bash-script-example\n  templates:\n  - name: bash-script-example\n    steps:\n    - - name: generate\n        template: gen-random-int-bash\n    - - name: print\n        template: print-message\n        arguments:\n          parameters:\n          - name: message\n            value: \"{{steps.generate.outputs.result}}\"  # The result of the here-script\n\n  - name: gen-random-int-bash\n    script:\n      image: debian:9.4\n      command: [bash]\n      source: |  # Contents of the here-script\n        cat /dev/urandom | od -N2 -An -i | awk -v f=1 -v r=100 '{printf \"%i\\n\", f + r * $1 / 65536}'\n\n  - name: gen-random-int-python\n    script:\n      image: python:alpine3.6\n      command: 
[python]\n      source: |\n        import random\n        i = random.randint(1, 100)\n        print(i)\n\n  - name: print-message\n    inputs:\n      parameters:\n      - name: message\n    container:\n      image: alpine:latest\n      command: [sh, -c]\n      args: [\"echo result was: {{inputs.parameters.message}}\"]","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The example above contains two templates of type script. A script template specifies the script body in its source field. Argo creates a temporary file containing the script body and passes the name of that file as the final argument to command, which should be the interpreter that executes the script. This makes it easy to run scripts in different languages (bash, Python, JavaScript, and so on).","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A script template assigns the script's standard output to a special output parameter named result so that other templates can use it. Here, {{steps.generate.outputs.result}} retrieves the script output of the step named generate.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"{{xxx}} is Argo's fixed format for variable substitution:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For the variable format, see the ","attrs":{}},{"type":"link","attrs":{"href":"https://github.com/argoproj/argo/blob/master/docs/variables.md","title":""},"content":[{"type":"text","text":"documentation","attrs":{}}]},{"type":"text","text":".","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For the substitution logic, see the 
","attrs":{}},{"type":"link","attrs":{"href":"https://github.com/argoproj/argo/blob/master/workflow/common/util.go#L305","title":""},"content":[{"type":"text","text":"source code","attrs":{}}]},{"type":"text","text":".","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So how is this script output obtained from inside the container?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The answer again lies in the Sidecar. The Wait Container contains the following logic:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"// CaptureScriptResult will add the stdout of a script template as output result\nfunc (we *WorkflowExecutor) CaptureScriptResult() error {\n \n ...\n\n log.Infof(\"Capturing script output\")\n mainContainerID, err := we.GetMainContainerID()\n if err != nil {\n return err\n }\n reader, err := we.RuntimeExecutor.GetOutputStream(mainContainerID, false)\n if err != nil {\n return err\n }\n defer func() { _ = reader.Close() }()\n bytes, err := ioutil.ReadAll(reader)\n if err != nil {\n return errors.InternalWrapError(err)\n }\n out := string(bytes)\n // Trims off a single newline for user convenience\n outputLen := len(out)\n if outputLen > 0 && out[outputLen-1] == '\\n' {\n out = out[0 : outputLen-1]\n }\n\n const maxAnnotationSize int = 256 * (1 << 10) // 256 kB\n // A character in a string is a byte\n if len(out) > maxAnnotationSize {\n log.Warnf(\"Output is larger than the maximum allowed size of 256 kB, only the last 256 kB were saved\")\n out = out[len(out)-maxAnnotationSize:]\n }\n\n we.Template.Outputs.Result = &out\n return nil\n}","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Now look at the volume mounts of this Wait 
Container:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"volumeMounts:\n  - mountPath: /argo/podmetadata\n    name: podmetadata\n  - mountPath: /var/run/docker.sock\n    name: docker-sock\n    readOnly: true\n  - mountPath: /argo/secret/my-minio-cred\n    name: my-minio-cred\n    readOnly: true\n  - mountPath: /var/run/secrets/kubernetes.io/serviceaccount\n    name: default-token-b5grl\n    readOnly: true","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The picture is now clear: by mounting docker.sock and the service account, the Wait Container reads the output of the Main Container and saves it to the Workflow. A side effect is that, because the Workflow stores so much information, a Workflow with many steps can grow into a very large object.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Parameter","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Parameters provide a general mechanism for using the result of a step as a parameter. They work much like script results, except that the value of the output parameter is set to the contents of a generated file rather than the contents of stdout. For example:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"  - name: whalesay\n    container:\n      image: docker/whalesay:latest\n      command: [sh, -c]\n      args: [\"echo -n hello world > /tmp/hello_world.txt\"]  # generate the content of hello_world.txt\n    outputs:\n      parameters:\n      - name: hello-param  # name of output parameter\n        valueFrom:\n          path: /tmp/hello_world.txt  # set the value of hello-param to the contents of this hello_world.txt","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Volume","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This is not a standard Argo mechanism for passing outputs, but shared storage clearly achieves the same result. With a Volume, there is no need to go through Inputs and 
Outputs.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A volume claim template can be defined in the Workflow spec:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"apiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: volumes-pvc-\nspec:\n  entrypoint: volumes-pvc-example\n  volumeClaimTemplates:  # define volume, same syntax as k8s Pod spec\n  - metadata:\n      name: workdir  # name of volume claim\n    spec:\n      accessModes: [ \"ReadWriteOnce\" ]\n      resources:\n        requests:\n          storage: 1Gi  # Gi => 1024 * 1024 * 1024","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The volume is then mounted in the templates that need it:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"  - name: whalesay\n    container:\n      image: docker/whalesay:latest\n      command: [sh, -c]\n      args: [\"echo generating message in volume; cowsay hello world | tee /mnt/vol/hello_world.txt\"]\n      # Mount workdir volume at /mnt/vol before invoking docker/whalesay\n      volumeMounts:  # same syntax as k8s Pod spec\n      - name: workdir\n        mountPath: /mnt/vol","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Other Flow Control Features","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Loops","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When writing Workflows, it is often very useful to iterate over a set of inputs, as in the following example:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"  templates:\n  - name: loop-example\n    steps:\n    - - name: print-message\n        template: whalesay\n        arguments:\n          parameters:\n          - name: message\n            value: 
\"{{item}}\"\n        withItems:  # invoke whalesay once for each item in parallel\n        - hello world  # item 1\n        - goodbye world  # item 2","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the source code, the controller checks for withItems and, if it is present, expands the step once for each element.","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"// expandStepGroup looks at each step in a collection of parallel steps, and expands all steps using withItems/withParam\nfunc (woc *wfOperationCtx) expandStepGroup(sgNodeName string, stepGroup []wfv1.WorkflowStep, stepsCtx *stepsContext) ([]wfv1.WorkflowStep, error) {\n newStepGroup := make([]wfv1.WorkflowStep, 0)\n for _, step := range stepGroup {\n if !step.ShouldExpand() {\n newStepGroup = append(newStepGroup, step)\n continue\n }\n expandedStep, err := woc.expandStep(step)\n if err != nil {\n return nil, err\n }\n if len(expandedStep) == 0 {\n // Empty list\n childNodeName := fmt.Sprintf(\"%s.%s\", sgNodeName, step.Name)\n if woc.wf.GetNodeByName(childNodeName) == nil {\n stepTemplateScope := stepsCtx.tmplCtx.GetTemplateScope()\n skipReason := \"Skipped, empty params\"\n woc.log.Infof(\"Skipping %s: %s\", childNodeName, skipReason)\n woc.initializeNode(childNodeName, wfv1.NodeTypeSkipped, stepTemplateScope, &step, stepsCtx.boundaryID, wfv1.NodeSkipped, skipReason)\n woc.addChildNode(sgNodeName, childNodeName)\n }\n }\n newStepGroup = append(newStepGroup, expandedStep...)\n }\n return newStepGroup, nil\n}","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Conditionals","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Conditions are specified with the when keyword:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"  templates:\n  - name: coinflip\n    steps:\n    # flip a coin\n    - - name: flip-coin\n        template: flip-coin\n    # evaluate the result in parallel\n    - 
- name: heads\n        template: heads  # call heads template if \"heads\"\n        when: \"{{steps.flip-coin.outputs.result}} == heads\"\n      - name: tails\n        template: tails  # call tails template if \"tails\"\n        when: \"{{steps.flip-coin.outputs.result}} == tails\"","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Retrying on Errors","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"  templates:\n  - name: retry-backoff\n    retryStrategy:\n      limit: 10\n      retryPolicy: \"Always\"\n      backoff:\n        duration: \"1\"  # Must be a string. Default unit is seconds. Could also be a Duration, e.g.: \"2m\", \"6h\", \"1d\"\n        factor: 2\n        maxDuration: \"1m\"  # Must be a string. Default unit is seconds. Could also be a Duration, e.g.: \"2m\", \"6h\", \"1d\"","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Recursion","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Templates can call one another recursively, which is a very practical feature. In a machine learning scenario, for example, training can continue until the accuracy reaches a required value. In the coin-flip example below, the workflow keeps flipping the coin and ends only when it comes up heads.","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"apiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: coinflip-recursive-\nspec:\n  entrypoint: coinflip\n  templates:\n  - name: coinflip\n    steps:\n    # flip a coin\n    - - name: flip-coin\n        template: flip-coin\n    # evaluate the result in parallel\n    - - name: heads\n        template: heads  # call heads template if \"heads\"\n        when: \"{{steps.flip-coin.outputs.result}} == heads\"\n      - name: tails  # keep flipping coins if \"tails\"\n        template: coinflip\n        when: \"{{steps.flip-coin.outputs.result}} == tails\"\n\n  - name: flip-coin\n    script:\n      image: python:alpine3.6\n      command: [python]\n      source: |\n        import random\n        result = \"heads\" if random.randint(0,1) == 0 else \"tails\"\n        print(result)\n\n  - name: heads\n    container:\n      image: alpine:3.6\n      command: [sh, -c]\n      args: 
[\"echo \\\"it was heads\\\"\"]","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Below are the results of two runs. The first run came up heads immediately and ended the workflow; the second came up tails three times before landing on heads and ending the workflow.","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"argo get coinflip-recursive-tzcb5\n\nSTEP PODNAME MESSAGE\n ✔ coinflip-recursive-vhph5\n ├───✔ flip-coin coinflip-recursive-vhph5-2123890397\n └─┬─✔ heads coinflip-recursive-vhph5-128690560\n └─○ tails\n\nSTEP PODNAME MESSAGE\n ✔ coinflip-recursive-tzcb5\n ├───✔ flip-coin coinflip-recursive-tzcb5-322836820\n └─┬─○ heads\n └─✔ tails\n ├───✔ flip-coin coinflip-recursive-tzcb5-1863890320\n └─┬─○ heads\n └─✔ tails\n ├───✔ flip-coin coinflip-recursive-tzcb5-1768147140\n └─┬─○ heads\n └─✔ tails\n ├───✔ flip-coin coinflip-recursive-tzcb5-4080411136\n └─┬─✔ heads coinflip-recursive-tzcb5-4080323273\n └─○ tails","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Exit Handlers","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"An exit handler is a template that is designated to run when the workflow ends, whether it succeeds or fails.","attrs":{}}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"spec:\n  entrypoint: intentional-fail\n  onExit: exit-handler  # invoke exit-handler template at end of the workflow\n  templates:\n  ...","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Comparison with Tekton","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Compared with Tekton, Argo offers richer flow control, with features such as loops and recursion that suit some machine learning scenarios very well. The Argo community also positions the project for MLOps, AIOps, and data/batch processing, which is why Kubeflow Pipelines is built on Argo (although KFP is also working on a Tekton 
backend).","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In access control, however, Argo is not as strong as Tekton, and in my personal view Tekton's structure is more clearly defined. Each has its strengths and weaknesses, so choose according to your own needs.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"References","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}}],"text":"Argo Roadmap: ","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"underline","attrs":{}}],"text":"https://github.com/argoproj/argo/blob/master/docs/roadmap.md","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}}],"text":"Argo Examples: ","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"underline","attrs":{}}],"text":"https://argoproj.github.io/argo/examples/#welcome","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}}],"text":"Argo Source Code: ","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"underline","attrs":{}}],"text":"https://github.com/argoproj/argo","attrs":{}}]}],"attrs":{}}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
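
The ordering rule for steps templates described in the Template section (each outer list entry is a group that runs after the previous group, and the entries inside one group run in parallel) can be illustrated with a small Go sketch. This is only a toy model under stated assumptions: `runSteps` and its `run` callback are hypothetical names invented here, not part of Argo, and the real controller schedules Pods rather than goroutines.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// runSteps is a hypothetical helper (not an Argo API). It runs the groups one
// after another; the steps inside a single group run concurrently.
// It returns, for each group, the sorted names of the steps that finished.
func runSteps(groups [][]string, run func(name string)) [][]string {
	finished := make([][]string, 0, len(groups))
	for _, group := range groups {
		var (
			wg   sync.WaitGroup
			mu   sync.Mutex
			done []string
		)
		for _, name := range group {
			wg.Add(1)
			go func(n string) {
				defer wg.Done()
				run(n)
				mu.Lock()
				done = append(done, n)
				mu.Unlock()
			}(name)
		}
		// The next group starts only after every step in this group completes.
		wg.Wait()
		sort.Strings(done)
		finished = append(finished, done)
	}
	return finished
}

func main() {
	// Mirrors the steps example: hello1 first, then hello2a and hello2b together.
	groups := [][]string{{"hello1"}, {"hello2a", "hello2b"}}
	fmt.Println(runSteps(groups, func(name string) {}))
	// prints [[hello1] [hello2a hello2b]]
}
```

The nested slice mirrors the nested YAML list: the outer index fixes the order, while the order of completion inside a group is not guaranteed, which is why the sketch sorts the names before reporting them.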