Oozie fork and sub-workflow

When using Oozie, you can run two actions in parallel with a fork;

to call one workflow from inside another workflow, you can use a sub-workflow.

This post shows how to use fork and sub-workflow together in Oozie.
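Before diving into the full example, the overall shape of the pattern can be sketched as follows (a minimal sketch only; the node names and app paths are placeholders, but the elements come from the Oozie workflow schema):

```xml
<workflow-app xmlns="uri:oozie:workflow:0.4" name="fork-subwf-sketch">
    <start to="split"/>
    <!-- fork: both paths below start concurrently -->
    <fork name="split">
        <path start="child-a"/>
        <path start="child-b"/>
    </fork>
    <action name="child-a">
        <sub-workflow>
            <app-path>${nameNode}/path/to/wf-a</app-path>
            <propagate-configuration/>
        </sub-workflow>
        <ok to="merge"/>
        <error to="fail"/>
    </action>
    <action name="child-b">
        <sub-workflow>
            <app-path>${nameNode}/path/to/wf-b</app-path>
            <propagate-configuration/>
        </sub-workflow>
        <ok to="merge"/>
        <error to="fail"/>
    </action>
    <!-- join: waits for every forked path to finish -->
    <join name="merge" to="end"/>
    <kill name="fail">
        <message>${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
```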

The file structure is shown below:

(screenshot: file layout of the subwf_fork application)

fork:

p1.sh

#!/bin/bash
date > date.log
/home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -put -f date.log /user/hadoop/

workflow.xml

<!--
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements.  See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership.  The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<workflow-app
	xmlns="uri:oozie:workflow:0.4" name="shell-wf">
	<start to="shell-node"/>
	<action name="shell-node">
		<shell
			xmlns="uri:oozie:shell-action:0.2">
			<job-tracker>${jobTracker}</job-tracker>
			<name-node>${nameNode}</name-node>
			<configuration>
				<property>
					<name>mapred.job.queue.name</name>
					<value>${queueName}</value>
				</property>
			</configuration>
			<exec>${EXEC}</exec>
			<!-- <argument>my_output=Hello Oozie</argument> -->
			<file>/user/hadoop/oozie-apps/subwf_fork/fork/${EXEC}#${EXEC}</file>
			<capture-output/>
		</shell>
		<ok to="fork"/>
		<error to="fail"/>
	</action>
	<fork name="fork">
		<path start="fork1"/>
		<path start="fork2"/>
	</fork>
	<action name="fork1">
		<sub-workflow>
			<app-path>/user/hadoop/oozie-apps/subwf_fork/fork1/workflow.xml</app-path>
			<propagate-configuration/>
		</sub-workflow>
		<ok to="joining"/>
		<error to="fail"/>
	</action>
	<action name="fork2">
		<sub-workflow>
			<app-path>/user/hadoop/oozie-apps/subwf_fork/fork2/workflow.xml</app-path>
			<propagate-configuration/>
		</sub-workflow>
		<ok to="joining"/>
		<error to="fail"/>
	</action>
	<join name="joining" to="end"/>
	<decision name="check-output">
		<switch>
			<case to="end">
            ${wf:actionData('shell-node')['my_output'] eq 'Hello Oozie'}
        </case>
			<default to="fail-output"/>
		</switch>
	</decision>
	<kill name="fail">
		<message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
	</kill>
	<kill name="fail-output">
		<message>Incorrect output, expected [Hello Oozie] but was [${wf:actionData('shell-node')['my_output']}]</message>
	</kill>
	<end name="end"/>
</workflow-app>
  • fork: defines a set of parallel execution paths
  • path: the node each parallel path starts from
  • sub-workflow: invokes another workflow application
  • propagate-configuration: passes the parent workflow's configuration (e.g. the job.properties values) down to the sub-workflow
  • join ("joining" above): merges the forked paths; the workflow only continues once every path has finished

For details, see the official documentation: https://oozie.apache.org/docs/4.2.0/WorkflowFunctionalSpec.html#a3.2.6_Sub-workflow_Action
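A side note on the commented-out `<argument>` and the `check-output` decision above: with `<capture-output/>` enabled, Oozie parses the shell action's stdout as key=value pairs, which is what `wf:actionData` reads. A minimal sketch of a script that would satisfy that decision (hypothetical; the actual p1.sh files here do not emit this):

```shell
#!/bin/bash
# With <capture-output/> enabled, Oozie parses key=value lines from the
# script's stdout into the action's data map, readable afterwards via
# ${wf:actionData('shell-node')['my_output']}.
echo "my_output=Hello Oozie"
```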

fork1:

p1.sh

#!/bin/bash
/home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -cat /user/hadoop/date.log

workflow.xml

<!--
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements.  See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership.  The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<workflow-app
	xmlns="uri:oozie:workflow:0.4" name="shell-wf">
	<start to="shell-node"/>
	<action name="shell-node">
		<shell
			xmlns="uri:oozie:shell-action:0.2">
			<job-tracker>${jobTracker}</job-tracker>
			<name-node>${nameNode}</name-node>
			<configuration>
				<property>
					<name>mapred.job.queue.name</name>
					<value>${queueName}</value>
				</property>
			</configuration>
			<exec>${EXEC}</exec>
			<!-- <argument>my_output=Hello Oozie</argument> -->
			<file>/user/hadoop/oozie-apps/subwf_fork/fork1/${EXEC}#${EXEC}</file>
			<capture-output/>
		</shell>
		<ok to="end"/>
		<error to="fail"/>
	</action>
	<decision name="check-output">
		<switch>
			<case to="end">
            ${wf:actionData('shell-node')['my_output'] eq 'Hello Oozie'}
        </case>
			<default to="fail-output"/>
		</switch>
	</decision>
	<kill name="fail">
		<message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
	</kill>
	<kill name="fail-output">
		<message>Incorrect output, expected [Hello Oozie] but was [${wf:actionData('shell-node')['my_output']}]</message>
	</kill>
	<end name="end"/>
</workflow-app>

fork2:

p1.sh

#!/bin/bash
echo 'Successfully!!' | /home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -put -f - /user/hadoop/oozie.log

job.properties (at the top level of subwf_fork):

#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# HDFS NameNode address
nameNode=hdfs://hadoop01:8020
# YARN ResourceManager address
jobTracker=hadoop03:8032
# queue name
queueName=default
examplesRoot=oozie-apps
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/subwf_fork/fork/workflow.xml
EXEC=p1.sh

Upload the application to the Linux host, sync it to HDFS, and submit the job:

hdfs dfs -put /home/hadoop/cdh/oozie-4.0.0-cdh5.3.6/oozie-apps/subwf_fork /user/hadoop/oozie-apps/
oozie job -oozie http://hadoop03:11000/oozie -config /home/hadoop/cdh/oozie-4.0.0-cdh5.3.6/oozie-apps/subwf_fork/job.properties -run
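On success, the Oozie client prints a line of the form `job: <job-id>`. A small sketch of grabbing that id for later status checks (the id value below is a made-up placeholder):

```shell
# The submit command prints "job: <job-id>"; strip the prefix so the id
# can be passed to later CLI calls such as:
#   oozie job -oozie http://hadoop03:11000/oozie -info <job-id>
submit_output="job: 0000005-000000000000000-oozie-hado-W"  # placeholder
job_id="${submit_output#job: }"
echo "$job_id"
```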

Open the web UI at http://hadoop03:11000/oozie/ ; you can see the workflow from the fork directory running.

Click the refresh button and you can see the two sub-workflows starting to run as well.

Click Done Jobs and you can see that the job completed successfully.

Verify the results: date.log should contain the timestamp written by fork's p1.sh, and oozie.log the 'Successfully!!' line written by fork2:

hdfs dfs -cat /user/hadoop/date.log
hdfs dfs -cat /user/hadoop/oozie.log
