Original answer: http://stackoverflow.com/questions/20527098/how-to-access-local-directory-with-script-executed-in-oozie
Q:
I am running CDH 4.5 in VirtualBox on my machine. Inside the VM I have a local file, /home/cloudera/logs/abc.log, and I want to do a very simple thing: I scheduled a simple script in Oozie to copy abc.log to HDFS (/user/cloudera/logs/).
The scheduled job ran, but stderr shows:
cannot access /home/cloudera/logs/abc.log: No such file or directory
Is there any way to make this work? I want to schedule a script with Oozie that copies files from the local filesystem to HDFS as a daily batch job. Thanks!
A:
An Oozie shell action is executed on an arbitrary Hadoop node, i.e. not necessarily on the machine where the Oozie server is running.
To implement an action that is executed locally, you could use the SSH action (http://oozie.apache.org/docs/3.3.2/DG_SshActionExtension.html) with localhost. See, e.g., https://github.com/airawat/OozieSamples/tree/master/oozieProject/workflowSshAction for a nice complete example.
Alternatively, you can start a shell action and execute a script that will SSH to the correct machine.
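The alternative in the last sentence (a shell action whose script SSHes to the machine holding the file) might look roughly like the sketch below. The host, user, and the DRY_RUN switch are assumptions for illustration, not part of the original answer. It also assumes that key-based SSH login works for whichever user runs the action and that the hadoop CLI is on the remote PATH.

```shell
#!/bin/sh
# Sketch only: script for an Oozie shell action that hops to the host
# holding the local file and uploads it to HDFS from there.
set -eu

# Assumed values; override through the environment as needed.
SRC_HOST="${SRC_HOST:-localhost}"
SRC_USER="${SRC_USER:-cloudera}"
SRC_FILE="${SRC_FILE:-/home/cloudera/logs/abc.log}"
DEST_DIR="${DEST_DIR:-/user/cloudera/logs/}"
# Hypothetical safety switch: 1 prints the command, 0 really runs it.
DRY_RUN="${DRY_RUN:-1}"

# Build the command that runs on the remote host.
# Paths are wrapped in single quotes, so they must not contain one.
remote_cmd() {
  printf "hadoop fs -put '%s' '%s'" "$SRC_FILE" "$DEST_DIR"
}

copy_via_ssh() {
  if [ "$DRY_RUN" = "1" ]; then
    printf 'ssh -o BatchMode=yes %s@%s "%s"\n' "$SRC_USER" "$SRC_HOST" "$(remote_cmd)"
  else
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    ssh -o BatchMode=yes "${SRC_USER}@${SRC_HOST}" "$(remote_cmd)"
  fi
}

copy_via_ssh
```

With the defaults the script only prints the ssh command it would run; setting DRY_RUN=0 performs the copy.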
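In short: when Oozie executes a workflow, the action runs on an arbitrary Hadoop node rather than on one particular local machine. If the bash commands need to access local files, there are two options:
The first is the one described in the answer above: use Oozie's SSH action to specify the local machine on which the bash script runs.
The second is to load the local machine's profile manually: adding [. /etc/profile] to the .sh file loads the bash environment variables, which can also achieve the goal.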
PS: When CDH's Oozie executes an action, it runs as the yarn user.