Original answer: http://stackoverflow.com/questions/20527098/how-to-access-local-directory-with-script-executed-in-oozie
Q:
I am running CDH 4.5 in VirtualBox on my machine. Inside the VM, I have a local file, /home/cloudera/logs/abc.log, and I want to do a very simple thing: I scheduled a simple script in Oozie to copy abc.log to HDFS (/user/cloudera/logs/).
The scheduled job was executed, but stderr shows:
cannot access /home/cloudera/logs/abc.log: No such file or directory
Is there any way to make this work? I want to schedule a script with Oozie to copy files from the local filesystem to HDFS as a daily batch job. Thanks!
A:
An Oozie shell action is executed on a random Hadoop node, i.e. not necessarily on the machine where the Oozie server is running.
To implement an action that is executed locally, you could use the SSH action (http://oozie.apache.org/docs/3.3.2/DG_SshActionExtension.html) with localhost. See e.g. https://github.com/airawat/OozieSamples/tree/master/oozieProject/workflowSshAction for a nice, complete example.
Alternatively, you can start a shell action and execute a script that will SSH to the correct machine.
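The alternative of having the shell action itself SSH to the right machine could look roughly like the sketch below. It is only an illustration under assumptions that are not in the original answer: key-based (passwordless) SSH login must already work for the user the action runs as, the hadoop command must be on the remote machine's PATH, and the function names build_put_command and copy_via_ssh are made up for this example.

```shell
#!/bin/bash
# Sketch of a script for an Oozie shell action that hops to the machine
# holding the local file and runs the HDFS upload there.
# Assumptions (not from the original post): key-based SSH login works for
# the user the action runs as, and "hadoop" is on the remote PATH.

# Build the command that will run on the remote machine.
# Paths are wrapped in single quotes, so they must not contain single quotes.
build_put_command() {
  printf "hadoop fs -put '%s' '%s'" "$1" "$2"
}

# Run the upload on the target machine. BatchMode makes ssh fail instead of
# prompting for a password, which would hang a non-interactive job.
copy_via_ssh() {
  ssh -o BatchMode=yes "$1" "$(build_put_command "$2" "$3")"
}

# Expected arguments: <user@host> <local path on that host> <HDFS destination>
if [ "$#" -eq 3 ]; then
  copy_via_ssh "$1" "$2" "$3"
fi
```

Saved under a hypothetical name such as copy_log.sh, it would be called with three arguments, for example: copy_log.sh cloudera@logs-host /home/cloudera/logs/abc.log /user/cloudera/logs/ (logs-host is a placeholder for the machine that actually holds the file). With any other number of arguments the script does nothing.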
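In short, when Oozie runs a workflow, the action executes on a random Hadoop node rather than on one particular local machine. If a bash command needs to access local files, there are two options:
The first is the one described in the answer above: use Oozie's SSH action to specify the local machine on which the bash script runs.
The second is to load the local machine's configuration manually: adding [. /etc/profile] to the .sh file loads the bash environment variables, which can also achieve this.
PS: On CDH, Oozie runs the action as the yarn user.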
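As a rough illustration of the second option, a script that sources /etc/profile before copying might look like the sketch below. The helper names load_profile and describe_context are invented for this example, and it assumes the file to copy exists on whichever node ends up running the script. Printing the user and host is an extra debugging aid, not part of the original tip; it makes it easier to confirm that the action runs as the yarn user and on which node.

```shell
#!/bin/bash
# Sketch of a shell-action script that loads the login environment first.
# Assumption: the file to copy exists on whichever node runs this script.

# Source a profile file (default: /etc/profile) into the current shell.
load_profile() {
  profile="${1:-/etc/profile}"
  if [ -r "$profile" ]; then
    . "$profile"
  else
    echo "cannot read $profile" >&2
    return 1
  fi
}

# Report where and as whom the script runs, to help debug
# "No such file or directory" errors.
describe_context() {
  echo "running as $(id -un) on $(uname -n)"
}

# Expected arguments: <local path> <HDFS destination>
if [ "$#" -eq 2 ]; then
  load_profile && describe_context && hadoop fs -put "$1" "$2"
fi
```

Called with two arguments, for example /home/cloudera/logs/abc.log and /user/cloudera/logs/, it loads the environment, prints the user and host to stdout (which ends up in the action's log), and then runs the upload. With any other number of arguments it does nothing.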