In Hadoop streaming, when running a map/reduce job, we may want to read some runtime parameters to learn the status of the job. Many of these parameters, both the job configuration and the runtime values, can be obtained from os.environ in Python, e.g., the name of the input split file or the job ID of the mapred task.
os.environ is the dictionary that stores the environment variables. Hadoop
passes the parameters to each map/reduce task by setting
environment variables on each host.
In the map or reduce step, we can read these values with os.environ.get().
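As a minimal sketch of the lookup described above: Hadoop streaming exports job configuration keys as environment variables with the dots replaced by underscores, so mapred.job.id is visible as mapred_job_id. The exact key names depend on the Hadoop version (older releases use names such as map_input_file and mapred_job_id, newer ones mapreduce_map_input_file and mapreduce_job_id), so the helper below tries both. The helper names get_job_param and task_context are hypothetical, introduced only for this illustration.

```python
import os
import sys


def get_job_param(names, env=None, default=None):
    """Return the value of the first variable in `names` that is set.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to try outside of a Hadoop task.
    """
    env = os.environ if env is None else env
    for name in names:
        value = env.get(name)
        if value is not None:
            return value
    return default


def task_context(env=None):
    """Collect a few commonly used job parameters.

    The variable names are assumptions that depend on the Hadoop
    version: the newer name is tried first, then the older one.
    """
    return {
        "input_file": get_job_param(
            ["mapreduce_map_input_file", "map_input_file"], env),
        "job_id": get_job_param(
            ["mapreduce_job_id", "mapred_job_id"], env),
    }


if __name__ == "__main__":
    # Write to stderr so the diagnostic does not mix with mapper output.
    sys.stderr.write("task context: %r\n" % (task_context(),))
```

Outside of a Hadoop task these variables are normally unset, so both entries come back as None unless a default is supplied.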
You can also pass configuration parameters explicitly to your script on its command line, e.g., -mapper "python map.py -i 1"
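On the script side, the explicit argument can be read with the standard argparse module. This is only a sketch of what map.py might contain: the post does not say what -i means, so the integer type, the default of 0, and the function name parse_args are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the options given after the script name in the -mapper string.

    With argv=None, argparse reads sys.argv[1:], which is what happens
    when Hadoop streaming launches the mapper command.
    """
    parser = argparse.ArgumentParser(description="Hadoop streaming mapper")
    # Assumption: -i carries an integer setting; its meaning is up to the script.
    parser.add_argument("-i", type=int, default=0,
                        help="hypothetical integer parameter")
    return parser.parse_args(argv)
```

In the mapper, calling parse_args() with no arguments and reading the .i attribute of the result gives 1 for the command shown above.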
Parameter passing in Hadoop streaming