Python's subprocess: calling child programs (running other commands), and the current-path problem in child scripts

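A running Python process can spawn child processes, and a child process can run other commands: shell, Python, Java, C programs, and so on.

There are several ways to spawn a child process.

os module

See: http://blog.csdn.net/longshenlmj/article/details/8331526

The improved alternative is the subprocess module. It overlaps with part of os, and can be thought of as a dedicated, optimized module for this one job.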

python subprocess

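It is used to call child programs while your program runs, interacting with them through stdout, stdin, and stderr.

stdout: where the child's output goes, such as a file or the screen
stdin: the child's input, such as a file or a file object
stderr: error output

Two common usages (using a shell command as the example):

1. subprocess.Popen('script/shell', shell=True)   # non-blocking, runs in parallel
2. subprocess.call('script/shell', shell=True)   # waits for the child to finish before continuing

The difference is that the former does not block and runs in parallel with the main program, while the latter waits until the command has finished. To make the former block, add wait():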

p = subprocess.Popen('script/shell', shell=True)
a = p.wait()  # blocks until the child exits and returns its exit status
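The blocking difference can be made visible by timing both calls. The following is a minimal sketch, not the article's own code: it is written for Python 3 (the article's snippets are Python 2), and it assumes a short sleeping child started through `sys.executable` as a stand-in for 'script/shell'. The helper names `timed_popen` and `timed_call` are made up for illustration.

```python
import subprocess
import sys
import time

# Assumption: a child that sleeps half a second stands in for "script/shell".
CHILD = [sys.executable, "-c", "import time; time.sleep(0.5)"]


def timed_popen():
    """Start the child without waiting; return (seconds until Popen returned, exit status)."""
    start = time.monotonic()
    p = subprocess.Popen(CHILD)
    launched_after = time.monotonic() - start  # Popen returns as soon as the child is started
    status = p.wait()                          # now block until the child exits
    return launched_after, status


def timed_call():
    """Run the child with call(); return (seconds until call returned, exit status)."""
    start = time.monotonic()
    status = subprocess.call(CHILD)            # returns only after the child has exited
    return time.monotonic() - start, status


popen_launch, popen_status = timed_popen()
call_elapsed, call_status = timed_call()
print("Popen returned after %.3f s" % popen_launch)
print("call returned after %.3f s" % call_elapsed)
```

On a typical machine the first figure is a few milliseconds, while the second includes the child's full run time.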
A concrete code example:

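        hadoop_cmd = "hadoop fs -ls %s" % (hive_tb_path)
        p = subprocess.Popen(hadoop_cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        ret = p.wait()  # wait() blocks until the child finishes and returns its exit status: 0 on success, non-zero on failure
        print(ret)  # 0 if the command succeeded, non-zero otherwise
        # Note: with PIPE and a lot of output, wait() can deadlock (see the deadlock section); communicate() avoids this
        # The command's output is read with
        print(p.stdout.read())
        # and its error output with
        print(p.stderr.read())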
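The same pattern (run a command, check its status, look at stdout and stderr) can also be written with communicate(). This is a hedged Python 3 sketch rather than the article's code: hadoop is replaced by small Python one-liners so that it runs anywhere, and `run_and_capture` is a hypothetical helper name.

```python
import subprocess
import sys


def run_and_capture(args):
    """Run args, returning (exit status, stdout bytes, stderr bytes)."""
    p = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = p.communicate()  # reads both pipes to EOF, then waits for the child
    return p.returncode, out, err


# Assumption: these one-liners stand in for a succeeding and a failing command.
ok = run_and_capture([sys.executable, "-c", "print('listing')"])
bad = run_and_capture(
    [sys.executable, "-c", "import sys; sys.stderr.write('no such path'); sys.exit(2)"]
)

print(ok)
print(bad)
```

The failing command exits with status 2 here, which shows that a failure is signalled by any non-zero status and not only by 1.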

Code examples for calling a child process:

Approach 1

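import subprocess
p = subprocess.Popen('./test/dirtest.py', stdout=subprocess.PIPE, shell=True)
print(p.stdout.readlines())  # reads all of the child's output as a list of lines
out, err = p.communicate()  # out is empty here because readlines() already consumed the output
print(out)
print(err)  # None, because stderr was not redirected to a pipe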

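This is a one-shot interaction: input goes in through stdin, the child runs to completion, and the result comes back through stdout. communicate() closes the pipes after that single exchange. If you need several exchanges with the child process, communicate() cannot be used; communicate step by step instead, as follows: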

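    p = subprocess.Popen(["ls", "-l"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, shell=False)
    # Input (only meaningful if the child reads stdin; ls does not, so substitute an interactive program)
    p.stdin.write('your command')
    p.stdin.flush()
    # Read the output
    p.stdout.readline()
    p.stdout.read()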
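A runnable version of the step-by-step exchange needs a child that actually reads stdin. The sketch below is written for Python 3 and makes one assumption that is not in the article: the child is a tiny Python loop that echoes each line back in upper case. Each write is followed by flush(), and each reply is read with readline().

```python
import subprocess
import sys

# Assumption: an echo-style child that answers every input line immediately.
CHILD_CODE = (
    "import sys\n"
    "while True:\n"
    "    line = sys.stdin.readline()\n"
    "    if not line:\n"
    "        break\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)

p = subprocess.Popen(
    [sys.executable, "-c", CHILD_CODE],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)

replies = []
for message in (b"first\n", b"second\n"):
    p.stdin.write(message)               # send one request
    p.stdin.flush()                      # make sure it reaches the child now
    replies.append(p.stdout.readline())  # read exactly one reply line

p.stdin.close()   # EOF tells the child to leave its loop
status = p.wait()
p.stdout.close()
print(replies, status)
```

This only works because the child flushes after every line; a child that buffers its output would leave the parent blocked in readline().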

Approach 2

    ret = subprocess.call('ping -c 1 %s' % ip, shell=True, stdout=open('/dev/null', 'w'), stderr=subprocess.STDOUT)
    if ret == 0:
        print('%s is alive!' % ip)
    else:  # ping exits non-zero (not only 1) when the host does not answer or an error occurs
        print('%s is down...' % ip)

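The meaning of the shell parameter

    Both call() and Popen() take a shell parameter. It defaults to False and can be set to True.
    The shell parameter specifies whether the program is run through a shell. If shell is True, the command is run through /bin/sh, and it is recommended to pass a single string (rather than a sequence) as args. If shell is False, pass a list with the program and each argument as separate elements; on Unix a plain string is then treated as the name of the program to run, so it cannot carry arguments. For example,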
    subprocess.Popen("cat test.txt", shell=True)
is equivalent to
    subprocess.Popen(["/bin/sh", "-c", "cat test.txt"])
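More specifically:
    On Linux with shell=False, Popen calls os.execvp() to run the program given by args.
    On Windows, Popen calls CreateProcess() to run the external program given by args. args may be a string or a sequence; a sequence is converted to a string automatically by list2cmdline(). Note, however, that not every program on MS Windows parses its command line in the way list2cmdline() assumes.
    So on Windows,
        subprocess.Popen("notepad.exe test.txt", shell=True)
        is equivalent to
        subprocess.Popen("cmd.exe /C " + "notepad.exe test.txt")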

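Problems shell=True can cause

 Passing shell=True together with untrusted input can be a security hazard.
Warning: executing shell commands that incorporate input from an untrusted source makes a program vulnerable to shell injection, a serious security flaw that can result in arbitrary command execution. For this reason, using shell=True is strongly discouraged when the command string is built from external input: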
    >>> from subprocess import call
    >>> filename = input("What file would you like to display?\n")
    What file would you like to display?
    non_existent; rm -rf / #
    >>> call("cat " + filename, shell=True) # Uh-oh. This will end badly...

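shell=False disables all shell-based features, so it is not affected by this vulnerability; see the notes in the Popen constructor documentation for helpful hints on getting shell=False to work.
When using shell=True, pipes.quote() (shlex.quote() in Python 3) can be used to properly escape whitespace and shell metacharacters in strings that will be used to build shell commands.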
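Both defenses can be sketched briefly. The example is written for Python 3, where shlex.quote() plays the role of pipes.quote(). The hostile string is made up for illustration, and a Python one-liner that echoes its first argument stands in for cat.

```python
import shlex
import subprocess
import sys

# Assumption: a made-up hostile "filename" typed by a user.
hostile = "non_existent; echo INJECTED"

# Defense 1: shell=False with a list. The whole string arrives as ONE argument,
# so the ';' is never interpreted by a shell.
p = subprocess.Popen(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile],
    stdout=subprocess.PIPE,
)
out, _ = p.communicate()
received = out.decode().strip()

# Defense 2: if a shell command string is unavoidable, quote each untrusted piece.
quoted = shlex.quote(hostile)
command = "cat " + quoted  # safe to pass with shell=True on a POSIX shell

print(received)
print(command)
```

shlex.quote() targets POSIX shells; it is not a substitute for proper quoting under cmd.exe.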

Some sites that cover subprocess in more detail:

http://python.usyiyi.cn/python_278/library/subprocess.html (English: https://docs.python.org/2/library/subprocess.html)
http://ipseek.blog.51cto.com/1041109/807513
https://blog.linuxeye.com/375.html
http://blog.csdn.net/imzoer/article/details/8678029

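The current-path problem in child scripts

Whether you call a child program with os or with subprocess, you run into the question of the current path: when the child script's code asks for the current path, does it get the main program's path or the child's?
There are two main ways to get a script's path in Python:
    1) os.path.dirname(os.path.abspath("__file__"))
    2) sys.path[0]
See http://blog.csdn.net/longshenlmj/article/details/25148935.
    The first returns the main program's path. Because "__file__" is quoted, it is an ordinary string that abspath() resolves against the current working directory, and the child inherits that directory from the main program.
    Only the second returns the path of the child script.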

Code example:

The main script callpy.py is in /home/wizad/lmj,
and the child script it calls, dirtest.py, is in /home/wizad/lmj/test.

[wizad@srv26 lmj]$ cat callpy.py

import subprocess
p = subprocess.Popen('python ./test/dirtest.py',stdout=open('dirtest.txt','w'),shell=True)

[wizad@srv26 test]$ cat dirtest.py

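import os
import sys
file_path = os.path.dirname(os.path.abspath("__file__"))  # quoted string: resolved against the current working directory
print(file_path + "11111")
cur_path = sys.path[0]  # directory containing the script being run
print(cur_path + "22222")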

Running python callpy.py and then cat dirtest.txt shows:

/home/wizad/lmj11111
/home/wizad/lmj/test22222

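The output goes to the file dirtest.txt. It shows that method 1 gives the main program's path and method 2 gives the child's path.
stdout can also be set to PIPE, in which case the output can be read and printed directly,
for example:
1)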

p = subprocess.Popen('python ./test/dirtest.py', stdout=subprocess.PIPE, shell=True)
out, err = p.communicate()
print(out)
print(err)  # None, because stderr was not redirected to a pipe

Output: [wizad@srv26 lmj]$ python callpy.py

/home/wizad/lmj11111
/home/wizad/lmj/test22222

None

2)

p = subprocess.Popen('python ./test/dirtest.py', stdout=subprocess.PIPE, shell=True)
print(p.stdout.readlines())
out, err = p.communicate()  # out is empty because readlines() already consumed the output
print(out)
print(err)

Output:

['/home/wizad/lmj11111\n', '/home/wizad/lmj/test22222\n']

None

Both approaches read the result and print it straight to the screen.
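The experiment can be reproduced in a self-contained way. The sketch below is written for Python 3 and rebuilds the layout in a temporary directory, which is an assumption standing in for /home/wizad/lmj. It also prints a third value, the unquoted `__file__`, for comparison; that variant is not part of the original test.

```python
import os
import shutil
import subprocess
import sys
import tempfile

# Child script: prints the quoted variant, the unquoted variant, and sys.path[0].
CHILD_SOURCE = (
    "import os, sys\n"
    "print(os.path.dirname(os.path.abspath('__file__')))\n"  # literal string, resolved against cwd
    "print(os.path.dirname(os.path.abspath(__file__)))\n"    # the script's real location
    "print(sys.path[0])\n"                                   # directory of the script being run
)

base = os.path.realpath(tempfile.mkdtemp())  # stands in for /home/wizad/lmj
child_dir = os.path.join(base, "test")
os.mkdir(child_dir)
with open(os.path.join(child_dir, "dirtest.py"), "w") as f:
    f.write(CHILD_SOURCE)

# Launch the child from the parent's directory, as callpy.py does.
p = subprocess.Popen(
    [sys.executable, os.path.join("test", "dirtest.py")],
    cwd=base,
    stdout=subprocess.PIPE,
)
out, _ = p.communicate()
quoted_dir, unquoted_dir, path0 = [line.strip() for line in out.decode().splitlines()]
shutil.rmtree(base)

print(quoted_dir)    # the directory the child was launched from
print(unquoted_dir)  # the directory holding dirtest.py
print(path0)         # the directory holding dirtest.py
```

The quoted form follows the working directory, so it changes if the parent passes a different cwd, while the other two follow the script file itself.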

Other notes on the subprocess module, quoted from other sources:

subprocess.Popen(
      args, 
      bufsize=0, 
      executable=None,
      stdin=None,
      stdout=None, 
      stderr=None, 
      preexec_fn=None, 
      close_fds=False, 
      shell=False, 
      cwd=None, 
      env=None, 
      universal_newlines=False, 
      startupinfo=None, 
      creationflags=0)

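1) args can be a string or a sequence (such as a list or tuple) and specifies the executable and its arguments. If it is a sequence, the first element is usually the path of the executable. The executable parameter can also be used to specify the path of the executable explicitly.
2) bufsize: specifies buffering. 0 means unbuffered, 1 means line buffered, any other positive value is the buffer size, and a negative value means the system default (fully buffered).
3) stdin, stdout, and stderr are the program's standard input, output, and error handles. Each can be PIPE, a file descriptor, or a file object, or None, meaning it is inherited from the parent process.
4) preexec_fn is only valid on Unix. It specifies a callable object that is called in the child process just before the child program is executed.
5) close_fds: on Windows, if close_fds is set to True, the newly created child process does not inherit the parent's input, output, and error pipes. close_fds cannot be set to True while also redirecting the child's standard input, output, or error (stdin, stdout, stderr).
6) shell: if set to True, the program is executed through the shell.
7) cwd sets the child's current directory.
8) env is a dictionary that specifies the child's environment variables. If env is None, the child inherits the parent's environment.
9) universal_newlines: line endings differ between operating systems; Windows uses '\r\n' and Linux uses '\n'. If this parameter is set to True, Python treats all of these line endings as '\n'.
10) startupinfo and creationflags are only valid on Windows. They are passed to the underlying CreateProcess() function to set properties of the child such as the main window's appearance and the process priority.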
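A few of these parameters are easiest to understand by watching their effect on the child. The following Python 3 sketch exercises cwd, env, and universal_newlines together; the temporary directory and the GREETING variable are assumptions made up for illustration.

```python
import os
import subprocess
import sys
import tempfile

workdir = os.path.realpath(tempfile.mkdtemp())  # hypothetical working directory
child_env = dict(os.environ, GREETING="hello")  # parent's environment plus one made-up variable

p = subprocess.Popen(
    [sys.executable, "-c", "import os; print(os.getcwd()); print(os.environ['GREETING'])"],
    cwd=workdir,              # the child starts in this directory
    env=child_env,            # the child sees exactly this environment
    stdout=subprocess.PIPE,
    universal_newlines=True,  # output arrives as text with '\n' line endings
)
out, _ = p.communicate()
child_cwd, greeting = out.splitlines()
os.rmdir(workdir)

print(child_cwd)
print(greeting)
```

Copying os.environ before adding a variable keeps PATH and similar settings available to the child; passing a dictionary with only GREETING would strip them.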

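Popen methods
1) Popen.poll(): checks whether the child has finished. Sets and returns the returncode attribute.
2) Popen.wait(): waits for the child to finish. Sets and returns the returncode attribute.
3) Popen.communicate(input=None): interacts with the child by sending data to stdin and reading data from stdout and stderr. The optional input argument is the data to send to the child. communicate() returns a tuple (stdoutdata, stderrdata). Note: to send data to the child's stdin, the Popen object must be created with stdin set to PIPE. Likewise, to get data from stdout and stderr, they must be set to PIPE.
4) Popen.send_signal(signal): sends a signal to the child.
5) Popen.terminate(): stops the child. On Windows, this calls the Windows API function TerminateProcess() to end the child.
6) Popen.kill(): kills the child.
7) Popen.stdin: if stdin was set to PIPE when the Popen object was created, Popen.stdin is a file object used to send input to the child. Otherwise it is None.
8) Popen.stdout: if stdout was set to PIPE when the Popen object was created, Popen.stdout is a file object used to read the child's output. Otherwise it is None.
9) Popen.stderr: if stderr was set to PIPE when the Popen object was created, Popen.stderr is a file object used to read the child's error output. Otherwise it is None.
10) Popen.pid: the child's process ID.
11) Popen.returncode: the child's return code. It is None if the child has not finished yet.
12) subprocess.call(*popenargs, **kwargs): runs a command. The function waits until the child has finished and returns its returncode. The example near the start of this article demonstrates call(). Use it when the child does not need any interaction.
13) subprocess.check_call(*popenargs, **kwargs): same as subprocess.call(*popenargs, **kwargs), except that a non-zero returncode raises CalledProcessError. The exception object carries the child's returncode.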
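Several of these members can be seen working together in a short sketch. It is written for Python 3 and assumes, for illustration only, a child that sleeps for 30 seconds and another that exits with status 3.

```python
import subprocess
import sys

# A long-running child (assumption: sleeping stands in for real work).
p = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
still_running = p.poll() is None  # poll() returns None while the child is alive

p.terminate()                     # ask the child to stop
p.wait()                          # reap it; this sets returncode
finished = p.poll() is not None   # poll() now reports the stored returncode

# check_call() raises when the child exits with a non-zero status.
try:
    subprocess.check_call([sys.executable, "-c", "import sys; sys.exit(3)"])
    failed_code = None
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode

print(still_running, finished, failed_code)
```

The value of returncode after terminate() is platform dependent, so the sketch only checks that it is no longer None.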

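Deadlock

When pipes are used and their output is not consumed, a child that writes a large amount of data to the stdout or stderr pipe can fill the operating system's pipe buffer, which then cannot accept any more data. The child blocks waiting for the parent to read the pipe, and if the parent is sitting in wait() at that moment, the result is a deadlock.
Calls that can cause a deadlock when combined with PIPE:
    subprocess.call()
    subprocess.check_call()
    subprocess.check_output()
    Popen.wait()
    As you can see, whenever the child communicates through pipes and the parent has to wait for it to finish, a deadlock is possible. For example: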

    p=subprocess.Popen("longprint", shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)  
    p.wait()

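longprint is a hypothetical process that produces a large amount of output. In my environment (Windows XP, Python 2.5), the deadlock occurred once the output reached 4096.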

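Avoiding pipe-induced deadlocks in subprocess

1) Use Popen() together with communicate(). communicate() does not merely wait: it keeps reading the pipes while the child runs, so the buffer never fills up.
2) If you drain the output with p.stdout.readline() (or p.communicate()), no deadlock occurs however much output there is.
3) Alternatively, avoid pipes altogether: do no redirection at all, or redirect to a file.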
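The first two remedies can be sketched as follows, in Python 3. The "longprint" child here is an assumption: a Python one-liner that writes about 1 MB, far more than a typical pipe buffer holds.

```python
import subprocess
import sys

SIZE = 1024 * 1024
# Assumption: this one-liner stands in for the hypothetical "longprint" program.
LONGPRINT = [sys.executable, "-c", "import sys; sys.stdout.write('x' * %d)" % SIZE]

# Remedy 1: communicate() drains the pipe while the child runs, then waits.
p1 = subprocess.Popen(LONGPRINT, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
out, _ = p1.communicate()

# Remedy 2: read the pipe yourself until EOF, and only then call wait().
p2 = subprocess.Popen(LONGPRINT, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
total = 0
while True:
    chunk = p2.stdout.read(65536)
    if not chunk:  # empty read means the child closed its end
        break
    total += len(chunk)
p2.stdout.close()
p2.wait()

print(len(out), total)
```

Swapping the order in the second variant, calling wait() before the read loop, would recreate the deadlock described above once the pipe buffer fills.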
