This all started when the code below puzzled me: it plainly says while True, so why doesn't the program loop forever?
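# Python 2 code (urllib2 and Queue were renamed in Python 3)
import os
import threading
import urllib2
import Queue

class D(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        # No exit condition: the worker blocks in get() until a URL arrives
        while True:
            url = self.queue.get()
            self.download_file(url)
            self.queue.task_done()

    def download_file(self, url):
        h = urllib2.urlopen(url)
        filename = os.path.basename(url) + '.html'
        with open(filename, 'wb') as f:
            # Copy the response to disk in 1 KB chunks
            while True:
                c = h.read(1024)
                if not c:
                    break
                f.write(c)

if __name__ == "__main__":
    urls = ['http://www.baidu.com', 'http://www.sina.com']
    queue = Queue.Queue()
    for i in range(5):
        t = D(queue)
        t.setDaemon(True)
        t.start()
    for u in urls:
        queue.put(u)
    # Returns once task_done() has been called for every queued URL
    queue.join()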
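Until now I had simply assumed that setDaemon just marks a thread as a background thread, and never dug into what that really means.
But setDaemon is exactly the key here. In the low-level thread module, as soon as the main thread ends, all other threads end too. That makes sense: when the main thread ends, Python tears down the runtime environment, so the child threads are bound to be terminated.
setDaemon in the threading module exists to deal with this. With setDaemon(True), the behaviour is the same as before: when the main thread ends, all child threads end. With setDaemon(False), the main thread waits for that thread to finish, which is equivalent to calling the thread's join method.
So if you comment out the setDaemon call above or change it to False, the workers never leave their while True loop (they sit blocked in queue.get()), and the program never exits.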
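The difference can be observed from outside by launching a child interpreter and checking whether it exits on its own. This is only a minimal sketch under several assumptions: it targets Python 3 (where the daemon attribute replaces the deprecated setDaemon), and the names CHILD_TEMPLATE and exits_on_its_own, as well as the timeout values, are invented for illustration.

```python
import subprocess
import sys

# Source of a tiny child program: one thread that never returns.
# {daemon} is filled in with True or False before the child is started.
CHILD_TEMPLATE = """
import threading, time

def spin():
    while True:
        time.sleep(0.05)

t = threading.Thread(target=spin)
t.daemon = {daemon}
t.start()
"""


def exits_on_its_own(daemon, timeout=10.0):
    """Return True if the child interpreter exits within `timeout` seconds."""
    code = CHILD_TEMPLATE.format(daemon=daemon)
    try:
        subprocess.run([sys.executable, "-c", code], timeout=timeout, check=True)
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left behind
        return False
```

With daemon=True the child should exit as soon as its main thread runs off the end of the script; with daemon=False it should still be running when the timeout expires.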
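That said, I would not recommend the approach above. It has something of a thread pool about it, but if you have read a few Python thread pool implementations, you will find that the while True loop always contains an explicit exit check, because in the Python world explicit is more pythonic than implicit. Unfortunately, the code above comes from the book <<編寫高質量代碼:改善Python程序的91個建議>>. I am not bashing the book, but I do think this particular example is debatable.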
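For comparison, here is a rough sketch of a worker loop with an explicit exit check. It is not taken from the book or from any particular thread pool library. It assumes Python 3 (queue instead of Queue), and the _STOP sentinel, the worker and run_pool functions, and the doubling "work" that stands in for a download are all hypothetical.

```python
import queue
import threading

_STOP = object()  # hypothetical sentinel meaning "no more work"


def worker(tasks, results):
    while True:
        item = tasks.get()
        if item is _STOP:
            # Explicit exit: leave the loop instead of relying on daemon=True
            tasks.task_done()
            break
        results.append(item * 2)  # stand-in for the real download
        tasks.task_done()


def run_pool(items, num_workers=3):
    tasks = queue.Queue()
    results = []
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(_STOP)  # one sentinel per worker
    for t in threads:
        t.join()  # every worker has returned, so non-daemon threads are fine
    return sorted(results)
```

Because each worker returns once it sees the sentinel, the threads can stay non-daemon and the interpreter still exits cleanly.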
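You may be wondering how setDaemon(False) ends up being equivalent to joining the thread. No rush, let me walk you through it.
To solve this problem, the threading module introduces the _MainThread object: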
# Special thread class to represent the main thread
# This is garbage collected through an exit handler

class _MainThread(Thread):

    def __init__(self):
        Thread.__init__(self, name="MainThread")
        self._Thread__started.set()
        self._set_ident()
        with _active_limbo_lock:
            _active[_get_ident()] = self

    def _set_daemon(self):
        return False

    def _exitfunc(self):
        self._Thread__stop()
        t = _pickSomeNonDaemonThread()
        if t:
            if __debug__:
                self._note("%s: waiting for other threads", self)
        while t:
            t.join()
            t = _pickSomeNonDaemonThread()
        if __debug__:
            self._note("%s: exiting", self)
        self._Thread__delete()

def _pickSomeNonDaemonThread():
    for t in enumerate():
        if not t.daemon and t.is_alive():
            return t
    return None

# Create the main thread object,
# and make it available for the interpreter
# (Py_Main) as threading._shutdown.

_shutdown = _MainThread()._exitfunc
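In fact, _MainThread does not do much. Its only contribution is that an instance is created when the threading module is imported, and its _exitfunc is assigned to _shutdown. _exitfunc picks up every thread that is non-daemon and still alive, one at a time, and calls its join method. So it was _MainThread quietly doing the work behind the scenes all along. The remaining question is: who calls _shutdown?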
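Before answering that, the join loop in _exitfunc can be approximated in ordinary user code. The sketch below is an illustration, not the library's implementation. It assumes Python 3, it takes an explicit list of threads rather than calling threading.enumerate() so that it never tries to join the calling thread, and the function names are invented.

```python
import threading
import time


def pick_some_non_daemon_thread(threads):
    """Return one thread that is alive and not a daemon, or None."""
    for t in threads:
        if not t.daemon and t.is_alive():
            return t
    return None


def wait_for_non_daemon_threads(threads):
    """Join live non-daemon threads one at a time; return their names."""
    waited = []
    t = pick_some_non_daemon_thread(threads)
    while t:
        t.join()
        waited.append(t.name)
        t = pick_some_non_daemon_thread(threads)
    return waited


def demo():
    release = threading.Event()
    worker = threading.Thread(target=time.sleep, args=(0.1,), name="worker")
    background = threading.Thread(target=release.wait, name="background",
                                  daemon=True)
    worker.start()
    background.start()
    waited = wait_for_non_daemon_threads([worker, background])
    still_running = background.is_alive()  # the daemon thread was never joined
    release.set()  # let the daemon thread finish so the demo cleans up
    background.join()
    return waited, still_running
```

In demo(), only the non-daemon worker is waited for, and the daemon thread is still running when the wait returns, which mirrors why daemon workers do not keep the interpreter alive.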
It has to be called before Python tears down the runtime, so open pythonrun.c and you will find the following function:
/* Wait until threading._shutdown completes, provided
   the threading module was imported in the first place.
   The shutdown routine will wait until all non-daemon
   "threading" threads have completed. */
static void
wait_for_thread_shutdown(void)
{
#ifdef WITH_THREAD
    PyObject *result;
    PyThreadState *tstate = PyThreadState_GET();
    PyObject *threading = PyMapping_GetItemString(tstate->interp->modules,
                                                  "threading");
    if (threading == NULL) {
        /* threading not imported */
        PyErr_Clear();
        return;
    }
    result = PyObject_CallMethod(threading, "_shutdown", "");
    if (result == NULL)
        PyErr_WriteUnraisable(threading);
    else
        Py_DECREF(result);
    Py_DECREF(threading);
#endif
}
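So this is the one pulling the strings. I learned something new: it turns out that C code sometimes needs to call Python code too. There is no way around it, since the threading module is written in pure Python!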