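I. Python has automatic garbage collection (when an object's reference count drops to zero, the interpreter frees its memory automatically). Memory leaks usually come from a leaking extension library or from reference cycles (a third cause is objects that are never removed from a global container).
The former needs no discussion; an example of the latter follows (the memory of Obj('B') and Obj('c') is not reclaimed: they reference each other, and in Python 2 a cycle whose objects define __del__() is not freed by the collector). (It seems the interpreter does eventually reclaim memory held in reference cycles on its own, through its mark-and-sweep style cycle collector, so it is only a matter of time. In other words, we don't need to spend effort deliberately avoiding reference cycles when coding, as long as the objects in the cycle have no __del__() method. I'll read up on the details over the next couple of days (http://stackoverflow.com/questions/4484167/details-how-python-garbage-collection-works ; not having finished the garbage collection chapter of the Python source code analysis book (源碼剖析) is really nagging at me) ---2013.10.20)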
[dongsong@localhost python_study]$ cat leak_test2.py
#encoding=utf-8
class Obj:
    def __init__(self,name='A'):
        self.name = name
        print '%s inited' % self.name
    def __del__(self):
        print '%s deleted' % self.name
if __name__ == '__main__':
    a = Obj('A')
    b = Obj('B')
    c = Obj('c')
    c.attrObj = b
    b.attrObj = c
[dongsong@localhost python_study]$ vpython leak_test2.py
A inited
B inited
c inited
A deleted
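To check the note above that plain reference cycles do get reclaimed, here is a minimal sketch. It assumes a hypothetical class `Node` that deliberately has no `__del__()`, and it uses a weak reference to watch whether the cycle is still alive. It is only an illustration of the cycle collector, not part of the original test script.

```python
import gc
import weakref

class Node(object):
    """Hypothetical class for illustration; deliberately has no __del__()."""
    pass

def make_cycle():
    # Build two objects that reference each other, then drop our own names.
    a = Node()
    b = Node()
    a.other = b
    b.other = a
    return weakref.ref(a)  # a weak reference does not keep `a` alive

gc.disable()                        # keep automatic collection out of the way
ref = make_cycle()
alive_before = ref() is not None    # reference counting alone cannot free a cycle
gc.collect()                        # run the cycle collector by hand
alive_after = ref() is not None
gc.enable()

print("alive before collect: %s" % alive_before)
print("alive after collect: %s" % alive_after)
```

The cycle survives until the collector runs and is gone afterwards, which matches the "only a matter of time" remark.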
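II. The objgraph module
This module can find the fastest-growing object types and the most numerous object types. It can draw the graph of everything a given object refers to and the graph of everything that refers to a given object, and it can fetch an object by its address.
Still, hunting for a memory leak with it feels a bit like looking for a needle in a haystack: you have to pick out the suspicious objects yourself from the logs of the fastest-growing and most numerous types (usually these are common types such as list/dict/tuple, which are hard to trace; if the top entry is an unusual user-defined class, the cause is much easier to pin down).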
1.show_refs() show_backrefs() show_most_common_types() show_growth()
[dongsong@localhost python_study]$ !cat
cat objgraph1.py
#encoding=utf-8
import objgraph
if __name__ == '__main__':
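    x = []
    y = [x, [x], dict(x=x)]
    objgraph.show_refs([y], filename='/tmp/sample-graph.png') # draw everything that the objects in [y] refer to
    objgraph.show_backrefs([x], filename='/tmp/sample-backref-graph.png') # draw everything that refers to x
    #objgraph.show_most_common_types() # counts for the most common object types; too much data to be of much use
    objgraph.show_growth(limit=4) # print the types whose counts grew since program start or since the last show_growth() call (sorted by increase)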
[dongsong@localhost python_study]$ !vpython
vpython objgraph1.py
Graph written to /tmp/tmpuSFr9A.dot (5 nodes)
Image generated as /tmp/sample-graph.png
Graph written to /tmp/tmpAn6niV.dot (7 nodes)
Image generated as /tmp/sample-backref-graph.png
tuple 3393 +3393
wrapper_descriptor 945 +945
function 830 +830
builtin_function_or_method 622 +622
[Figure: sample-graph.png (objects referenced by [y])]
[Figure: sample-backref-graph.png (objects that refer to x)]
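The idea behind the growth report can be approximated with the standard library alone. The sketch below is not objgraph's implementation. It just counts gc-tracked objects by type name before and after some work and prints the biggest increases. The names `type_counts`, `growth`, `Leaky` and `_cache` are made up for the illustration.

```python
import gc
from collections import Counter

def type_counts():
    """Count the objects currently tracked by the collector, by type name."""
    return Counter(type(o).__name__ for o in gc.get_objects())

def growth(before, after, limit=3):
    """Return (type name, current count, increase) rows, largest increase first."""
    rows = [(name, after[name], after[name] - before[name])
            for name in after if after[name] > before[name]]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]

class Leaky(object):
    """Hypothetical class standing in for whatever is leaking."""
    pass

_cache = []                                   # a global container that is never cleared

before = type_counts()
_cache.extend(Leaky() for _ in range(1000))   # simulate the leak
after = type_counts()

rows = growth(before, after)
for name, count, increase in rows:
    print("%-12s %6d %+6d" % (name, count, increase))
```

A user-defined class at the top of such a list is the easy case mentioned above. When the top entries are list, dict or tuple, the counts alone say little about who is holding them.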
2.show_chain()
[dongsong@localhost python_study]$ cat objgraph2.py
#encoding=utf-8
import objgraph, inspect, random
class MyBigFatObject(object):
    pass
def computate_something(_cache = {}):
    _cache[42] = dict(foo=MyBigFatObject(),bar=MyBigFatObject())
    x = MyBigFatObject()
if __name__ == '__main__':
    objgraph.show_growth(limit=3)
    computate_something()
    objgraph.show_growth(limit=3)
    objgraph.show_chain(
        objgraph.find_backref_chain(random.choice(objgraph.by_type('MyBigFatObject')),
                                    inspect.ismodule),
        filename = '/tmp/chain.png')
    #roots = objgraph.get_leaking_objects()
    #print 'len(roots)=%d' % len(roots)
    #objgraph.show_most_common_types(objects = roots)
    #objgraph.show_refs(roots[:3], refcounts=True, filename='/tmp/roots.png')
[dongsong@localhost python_study]$ !vpython
vpython objgraph2.py
tuple 3400 +3400
wrapper_descriptor 945 +945
function 831 +831
wrapper_descriptor 956 +11
tuple 3406 +6
member_descriptor 165 +4
Graph written to /tmp/tmpklkHqC.dot (7 nodes)
Image generated as /tmp/chain.png
[Figure: chain.png (back-reference chain from a MyBigFatObject instance up to a module)]
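III. The gc module
This module can identify the objects that the garbage collector finds unreachable and the ones it cannot free (uncollectable), which is something objgraph does not offer.
gc.collect() forces a collection and returns the number of unreachable objects it found.
gc.garbage is the list of uncollectable objects among the unreachable ones (objects that have a __del__() finalizer and are caught in a reference cycle). If DEBUG_SAVEALL is set, then all unreachable objects will be added to this list rather than freed.
Warning: if you turn automatic collection off with gc.disable() and never call gc.collect() yourself, every reference cycle the program creates stays in memory, and you will watch memory get eaten up fast.... (Reference counting still frees objects that are not part of a cycle, as "c deleted" in the output below shows.)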
[dongsong@bogon python_study]$ cat gc_test.py
#encoding=utf-8
import gc
class MyObj:
    def __init__(self, name):
        self.name = name
        print "%s inited" % self.name
    def __del__(self):
        print "%s deleted" % self.name
if __name__ == '__main__':
    gc.disable()
    gc.set_debug(gc.DEBUG_COLLECTABLE | gc.DEBUG_UNCOLLECTABLE | gc.DEBUG_INSTANCES | gc.DEBUG_OBJECTS | gc.DEBUG_SAVEALL)
    a = MyObj('a')
    b = MyObj('b')
    c = MyObj('c')
    a.attr = b
    b.attr = a
    a = None
    b = None
    c = None
    if gc.isenabled():
        print 'automatic collection is enabled'
    else:
        print 'automatic collection is disabled'
    rt = gc.collect()
    print "%d unreachable" % rt
    garbages = gc.garbage
    print "\n%d garbages:" % len(garbages)
    for garbage in garbages:
        if isinstance(garbage, MyObj):
            print "obj-->%s name-->%s attrrMyObj-->%s" % (garbage, garbage.name, garbage.attr)
        else:
            print str(garbage)
[dongsong@bogon python_study]$ vpython gc_test.py
a inited
b inited
c inited
c deleted
automatic collection is disabled
gc: uncollectable <MyObj instance at 0x7f3ebd455b48>
gc: uncollectable <MyObj instance at 0x7f3ebd455b90>
gc: uncollectable <dict 0x261c4b0>
gc: uncollectable <dict 0x261bdf0>
4 unreachable
4 garbages:
obj--><__main__.MyObj instance at 0x7f3ebd455b48> name-->a attrrMyObj--><__main__.MyObj instance at 0x7f3ebd455b90>
obj--><__main__.MyObj instance at 0x7f3ebd455b90> name-->b attrrMyObj--><__main__.MyObj instance at 0x7f3ebd455b48>
{'name': 'a', 'attr': <__main__.MyObj instance at 0x7f3ebd455b90>}
{'name': 'b', 'attr': <__main__.MyObj instance at 0x7f3ebd455b48>}
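The run above is Python 2 (print statements, gc.DEBUG_INSTANCES, gc.DEBUG_OBJECTS). For anyone trying this on Python 3, here is a pared-down sketch that relies only on gc.DEBUG_SAVEALL, which still exists there. Note that from Python 3.4 on, cycles whose objects define __del__() are collectable, so gc.garbage normally stays empty unless DEBUG_SAVEALL is set. The class below mirrors MyObj from the test script but drops __del__() to keep the output quiet.

```python
import gc

class MyObj(object):
    def __init__(self, name):
        self.name = name

gc.disable()
gc.collect()                      # flush any garbage that existed before the test
gc.set_debug(gc.DEBUG_SAVEALL)    # keep unreachable objects in gc.garbage instead of freeing them

a = MyObj('a')
b = MyObj('b')
a.attr = b
b.attr = a                        # a <-> b now form a reference cycle
a = None
b = None

found = gc.collect()              # number of unreachable objects found
names = sorted(o.name for o in gc.garbage if isinstance(o, MyObj))
print("%d unreachable, MyObj instances saved: %s" % (found, names))

# Restore the default state so the saved objects can really be freed.
gc.set_debug(0)
del gc.garbage[:]
gc.enable()
```

The exact unreachable count depends on the interpreter version (the instances' attribute dicts may or may not be counted separately), so only the two MyObj instances are worth relying on.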
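IV. The pdb module
Detailed guide: http://www.ibm.com/developerworks/cn/linux/l-cn-pythondebugger/
The commands are much like gdb's (except that you don't have to type p to print a value, and the debugger prompt behaves like Python's interactive mode)
h(elp)       help
c(ontinue)   continue execution
n(ext)       run the next statement
s(tep)       step (into the function being called)
b(reak)      set a breakpoint
l(ist)       show the source code
bt           show the call stack
Enter        repeat the last command
....
I like to drop pdb.set_trace() wherever I need to debug and take it from there.... (there are plenty of other ways to start the debugger as well)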
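A small sketch of the pdb.set_trace() habit described above. The function `average` and the environment variable `DEBUG_PDB` are made up for the illustration. The guard is just one possible way to leave the breakpoint in the code without stopping every normal run.

```python
import os
import pdb

def average(values):
    total = sum(values)
    # Hypothetical opt-in switch: break only when DEBUG_PDB is set in the environment.
    if os.environ.get("DEBUG_PDB"):
        pdb.set_trace()   # stops here at the (Pdb) prompt; try l, n, s, bt, then c
    return total / float(len(values))

result = average([1, 2, 3, 4])
print(result)
```

Run it as `DEBUG_PDB=1 python script.py` to land in the debugger inside average(), where `total` and `values` can be inspected by simply typing their names.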
V. Django memory leaks
Why is Django leaking memory?
Django isn't known to leak memory. If you find your Django processes are allocating more and more memory, with no sign of releasing it, check to make sure your DEBUG setting is set to False. If DEBUG is True, then Django saves a copy of every SQL statement it has executed.
(The queries are saved in django.db.connection.queries. See How can I see the raw SQL queries Django is running?.)
To fix the problem, set DEBUG to False.
If you need to clear the query list manually at any point in your functions, just call reset_queries(), like this:
from django import db
db.reset_queries()