Python memory leaks

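I. Python has automatic garbage collection (the interpreter frees an object as soon as its reference count drops to zero). Memory leaks usually come from a leaking extension library or from reference cycles (a third case is objects left in a global container and never removed).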

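Extension-library leaks need no discussion here; a reference-cycle example follows (the memory held by Obj('B') and Obj('c') is not reclaimed). (Note: it seems the interpreter does reclaim memory caught in reference cycles by itself, through its mark-and-sweep garbage collector, and it is only a matter of when. In other words, we don't need to spend effort deliberately avoiding reference cycles when coding. The exception under Python 2, as in this example, is a cycle whose objects define __del__. I'll read up on the details over the next few days (http://stackoverflow.com/questions/4484167/details-how-python-garbage-collection-works ; not having finished the garbage-collection chapter of the source-code analysis book really bugs me) ---2013.10.20)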

[dongsong@localhost python_study]$ cat leak_test2.py 
#encoding=utf-8

class Obj:
    def __init__(self,name='A'):
        self.name = name
        print '%s inited' % self.name
    def __del__(self):
        print '%s deleted' % self.name

if __name__ == '__main__':
    a = Obj('A')
    b = Obj('B')
    c = Obj('c')

    c.attrObj = b
    b.attrObj = c
[dongsong@localhost python_study]$ vpython leak_test2.py 
A inited
B inited
c inited
A deleted
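For comparison, here is a sketch of the same cycle on Python 3.4 or later, where the collector is able to finalize and free cycles even when the objects define __del__. It is not the listing above: the `deleted` list, the `make_cycle` helper and the weak reference are assumptions added for the illustration, so the effect can be checked without relying on printed output.

```python
import gc
import weakref

deleted = []  # names recorded by __del__, used instead of print


class Obj:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        deleted.append(self.name)


def make_cycle():
    b = Obj("B")
    c = Obj("c")
    c.attr_obj = b
    b.attr_obj = c
    return weakref.ref(b)  # lets us observe b without keeping it alive


gc.disable()  # keep the automatic collector out of the way for the demo
probe = make_cycle()
alive_before = probe() is not None  # the cycle keeps both refcounts above zero
gc.collect()                        # the cycle detector finds and frees the pair
alive_after = probe() is not None
gc.enable()
print(alive_before, alive_after, sorted(deleted))
```

Reference counting alone never frees the pair; it is the explicit (or automatic) collector pass that does.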


II. The objgraph module

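This module can find the fastest-growing object types and the types with the most live instances. It can draw the graph of everything an object refers to and the graph of everything that refers to an object, and it can look up an object by its address.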

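But hunting for a memory leak with it still feels like looking for a needle in a haystack: you have to pick out the suspect objects yourself from the logs of fastest-growing and most numerous types (usually common objects such as list/dict/tuple, which are hard to trace; if the top entries are your own custom classes, the cause is much easier to pin down).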
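The idea of comparing type counts between two points in time can be approximated with the standard library alone. The sketch below is written for Python 3 and is not objgraph's implementation: `type_counts`, `growth` and `Suspect` are hypothetical names, and `gc.get_objects()` only sees objects tracked by the collector.

```python
import gc
from collections import Counter


def type_counts():
    """Count the objects tracked by the garbage collector, grouped by type name."""
    return Counter(type(o).__name__ for o in gc.get_objects())


def growth(before, after, limit=3):
    """Return the (type name, increase) pairs that grew the most between snapshots."""
    deltas = {name: after[name] - before[name]
              for name in after if after[name] > before[name]}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:limit]


class Suspect:
    pass


before = type_counts()
leaked = [Suspect() for _ in range(1000)]  # simulate a leak of a custom class
after = type_counts()

report = growth(before, after)
for name, delta in report:
    print("%-20s +%d" % (name, delta))
```

When the leaking type is a custom class it stands out at the top of the report; when it is a plain list or dict it drowns in the background noise, which is the difficulty described above.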

1.show_refs() show_backrefs() show_most_common_types() show_growth()

[dongsong@localhost python_study]$ !cat
cat objgraph1.py 
#encoding=utf-8
import objgraph

if __name__ == '__main__':
        x = []
        y = [x, [x], dict(x=x)]
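        objgraph.show_refs([y], filename='/tmp/sample-graph.png') # draw every object reachable from [y]
        objgraph.show_backrefs([x], filename='/tmp/sample-backref-graph.png') # draw every reference that points to x
        #objgraph.show_most_common_types() # counts for all common types; too much output to be useful here
        objgraph.show_growth(limit=4) # print the types whose counts grew since program start or the previous show_growth call (sorted by increase)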
[dongsong@localhost python_study]$ !vpython
vpython objgraph1.py 
Graph written to /tmp/tmpuSFr9A.dot (5 nodes)
Image generated as /tmp/sample-graph.png
Graph written to /tmp/tmpAn6niV.dot (7 nodes)
Image generated as /tmp/sample-backref-graph.png
tuple                          3393     +3393
wrapper_descriptor              945      +945
function                        830      +830
builtin_function_or_method      622      +622

sample-graph.png

sample-backref-graph.png

2.show_chain()

[dongsong@localhost python_study]$ cat objgraph2.py 
#encoding=utf-8
import objgraph, inspect, random

class MyBigFatObject(object):
        pass

def computate_something(_cache = {}):
        _cache[42] = dict(foo=MyBigFatObject(),bar=MyBigFatObject())
        x = MyBigFatObject()

if __name__ == '__main__':
        objgraph.show_growth(limit=3)
        computate_something()
        objgraph.show_growth(limit=3)
        objgraph.show_chain(
                objgraph.find_backref_chain(random.choice(objgraph.by_type('MyBigFatObject')),
                        inspect.ismodule),
                filename = '/tmp/chain.png')
        #roots = objgraph.get_leaking_objects()
        #print 'len(roots)=%d' % len(roots)
        #objgraph.show_most_common_types(objects = roots)
        #objgraph.show_refs(roots[:3], refcounts=True, filename='/tmp/roots.png')
[dongsong@localhost python_study]$ !vpython
vpython objgraph2.py 
tuple                  3400     +3400
wrapper_descriptor      945      +945
function                831      +831
wrapper_descriptor      956       +11
tuple                  3406        +6
member_descriptor       165        +4
Graph written to /tmp/tmpklkHqC.dot (7 nodes)
Image generated as /tmp/chain.png

chain.png


III. The gc module

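This module can identify the objects the garbage collector finds unreachable and the ones it cannot free (uncollectable), which is something objgraph does not offer.

gc.collect() forces a collection and returns the number of unreachable objects found.

gc.garbage is the list of uncollectable objects among the unreachable ones (objects that define a __del__() finalizer and are stuck in a reference cycle). If DEBUG_SAVEALL is set, then all unreachable objects will be added to this list rather than freed.

Warning: if you turn automatic collection off with gc.disable() and never call gc.collect() yourself, objects caught in reference cycles are never freed and you will watch memory get eaten up fast....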

[dongsong@bogon python_study]$ cat gc_test.py 
#encoding=utf-8

import gc

class MyObj:
        def __init__(self, name):
                self.name = name
                print "%s inited" % self.name
        def __del__(self):
                print "%s deleted" % self.name


if __name__ == '__main__':
        gc.disable()
        gc.set_debug(gc.DEBUG_COLLECTABLE | gc.DEBUG_UNCOLLECTABLE | gc.DEBUG_INSTANCES | gc.DEBUG_OBJECTS | gc.DEBUG_SAVEALL)

        a = MyObj('a')
        b = MyObj('b')
        c = MyObj('c')
        a.attr = b
        b.attr = a
        a = None
        b = None
        c = None

        if gc.isenabled():
                print 'automatic collection is enabled'
        else:
                print 'automatic collection is disabled'

        rt = gc.collect()
        print "%d unreachable" % rt

        garbages = gc.garbage
        print "\n%d garbages:" % len(garbages)
        for garbage in garbages:
                if isinstance(garbage, MyObj):
                        print "obj-->%s name-->%s attrrMyObj-->%s" % (garbage, garbage.name, garbage.attr)
                else:
                        print str(garbage)


[dongsong@bogon python_study]$ vpython gc_test.py 
a inited
b inited
c inited
c deleted
automatic collection is disabled
gc: uncollectable <MyObj instance at 0x7f3ebd455b48>
gc: uncollectable <MyObj instance at 0x7f3ebd455b90>
gc: uncollectable <dict 0x261c4b0>
gc: uncollectable <dict 0x261bdf0>
4 unreachable

4 garbages:
obj--><__main__.MyObj instance at 0x7f3ebd455b48> name-->a attrrMyObj--><__main__.MyObj instance at 0x7f3ebd455b90>
obj--><__main__.MyObj instance at 0x7f3ebd455b90> name-->b attrrMyObj--><__main__.MyObj instance at 0x7f3ebd455b48>
{'name': 'a', 'attr': <__main__.MyObj instance at 0x7f3ebd455b90>}
{'name': 'b', 'attr': <__main__.MyObj instance at 0x7f3ebd455b48>}
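A reduced variant of the experiment for Python 3 is sketched below. It assumes only `gc.DEBUG_SAVEALL` (the DEBUG_INSTANCES and DEBUG_OBJECTS flags used above exist only in Python 2), and it leaves out __del__ so that the result depends only on the documented behaviour of DEBUG_SAVEALL.

```python
import gc


class MyObj:
    def __init__(self, name):
        self.name = name


gc.collect()                    # start from a clean slate
gc.set_debug(gc.DEBUG_SAVEALL)  # save unreachable objects in gc.garbage instead of freeing them

a = MyObj("a")
b = MyObj("b")
a.attr = b
b.attr = a
a = b = None                    # from here on only the cycle keeps the two objects alive

found = gc.collect()            # number of unreachable objects found
saved = sorted(o.name for o in gc.garbage if isinstance(o, MyObj))
print(found, saved)

gc.set_debug(0)                 # restore normal behaviour
gc.garbage.clear()              # drop the saved references so the objects can be freed
```

The exact count returned by gc.collect() varies between interpreter versions, because instance dictionaries may or may not be counted as separate objects.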

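IV. The pdb module

Detailed guide: http://www.ibm.com/developerworks/cn/linux/l-cn-pythondebugger/

The commands are much like gdb's (except that you don't need a leading p to print a value, and the prompt works much like the interactive Python interpreter).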

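h(elp)  show help

c(ontinue)  continue execution

n(ext)  run the next statement

s(tep)  step (into function calls)

b(reak)  set a breakpoint

l(ist)  show source code

bt  print the call stack

Enter  repeat the last command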

....

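Yours truly likes to drop pdb.set_trace() wherever debugging is needed and then dive in.... (there are plenty of other ways to start the debugger as well)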
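A small sketch of that habit, written for Python 3. The `average` function and its `debug` flag are hypothetical, added so the example runs without stopping unless the debugger is asked for.

```python
import pdb


def average(values, debug=False):
    total = sum(values)
    if debug:
        # Execution pauses here; inspect `total` and `values` at the (Pdb) prompt,
        # then type c to continue.
        pdb.set_trace()
    return total / len(values)


result = average([1, 2, 3])  # pass debug=True to drop into the debugger
print(result)
```

On Python 3.7 and later the built-in breakpoint() does the same job as pdb.set_trace() by default.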


V. Django memory leaks

Why is Django leaking memory?

Django isn't known to leak memory. If you find your Django processes are allocating more and more memory, with no sign of releasing it, check to make sure your DEBUG setting is set to False. If DEBUG is True, then Django saves a copy of every SQL statement it has executed.

(The queries are saved in django.db.connection.queries. See How can I see the raw SQL queries Django is running?.)

To fix the problem, set DEBUG to False.

If you need to clear the query list manually at any point in your functions, just call reset_queries(), like this:

from django import db
db.reset_queries()
