Python Descriptors

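If, like me, you were once puzzled by methods and functions, by the various ways of accessing them, and by the implicit passing of the self parameter, I suggest you read on patiently. This article also covers Python's attribute lookup strategy, so that you know exactly what Python does when it handles obj.attr and obj.attr = val.

In Python, an object's methods can also be regarded as attributes, so the word "attribute" below includes methods.

First define the following class and one instance of it, which will be used later. (The print statement and the interpreter output throughout this article are from Python 2.)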

class T(object):
    name = 'name'
    def hello(self):
        print 'hello'
t = T()

Use dir(t) to list all valid attributes of t:

>>> dir(t)
['__class__', '__delattr__', '__dict__', '__doc__', '__getattribute__',
 '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__',
 '__repr__', '__setattr__', '__str__', '__weakref__', 'hello', 'name']

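Attributes fall into two groups: those Python generates automatically, such as __class__ and __hash__, and those we define ourselves, such as hello and name above. We only care about the user-defined ones.
Both classes and instances (in fact everything in Python is an object, and a class is an instance of type) have a __dict__ attribute that holds their user-defined attributes (for a class, it holds some other things as well).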

>>> t.__dict__
{}
>>> T.__dict__
<dictproxy object at 0x00CD0FF0>
>>> dict(T.__dict__)            # T.__dict__ does not return a dict directly, so convert it to make its contents easier to inspect
{'__module__': '__main__', 'name': 'name',
 'hello': <function hello at 0x00CC2470>,
 '__dict__': <attribute '__dict__' of 'T' objects>,
 '__weakref__': <attribute '__weakref__' of 'T' objects>, '__doc__': None}
>>>

Some built-in types, such as list and str, have no __dict__ attribute, so there is no way to attach custom attributes to their instances.
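A quick way to check this, as a minimal sketch written for Python 3 (the class Plain and the helper can_attach are made-up names for illustration only):

```python
class Plain(object):
    """An ordinary user-defined class; its instances get a __dict__."""
    pass


def can_attach(obj):
    """Return True if an arbitrary attribute can be set on obj."""
    try:
        obj.custom = 1
    except AttributeError:
        return False
    return True


plain_ok = can_attach(Plain())   # instances of Plain have a __dict__
list_ok = can_attach([])         # list instances have no __dict__
str_ok = can_attach("text")      # neither do str instances

print(plain_ok, list_ok, str_ok)
print(hasattr(Plain(), "__dict__"), hasattr([], "__dict__"))
```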

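So far t.__dict__ is an empty dictionary, because we have not defined any attribute on t; its valid attributes hello and name both come from T, whose __dict__ contains hello and name. When Python meets the expression t.name, how does it find the name attribute of t?

First, Python checks whether name is an automatically generated attribute; if so, it is found in a special way. Here, of course, name is not automatically generated but defined by us, so Python looks in t.__dict__ and does not find it there.

Next, Python goes to T, the class of t, and searches T.__dict__ for name. This time it is found directly, so Python returns its value, the string 'name'. If it were not found in T.__dict__, Python would continue with the __dict__ of T's base classes (if T has any).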
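The simplified steps just described can be sketched in a few lines. This is only an illustration written for Python 3: simple_lookup and the class Base are hypothetical names, and the function deliberately ignores descriptors and automatically generated attributes.

```python
def simple_lookup(obj, name):
    """Simplified lookup: instance __dict__ first, then the class and its bases."""
    instance_dict = getattr(obj, "__dict__", {})
    if name in instance_dict:
        return instance_dict[name]
    # type(obj).__mro__ lists the class followed by its base classes in search order.
    for klass in type(obj).__mro__:
        if name in klass.__dict__:
            return klass.__dict__[name]
    raise AttributeError(name)


class Base(object):
    inherited = "from Base"


class T(Base):
    name = "name"


t = T()
found_on_class = simple_lookup(t, "name")       # found in T.__dict__
found_on_base = simple_lookup(t, "inherited")   # found in Base.__dict__
t.name = "instance value"
found_on_instance = simple_lookup(t, "name")    # now found in t.__dict__
```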

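This is not enough to clear up our confusion, because things are far from that simple: the steps above are a simplified version.

Continuing the example above: for the name attribute, T.name and T.__dict__['name'] are exactly the same.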

>>> T.name
'name'
>>> T.__dict__['name']
'name'
>>>

For hello, however, things are a little different:

>>> T.hello
<unbound method T.hello>
>>> T.__dict__['hello']
<function hello at 0x00CC2470>
>>>

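So T.hello is an unbound method, while T.__dict__['hello'] is a function (not a method).

Inference: a method is stored in the class __dict__ as a function (a method definition looks just like a function definition, except that the first parameter must be self). But then T.hello should also give us a function, so how did it become an unbound method?

Now access hello through the instance t: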

>>> t.hello
<bound method T.hello of <__main__.T object at 0x00CD0E50>>
>>>

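This time it is a bound method.

Interesting. According to the lookup strategy above, since hello is a function in T.__dict__, T.hello and t.hello should both be that same function. How does it turn into a method, and into an unbound method in one case and a bound method in the other?

Unbound versus bound is not too hard to understand. Let us assume the following for now: a method is meant to be called on an instance (this refers to instance methods; classmethod and staticmethod are covered later). When accessed through the class, as in T.hello, hello is not associated with any instance, that is, not bound to any instance, so it is unbound. When accessed as t.hello, hello is associated with t, so it is bound.

But the step from the function <function hello at 0x00CC2470> to the method <unbound method T.hello> really is puzzling.

All of this magic comes from today's protagonist: the descriptor.

When an attribute is looked up, as in obj.attr, and Python finds that the attribute attr has a __get__ method, Python calls that __get__ method and returns its result instead of returning attr itself (this sentence is not quite accurate; I only want to give you a first idea of what a descriptor is).

In Python, an iterator (why are we suddenly talking about iterators?) is an object that implements the iterator protocol, that is, the two methods __iter__ and next(). Similarly, a descriptor is an object that implements certain specific methods. For a descriptor these are __get__, __set__ and __delete__, of which __set__ and __delete__ are optional. An iterator has to be attached to some object (it is returned by that object's __iter__ method); a descriptor also has to be attached to an object, as one of its attributes, and cannot exist on its own.

One more point: a descriptor must live in the __dict__ of a class. This means that only when an attribute is found in a class __dict__ does Python check whether it has __get__ and the other methods. For an attribute found in an instance __dict__, Python does not care whether it has __get__ at all and simply returns the attribute itself.

So what exactly is a descriptor? Put simply, it is an attribute of an object that lives in the class __dict__ and has the special method __get__ (and possibly __set__ and __delete__), which gives it some special behaviour. To have a convenient name for such attributes, we call them descriptor attributes.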
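The point about the class __dict__ versus the instance __dict__ can be checked directly. The following is a small sketch for Python 3; Loud is a hypothetical class invented for this illustration.

```python
class Loud(object):
    """A hypothetical object that has a __get__ method."""
    def __get__(self, obj, objtype=None):
        return "result of __get__"


class T(object):
    in_class = Loud()                # stored in T.__dict__: acts as a descriptor


t = T()
t.__dict__["in_instance"] = Loud()   # stored in t.__dict__: just a plain value

class_result = t.in_class            # __get__ is called
instance_result = t.in_instance      # __get__ is NOT called; the Loud object itself comes back
```

The same Loud object behaves differently depending only on where it is stored.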

If this is still unclear, the examples below should help.

First define this class:

class Descriptor(object):
    def __get__(self, obj, type=None):
        return 'get', self, obj, type
    def __set__(self, obj, val):
        print 'set', self, obj, val
    def __delete__(self, obj):
        print 'delete', self, obj

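__set__ and __delete__ could actually be left out here, but they are all written out for now for the sake of the explanation that follows.

The parameters of the three methods are as follows:

self, needless to say, is the current Descriptor instance. obj refers to the object that owns the attribute. This should not be hard to understand: as said above, a descriptor is a slightly special attribute of an object, and obj is the object that owns it. Note that if the descriptor is accessed directly through the class (at the risk of repeating myself: a descriptor is an attribute, so accessing it through the class simply means accessing a class attribute through the class), obj is None. type is the type of obj; as just mentioned, when the descriptor is accessed directly through the class, obj is None and type is the class itself.

The meaning of the three methods, assuming T is a class, t is an instance of it, d is a descriptor attribute of T (nothing fancy, it just has a __get__ method!), and value is some valid value:

Reading the attribute: T.d returns the result of d.__get__(None, T), and t.d returns the result of d.__get__(t, T).

Setting the attribute: t.d = value actually calls d.__set__(t, value). T.d = value is a real assignment: from then on, the value of T.d is value. Deleting the attribute works like setting it: del t.d calls d.__delete__(t).

The following example shows how this actually plays out in Python:

Redefine our class T and instance t: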

class T(object):
	d = Descriptor()
t = T()

d is a class attribute of T, and as an instance of Descriptor it has __get__ and the other methods. Clearly d meets all the conditions, so it is now a descriptor!

>>> t.d         # t.d actually returns d.__get__(t, T)
('get', <__main__.Descriptor object at 0x00CD9450>, <__main__.T object at 0x00CD0E50>, <class '__main__.T'>)
>>> T.d         # T.d actually returns d.__get__(None, T), so None appears in the obj position
('get', <__main__.Descriptor object at 0x00CD9450>, None, <class '__main__.T'>)
>>> t.d = 'hello'   # set a value on the descriptor through the instance; the line below is not a return value but the output of the print statement in __set__
set <__main__.Descriptor object at 0x00CD9450> <__main__.T object at 0x00CD0E50> hello
>>> t.d         # Python called __set__, which did not change the value of t.d
('get', <__main__.Descriptor object at 0x00CD9450>, <__main__.T object at 0x00CD0E50>, <class '__main__.T'>)
>>> T.d = 'hello'   # __set__ is not called
>>> T.d             # the value of T.d really has changed
'hello'
>>> t.d             # t.d has changed too: by the lookup strategy above, t.d comes from T.__dict__, and T.__dict__['d'] is now 'hello'
'hello'
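The session above does not exercise deletion. As a rough sketch for Python 3, the variant below records each call in a list instead of printing; the list calls and the two flag variables are names introduced only for this illustration.

```python
calls = []   # records which descriptor method ran, in order


class Descriptor(object):
    def __get__(self, obj, objtype=None):
        calls.append(("get", obj, objtype))
        return "value from __get__"

    def __set__(self, obj, val):
        calls.append(("set", obj, val))

    def __delete__(self, obj):
        calls.append(("delete", obj))


class T(object):
    d = Descriptor()


t = T()
t.d               # calls d.__get__(t, T)
T.d               # calls d.__get__(None, T)
t.d = "hello"     # calls d.__set__(t, 'hello')
del t.d           # calls d.__delete__(t); the descriptor stays in T.__dict__
still_descriptor = isinstance(T.__dict__["d"], Descriptor)

del T.d           # deleting through the class really removes the attribute
removed = "d" not in T.__dict__
```

As with assignment, only access through the instance is routed to the descriptor; del T.d acts on the class attribute itself.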

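Data descriptors and non-data descriptors

A descriptor like d above, which has both __get__ and __set__, is called a data descriptor; one that has only __get__ is called a non-data descriptor. Since a non-data descriptor has no __set__ method, assigning to the attribute through an instance, as in t.d = 'hello' above, can no longer call __set__. Will it then simply turn the value of t.d into 'hello'? Seeing is believing, so here is an example: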

class Descriptor(object):
    def __get__(self, obj, type=None):
        return 'get', self, obj, type
class T(object):
    d = Descriptor()
t = T()

>>> t.d
('get', <__main__.Descriptor object at 0x00CD9550>, <__main__.T object at 0x00CD9510>, <class '__main__.T'>)
>>> t.d = 'hello'
>>> t.d
'hello'
>>>

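Assigning to a non-data descriptor through an instance hides the non-data descriptor defined on the class!

It is time to come clean about the real, detailed attribute lookup strategy. For obj.attr (note that obj may be a class):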

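1. If attr is an attribute that Python generates automatically, it is found right away. (This has very high priority!)

2. Search obj.__class__.__dict__. If attr exists there and is a data descriptor, return the result of the data descriptor's __get__ method. If it is not there, continue looking for a data descriptor in the base classes and ancestors of obj.__class__.

3. Search obj.__dict__. There are two cases. If obj is an ordinary instance, return the value as soon as it is found; otherwise go to the next step. If obj is a class, search the __dict__ of obj and then of its base classes and ancestors in turn; if a descriptor is found, return the result of its __get__ method, otherwise return attr directly. If nothing is found, go to the next step.

4. Search obj.__class__.__dict__ (and the base classes of obj.__class__). If a descriptor is found, return the result of its __get__ method (it must be a non-data descriptor, because a data descriptor would already have been found in step 2). If an ordinary attribute is found, return its value directly. If nothing is found, go to the next step.

5. Sadly, Python finally gives up. At this step it raises AttributeError.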

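With this we can briefly see why it was stressed above that a descriptor must live in a class. The steps of interest are 2, 3 and 4. Steps 2 and 4 both search the class. In step 3, if the attribute is found in an ordinary instance, it is returned directly without any check for a __get__() method.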
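Steps 2 to 5 can be approximated in pure Python for the case where obj is an ordinary instance. This is a sketch for Python 3 under stated assumptions, not the interpreter's actual code: find_attr, DataDesc and NonDataDesc are hypothetical names, the case where obj is a class is left out, and a descriptor is treated as a data descriptor if its type defines __set__ or __delete__.

```python
def find_attr(obj, name):
    """Rough emulation of obj.name for an ordinary instance (not a class)."""
    # Search the class and its bases for the attribute without triggering descriptors.
    class_attr = None
    found = False
    for klass in type(obj).__mro__:
        if name in klass.__dict__:
            class_attr = klass.__dict__[name]
            found = True
            break

    attr_type = type(class_attr)
    has_get = found and hasattr(attr_type, "__get__")
    is_data = found and (hasattr(attr_type, "__set__") or hasattr(attr_type, "__delete__"))

    # Step 2: a data descriptor on the class wins over the instance __dict__.
    if has_get and is_data:
        return attr_type.__get__(class_attr, obj, type(obj))
    # Step 3: the instance __dict__; no descriptor check happens here.
    instance_dict = getattr(obj, "__dict__", {})
    if name in instance_dict:
        return instance_dict[name]
    # Step 4: a non-data descriptor or a plain attribute on the class.
    if has_get:
        return attr_type.__get__(class_attr, obj, type(obj))
    if found:
        return class_attr
    # Step 5: nothing was found anywhere.
    raise AttributeError(name)


class DataDesc(object):
    def __get__(self, obj, objtype=None):
        return "data descriptor"

    def __set__(self, obj, val):
        pass


class NonDataDesc(object):
    def __get__(self, obj, objtype=None):
        return "non-data descriptor"


class T(object):
    data = DataDesc()
    nondata = NonDataDesc()
    plain = "plain"


t = T()
t.__dict__["data"] = "instance value"      # cannot shadow a data descriptor
t.__dict__["nondata"] = "instance value"   # shadows a non-data descriptor
results = [find_attr(t, n) for n in ("data", "nondata", "plain")]
```

For these three names the sketch agrees with the built-in getattr, which is a reasonable sanity check on the ordering of the steps.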

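The lookup strategy for attribute assignment, for obj.attr = value:

1. Search obj.__class__.__dict__. If attr exists there and is a data descriptor, call its __set__ method and stop. If it is not there, continue through the base classes and ancestors of obj.__class__, and call __set__ if a data descriptor is found. If none is found, go to the next step.

2. Store the value directly in obj.__dict__, as obj.__dict__['attr'] = value.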
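The two assignment steps can be sketched the same way. Again this is only an illustration for Python 3 with hypothetical names (set_attr, set_calls, DataDesc, NonDataDesc), covering ordinary instances that have a __dict__.

```python
def set_attr(obj, name, value):
    """Rough emulation of obj.name = value for an ordinary instance."""
    for klass in type(obj).__mro__:
        if name in klass.__dict__:
            class_attr = klass.__dict__[name]
            if hasattr(type(class_attr), "__set__"):
                # Step 1: a data descriptor on the class handles the assignment.
                type(class_attr).__set__(class_attr, obj, value)
                return
            break   # the first match is not a data descriptor; stop searching
    # Step 2: otherwise store the value in the instance __dict__.
    obj.__dict__[name] = value


set_calls = []   # records calls to DataDesc.__set__


class DataDesc(object):
    def __get__(self, obj, objtype=None):
        return "from DataDesc.__get__"

    def __set__(self, obj, val):
        set_calls.append(val)


class NonDataDesc(object):
    def __get__(self, obj, objtype=None):
        return "from NonDataDesc.__get__"


class T(object):
    data = DataDesc()
    nondata = NonDataDesc()


t = T()
set_attr(t, "data", 1)      # handled by DataDesc.__set__; t.__dict__ is untouched
set_attr(t, "nondata", 2)   # stored in t.__dict__, shadowing the non-data descriptor
set_attr(t, "fresh", 3)     # stored in t.__dict__
```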

While we are at it, let us see why assigning to a non-data descriptor through an instance hides the non-data descriptor on the class.

Continuing the non-data descriptor example above:

>>> t.__dict__
{'d': 'hello'}

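The attribute d has appeared in t.__dict__. By the assignment strategy, step 1 does find d in t.__class__.__dict__, that is, T.__dict__, but it is a non-data descriptor and so does not qualify as a data descriptor; step 2 then stores the attribute and its value directly in t.__dict__. When t.d is read, the lookup strategy runs: step 2 finds d in T.__dict__, but it is a non-data descriptor and does not qualify, so step 3 follows, finds d in t.__dict__, and returns its value 'hello' directly.

So much for descriptors, which are actually somewhat like setters and getters in Java. But what do they have to do with the functions and methods we started with?

Every function can act as a descriptor, because it has a __get__ method.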

>>> def hello():
	pass

>>> dir(hello)
['__call__', '__class__', '__delattr__', '__dict__', '__doc__', '__get__', '__getattribute__',
'__hash__', '__init__', '__module__', '__name__', '__new__', 
'__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__str__', 'func_closure', 
'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']
>>>

Note that function objects have no __set__ or __delete__ method, so a function is a non-data descriptor.

A method is in fact a function too, as shown here:

>>> class T(object):
	def hello(self):
		pass

>>> T.__dict__['hello']
<function hello at 0x00CD7EB0>
>>>

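Alternatively, we can think of methods as special functions that happen to live in a class: when such a function attribute is fetched, what comes back is not the function itself (such as <function hello at 0x00CD7EB0> above) but the return value of the function's __get__ method. Continuing with the definition of class T above: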

>>> T.hello   # fetch the hello attribute of T; the lookup finds <function hello at 0x00CD7EB0> in T.__dict__, but because it has a __get__ method, the result of __get__(None, T) is returned instead: an unbound method
<unbound method T.hello>
>>> f = T.__dict__['hello']   # fetching hello straight from T.__dict__ bypasses the lookup strategy and returns <function hello at 0x00CD7EB0> itself
>>> f
<function hello at 0x00CD7EB0>
>>> t = T()
>>> t.hello                     # fetched through the instance, this is the result of calling __get__(t, T) on <function hello at 0x00CD7EB0>: a bound method
<bound method T.hello of <__main__.T object at 0x00CDAD10>>
>>>

To confirm this, continue with the following code (f is still the <function hello at 0x00CD7EB0> from above):

>>> f.__get__(None, T)
<unbound method T.hello>
>>> f.__get__(t, T)
<bound method T.hello of <__main__.T object at 0x00CDAD10>>

Excellent!
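For readers on Python 3, the same experiment gives slightly different results, because Python 3 has no unbound methods: calling __get__ with None as the instance hands back the function itself. A minimal sketch (the variable names are chosen only for this illustration):

```python
import types


class T(object):
    def hello(self):
        return "hello"


t = T()
f = T.__dict__["hello"]          # the plain function stored in the class

via_class = f.__get__(None, T)   # what T.hello evaluates to
via_instance = f.__get__(t, T)   # what t.hello evaluates to

class_is_function = via_class is f                            # Python 3: the function itself
instance_is_method = isinstance(via_instance, types.MethodType)
same_as_attribute = via_instance == t.hello                   # bound methods compare equal
bound_to = via_instance.__self__                              # the instance t
wraps = via_instance.__func__                                 # the function f
```

In Python 3 the attributes are spelled __self__ and __func__ rather than im_self and im_func.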

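To summarize:

1. Every function has a __get__ method.

2. When a function lives in a class __dict__, it can be regarded as a method; fetching it through the class or an instance returns not the function itself but the return value of its __get__ method.

I admit I may have misled you into thinking that a method is just a function, a special kind of function. Methods and functions are in fact different; to be precise, a method is a method and a function is a function.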

>>> type(f)
<type 'function'>
>>> type(t.hello)
<type 'instancemethod'>
>>> type(T.hello)
<type 'instancemethod'>
>>>

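A function is of type function, and a method is of type instancemethod (this is the ordinary instance method; classmethod and staticmethod are mentioned later).

A few more words about unbound and bound methods. In the C implementation they are the same kind of object (both are of type instancemethod). Let us first look at what they contain: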

>>> dir(t.hello)
['__call__', '__class__', '__cmp__', '__delattr__', '__doc__', '__get__', '__getattribute__', 
'__hash__', '__init__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', 
'__str__', 'im_class', 'im_func', 'im_self']

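__call__ shows that they are callable objects, and we can guess that this __call__ is implemented roughly as follows: forward the call to another function (the one we expect, such as hello above), passing the object as the first argument.

The ones to note are im_class, im_func and im_self. They are not unfamiliar: for t.hello they stand for T, hello (the function hello stored in T.__dict__) and t respectively. With these we can roughly imagine how to implement an instancemethod in pure Python :).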
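One way such a pure-Python instancemethod might look is sketched below. It is a guess at the general shape, not the interpreter's implementation: Method and Function are hypothetical classes, with Function standing in for a real function object so that its __get__ can be written out. The code runs on Python 3 but mimics the Python 2 behaviour described above, including unbound methods.

```python
class Method(object):
    """Sketch of an instancemethod: remembers a function, an instance and a class."""
    def __init__(self, im_func, im_self, im_class):
        self.im_func = im_func
        self.im_self = im_self
        self.im_class = im_class

    def __call__(self, *args, **kwargs):
        if self.im_self is None:
            # Unbound: the caller must supply an instance as the first argument.
            if not args or not isinstance(args[0], self.im_class):
                raise TypeError("unbound method needs an instance of %s first"
                                % self.im_class.__name__)
            return self.im_func(*args, **kwargs)
        # Bound: forward the call with the stored instance as the first argument.
        return self.im_func(self.im_self, *args, **kwargs)


class Function(object):
    """Hypothetical stand-in for a function object, to show its __get__."""
    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        # obj is None when accessed through the class, giving an unbound Method.
        return Method(self.func, obj, objtype)


class T(object):
    def hello(self, greeting):
        return "%s from %s" % (greeting, type(self).__name__)
    hello = Function(hello)


t = T()
bound_result = t.hello("hi")        # Method(hello, t, T) called with "hi"
unbound_result = T.hello(t, "hi")   # Method(hello, None, T) called with t and "hi"
```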

Several other built-ins are also related to descriptors; here is a brief look at them.

classmethod

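classmethod turns a function into a class method. The implicit first argument of a class method is the class itself (for an ordinary method it is the instance). A class method can be called through the class as well as through an instance (an ordinary method called through the class needs an instance passed in explicitly).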

>>> class T(object):
	def hello(cls):
		print 'hello', cls
	hello = classmethod(hello)   # two effects: turns hello into a class method and hides the ordinary method hello

	
>>> t = T()
>>> t.hello()
hello <class '__main__.T'>
>>> T.hello()
hello <class '__main__.T'>
>>>

Note that classmethod is a class, not a function. The classmethod class has a __get__ method, so what t.hello and T.hello above actually yield is the return value of classmethod's __get__ method:

>>> t.hello
<bound method type.hello of <class '__main__.T'>>
>>> type(t.hello)
<type 'instancemethod'>
>>> T.hello
<bound method type.hello of <class '__main__.T'>>
>>> type(T.hello)
<type 'instancemethod'>
>>>

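As shown above, t.hello and T.hello are of type instancemethod and are bound to T. In other words, classmethod's __get__ method returns an instancemethod object. From the earlier analysis of instancemethod we can infer that the im_self of t.hello is T, its im_class is type (T is an instance of type), and its im_func is the function hello: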

>>> t.hello.im_self
<class '__main__.T'>
>>> t.hello.im_class
<type 'type'>
>>> t.hello.im_func
<function hello at 0x011A40B0>
>>>

A perfect match! So implementing classmethod in pure Python is not hard either :)
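A possible pure-Python version, as a sketch for Python 3: ClassMethod is a hypothetical name, and types.MethodType is used here to build the bound method, which is an implementation choice for this illustration rather than something the article specifies.

```python
import types


class ClassMethod(object):
    """Sketch of classmethod as a non-data descriptor."""
    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        if objtype is None:
            objtype = type(obj)
        # Bind the wrapped function to the class rather than to the instance.
        return types.MethodType(self.func, objtype)


class T(object):
    def hello(cls):
        return "hello", cls
    hello = ClassMethod(hello)


t = T()
from_instance = t.hello()   # cls is T even though we went through an instance
from_class = T.hello()      # cls is T
```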

staticmethod

staticmethod turns a function into a static method, which has no implicit first argument.

class T(object):
	def hello():
		print 'hello'
	hello = staticmethod(hello)

	
>>> T.hello()   # no implicit first argument
hello
>>> T.hello
<function hello at 0x011A4270>
>>>

T.hello returns a plain function. Presumably the __get__ method of the staticmethod class simply returns the wrapped function itself.
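If that guess is right, a pure-Python equivalent could be as small as the sketch below (Python 3; StaticMethod is a hypothetical name for this illustration).

```python
class StaticMethod(object):
    """Sketch of staticmethod as a non-data descriptor."""
    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        # Ignore both the instance and the class: hand back the function unchanged.
        return self.func


class T(object):
    def hello():
        return "hello"
    hello = StaticMethod(hello)


from_class = T.hello()        # no implicit first argument
from_instance = T().hello()   # still no implicit first argument
returns_plain_function = T.hello is T.__dict__["hello"].func
```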

There is also property, which is much like the two above; it is a data descriptor.
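Because property is a data descriptor, an entry with the same name in the instance __dict__ cannot shadow it. A short sketch for Python 3 (the names get_name, set_name and _name are made up for this example):

```python
class T(object):
    def __init__(self):
        self._name = "initial"

    def get_name(self):
        return self._name

    def set_name(self, value):
        self._name = value.upper()

    name = property(get_name, set_name)


t = T()
prop = T.__dict__["name"]
is_data_descriptor = hasattr(type(prop), "__get__") and hasattr(type(prop), "__set__")

t.name = "changed"                      # routed to property.__set__, which calls set_name
t.__dict__["name"] = "shadow attempt"   # ignored by lookup: the data descriptor wins
value = t.name
```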
