Learning the Objective-C Runtime Source: IMP Lookup (Message Forwarding Not Included)

Author: 代培
URL: http://daipei.me/posts/source_code_learning_of_runtime_imp/
Please credit the source when reposting.
My blog has moved. The new address is daipei.me

Preface

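A while ago I wrote a post, "How does the runtime find the IMP for a selector? (class methods and instance methods considered separately)", which summed up my answer to question 22 of 《招聘一個靠譜的iOS》. That post never touched the underlying code, and it left a few questions open. This post digs into the runtime source to continue the exploration and tries to settle those open questions. This is my first time reading source code at this level, so if any of the analysis is wrong, corrections are welcome.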

Introduction

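As everyone knows, calling a method in Objective-C (in other words, sending a message) is translated under the hood into a call to objc_msgSend(id self, SEL op, ...). To squeeze out performance, Apple wrote this function in assembly: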

/********************************************************************
 *
 * id objc_msgSend(id self, SEL _cmd,...);
 *
 ********************************************************************/

    ENTRY objc_msgSend
# check whether receiver is nil
    teq     a1, #0
    beq     LMsgSendNilReceiver

# save registers and load receiver's class for CacheLookup
    stmfd   sp!, {a4,v1}
    ldr     v1, [a1, #ISA]

# receiver is non-nil: search the cache
    CacheLookup a2, v1, LMsgSendCacheMiss

# cache hit (imp in ip) and CacheLookup returns with nonstret (eq) set, restore registers and call
    ldmfd   sp!, {a4,v1}
    bx      ip

# cache miss: go search the method lists
LMsgSendCacheMiss:
    ldmfd   sp!, {a4,v1}
    b       _objc_msgSend_uncached

LMsgSendNilReceiver:
    mov     a2, #0
    bx      lr

LMsgSendExit:
    END_ENTRY objc_msgSend

Honestly, I never studied assembly, so my heart sank when I saw this code. Worse, there is a separate assembly implementation for each platform.

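I can't read assembly, but Apple's comments are detailed enough to follow roughly what is going on. It first checks whether the receiver self is nil and returns right away if so. Otherwise it looks the selector up in the cache of the receiver's class: on a hit it jumps straight to the IMP, and on a miss it branches to _objc_msgSend_uncached to search the method lists. That gives roughly the following pseudocode: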

id objc_msgSend(id self, SEL _cmd, ...) {
  Class class = object_getClass(self);
  IMP imp = class_getMethodImplementation(class, _cmd);
  return imp ? imp(self, _cmd, ...) : 0;
}

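The pseudocode uses class_getMethodImplementation(Class cls, SEL sel) to find the IMP. Interestingly, Apple really does provide this function, so we can call it ourselves to look up a method's implementation from a selector. The implementation of this function, and everything it leads to, is the focus of this post.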

Main text

As I said in my earlier post, there are two ways to look up an IMP:

IMP class_getMethodImplementation(Class cls, SEL name);
IMP method_getImplementation(Method m);

NSObject also provides a few methods that wrap class_getMethodImplementation:

+ (IMP)instanceMethodForSelector:(SEL)sel {
    if (!sel) [self doesNotRecognizeSelector:sel];
    return class_getMethodImplementation(self, sel);
}

+ (IMP)methodForSelector:(SEL)sel {
    if (!sel) [self doesNotRecognizeSelector:sel];
    return object_getMethodImplementation((id)self, sel);
}

- (IMP)methodForSelector:(SEL)sel {
    if (!sel) [self doesNotRecognizeSelector:sel];
    return object_getMethodImplementation(self, sel);
}

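I first thought none of these methods were exposed in the headers. In fact -methodForSelector: and +instanceMethodForSelector: are declared in the public NSObject.h; only the class-method variant +methodForSelector: has no declaration of its own. I don't know why Apple did it that way, so if anyone does, please tell me. I'd be very grateful!
The object_getMethodImplementation that appears here is just a wrapper around class_getMethodImplementation. Apple's comment says: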

Equivalent to: class_getMethodImplementation(object_getClass(obj), name);

Let's now turn to class_getMethodImplementation itself and see how it is implemented underneath:

IMP class_getMethodImplementation(Class cls, SEL sel)
{
    IMP imp;

    if (!cls  ||  !sel) return nil;

    imp = lookUpImpOrNil(cls, sel, nil, 
                         YES/*initialize*/, YES/*cache*/, YES/*resolver*/);

    // Translate forwarding function to C-callable external version
    if (!imp) {
        return _objc_msgForward;
    }

    return imp;
}

It first checks that neither argument is nil and then calls lookUpImpOrNil, which is a thin wrapper around lookUpImpOrForward:

/***********************************************************************
* lookUpImpOrNil.
* Like lookUpImpOrForward, but returns nil instead of _objc_msgForward_impcache
**********************************************************************/
IMP lookUpImpOrNil(Class cls, SEL sel, id inst, 
                   bool initialize, bool cache, bool resolver)
{
    IMP imp = lookUpImpOrForward(cls, sel, inst, initialize, cache, resolver);
    if (imp == _objc_msgForward_impcache) return nil;
    else return imp;
}

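The comment is clear: this function does not hand back the forwarding entry but returns nil instead. That is a little curious. lookUpImpOrForward could return a forwarding IMP directly, yet class_getMethodImplementation deliberately calls the variant that turns it into nil, and then, when imp is nil, returns _objc_msgForward by hand. Apple's comment explains why: "Translate forwarding function to C-callable external version". _objc_msgForward_impcache is the marker stored inside method caches, whereas _objc_msgForward is the version that can be called from C, so the internal marker is swapped for the external one before it leaves the runtime.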
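To make the two forwarding entry points easier to tell apart, here is a minimal sketch in plain C. It is a toy model and not the runtime's code: imp_t, the integer selectors and every toy_ function are names I made up for illustration, and forward_impcache / forward_external merely stand in for _objc_msgForward_impcache and _objc_msgForward.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*imp_t)(void);            /* stand-in for IMP */

/* Hypothetical stand-ins for the two forwarding entry points. */
static int forward_impcache(void) { return -1; } /* internal cache marker   */
static int forward_external(void) { return -2; } /* C-callable entry point  */
static int real_method(void)      { return 7; }

/* Toy lookUpImpOrForward: selector 1 is implemented, all others forward. */
static imp_t toy_lookup_or_forward(int sel) {
    return sel == 1 ? real_method : forward_impcache;
}

/* Toy lookUpImpOrNil: hide the internal marker behind NULL. */
static imp_t toy_lookup_or_nil(int sel) {
    imp_t imp = toy_lookup_or_forward(sel);
    return imp == forward_impcache ? NULL : imp;
}

/* Toy class_getMethodImplementation: NULL only for a missing selector. */
static imp_t toy_get_method_implementation(int sel) {
    if (sel == 0) return NULL;            /* mirrors the !sel check */
    imp_t imp = toy_lookup_or_nil(sel);
    assert(imp != forward_impcache);      /* the internal marker never escapes */
    return imp ? imp : forward_external;  /* translate to the external version */
}
```

In this model a selector nobody implements gives NULL from toy_lookup_or_nil but forward_external from toy_get_method_implementation, which is the same shape as the real pair of functions.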
Let's keep going and see how lookUpImpOrForward is implemented:

IMP lookUpImpOrForward(Class cls, SEL sel, id inst, 
                       bool initialize, bool cache, bool resolver)
{
    Class curClass;
    IMP methodPC = nil;
    Method meth;
    bool triedResolver = NO;

    methodListLock.assertUnlocked();

    if (cache) {
        methodPC = _cache_getImp(cls, sel);
        if (methodPC) return methodPC;    
    }

    if (cls == _class_getFreedObjectClass())
        return (IMP) _freedHandler;

    if (initialize  &&  !cls->isInitialized()) {
        _class_initialize (_class_getNonMetaClass(cls, inst));
    }

 retry:
    methodListLock.lock();

    // Ignore GC selectors
    if (ignoreSelector(sel)) {
        methodPC = _cache_addIgnoredEntry(cls, sel);
        goto done;
    }

    // Try this class's cache.
    methodPC = _cache_getImp(cls, sel);
    if (methodPC) goto done;

    // Try this class's method lists.
    meth = _class_getMethodNoSuper_nolock(cls, sel);
    if (meth) {
        log_and_fill_cache(cls, cls, meth, sel);
        methodPC = method_getImplementation(meth);
        goto done;
    }

    // Try superclass caches and method lists.
    curClass = cls;
    while ((curClass = curClass->superclass)) {
        // Superclass cache.
        meth = _cache_getMethod(curClass, sel, _objc_msgForward_impcache);
        if (meth) {
            if (meth != (Method)1) {
                // Found the method in a superclass. Cache it in this class.
                log_and_fill_cache(cls, curClass, meth, sel);
                methodPC = method_getImplementation(meth);
                goto done;
            }
            else {
                // Found a forward:: entry in a superclass.
                // Stop searching, but don't cache yet; call method 
                // resolver for this class first.
                break;
            }
        }

        // Superclass method list.
        meth = _class_getMethodNoSuper_nolock(curClass, sel);
        if (meth) {
            log_and_fill_cache(cls, curClass, meth, sel);
            methodPC = method_getImplementation(meth);
            goto done;
        }
    }

    // No implementation found. Try method resolver once.

    if (resolver  &&  !triedResolver) {
        methodListLock.unlock();
        _class_resolveMethod(cls, sel, inst);
        triedResolver = YES;
        goto retry;
    }

    // No implementation found, and method resolver didn't help. 
    // Use forwarding.

    _cache_addForwardEntry(cls, sel);
    methodPC = _objc_msgForward_impcache;

 done:
    methodListLock.unlock();

    // paranoia: look for ignored selectors with non-ignored implementations
    assert(!(ignoreSelector(sel)  &&  methodPC != (IMP)&_objc_ignored_method));

    return methodPC;
}

Wow, that is a long piece of code. I deleted a lot of the comments and it is still long.

The first thing here is something I don't understand: methodListLock.assertUnlocked();. The post "How Objective-C Message Sending and Forwarding Work" explains it like this:

It unlocks the assert in debug mode. runtimeLock is essentially a thin wrapper around pthread_rwlock_t, the read-write lock Darwin provides for threads, with some convenience methods added.

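Note that objc-runtime-new.mm contains a nearly identical implementation of lookUpImpOrForward, in which the locking call is runtimeLock.read();. So the post quoted above was presumably working from objc-runtime-new.mm, while my source comes from objc-class-old.mm. The names differ, but I assume the underlying mechanism is much the same. Going by its name, assertUnlocked() more likely checks, in debug builds, that the calling thread does not already hold the lock, rather than unlocking anything.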

I really admire how deeply that author understands the internals. For now let's go with that explanation and read on.

Lock-free cache lookup (optimistic cache lookup)

Searching the cache without holding the lock performs better.

    if (cache) {
        methodPC = _cache_getImp(cls, sel);
        if (methodPC) return methodPC;
    }

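If YES was passed for cache, cache_getImp is called to search the cache, and YES is indeed what gets passed here. (objc_msgSend is optimized at this point: it has already searched the cache at the very beginning, which is why there is an interesting function called _class_lookupMethodAndLoadCache3. It calls lookUpImpOrForward with cache set to NO so that the cache is not searched twice.) cache_getImp itself is written in assembly (assembly again... (T＿T)):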

    STATIC_ENTRY cache_getImp

    mov r9, r0
    CacheLookup GETIMP      // returns IMP on success

LCacheMiss:
    mov     r0, #0              // return nil if cache miss
    bx  lr

LGetImpExit: 
    END_ENTRY cache_getImp

The actual cache search lives in the CacheLookup macro. I won't expand on it here (I couldn't anyway, I haven't figured it out yet (^-^)).
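The role of the cache flag can be sketched with a small C model. This is only an illustration under my own assumptions: struct toy_class, the direct-mapped cache, the integer selectors and the probes counter are all invented, and the sketch leaves out the second, locked cache probe that the real function still performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SLOTS 8

typedef int (*imp_t)(void);

struct toy_class {
    int   cached_sel[CACHE_SLOTS];   /* 0 means the slot is empty */
    imp_t cached_imp[CACHE_SLOTS];
    int   probes;                    /* how many times the cache was searched */
};

static int hello_imp(void) { return 42; }

/* Toy cache: one direct-mapped slot per selector, no collision handling. */
static imp_t toy_cache_get(struct toy_class *cls, int sel) {
    cls->probes++;
    int slot = sel % CACHE_SLOTS;
    return cls->cached_sel[slot] == sel ? cls->cached_imp[slot] : NULL;
}

static void toy_cache_fill(struct toy_class *cls, int sel, imp_t imp) {
    int slot = sel % CACHE_SLOTS;
    cls->cached_sel[slot] = sel;
    cls->cached_imp[slot] = imp;
}

/* Slow path: optional optimistic probe, then a "method list" that only
 * knows selector 3. */
static imp_t toy_lookup(struct toy_class *cls, int sel, bool cache) {
    assert(sel > 0);
    if (cache) {
        imp_t imp = toy_cache_get(cls, sel);
        if (imp) return imp;
    }
    if (sel != 3) return NULL;
    toy_cache_fill(cls, sel, hello_imp);
    return hello_imp;
}

/* Messenger-style caller: it has already probed the cache itself, so it
 * passes cache=false to avoid searching it a second time. */
static imp_t toy_messenger_lookup(struct toy_class *cls, int sel) {
    imp_t imp = toy_cache_get(cls, sel);
    return imp ? imp : toy_lookup(cls, sel, false);
}
```

A caller that has not looked at the cache yet, like class_getMethodImplementation in the real code, would pass true instead and get the optimistic probe for free.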

Freed-object check

    if (cls == _class_getFreedObjectClass())
        return (IMP) _freedHandler;

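This checks whether the object receiving the message has already been freed, by comparing its class against the freed-object class. If it has, the IMP of _freedHandler is returned: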

static void _freedHandler(id obj, SEL sel)
{
    __objc_error (obj, "message %s sent to freed object=%p", 
                  sel_getName(sel), (void*)obj);
}

That function reports the error "message sent to freed object" (though I have never actually run into this message).

Initialization check

    if (initialize  &&  !cls->isInitialized()) {
        _class_initialize (_class_getNonMetaClass(cls, inst));
    }

I don't fully understand what the +initialize method is for here.

    // +initialize bits are stored on the metaclass only
    bool isInitialized() {
        return getMeta()->info & CLS_INITIALIZED;
    }

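But judging from the implementation of isInitialized(), the initialization state is stored on the metaclass, so my guess is that this initializes the metaclass or the class object. The post I mentioned above puts it this way: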

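If this is the first time the class is used and the initialize parameter is YES (initialize && !cls->isInitialized()), initialization is needed, that is, space for read-write data is set up. The write lock on runtimeLock is taken first, and then cls's initialize method is called. It doesn't matter if sel == initialize: initialize will be called one more time, but it has no effect, because cls->isInitialized() is already YES.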

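That roughly confirms my guess: this is where the class object or the metaclass gets initialized. One thing still puzzled me, though: if the class object hasn't been initialized yet, how can an instance of the class exist? Another post gave me the answer: +load is called when the class is loaded, before any message has been sent to it, while +initialize is sent lazily, just before the class receives its first message. Creating an instance is itself a message to the class (+alloc, for example), so +initialize has already run by the time any instance exists.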
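The "run once, just before the first message" behaviour can be sketched in a few lines of C. Everything here is hypothetical: struct toy_class, toy_class_initialize and toy_send are invented names, the boolean stands in for the CLS_INITIALIZED bit, and the real runtime has an intermediate "initializing" state and thread handling that this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_class {
    bool initialized;       /* plays the role of the CLS_INITIALIZED bit */
    int  initialize_calls;  /* how many times the toy +initialize ran    */
    int  messages_handled;
};

/* Toy +initialize: run the one-time setup and flip the flag. */
static void toy_class_initialize(struct toy_class *cls) {
    cls->initialize_calls++;
    cls->initialized = true;
}

/* Toy lookup entry: mirrors `if (initialize && !cls->isInitialized())`. */
static void toy_send(struct toy_class *cls, bool initialize) {
    if (initialize && !cls->initialized) {
        toy_class_initialize(cls);
    }
    /* Whenever initialization was requested, it has happened by now. */
    assert(!initialize || cls->initialized);
    cls->messages_handled++;
}
```

Sending any number of messages with initialize set to true triggers toy_class_initialize exactly once, while a caller that passes false (as class_getInstanceMethod does in the real source) never triggers it.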

The retry label (searching this class and its superclasses)

At this point the method list is locked with methodListLock.lock();. The comment in the source explains why:

The lock is held to make method-lookup + cache-fill atomic with respect to method addition. Otherwise, a category could be added but ignored indefinitely because the cache was re-filled with the old value after the cache flush on behalf of the category.

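Methods can be added dynamically at runtime, so the lock is there to make method lookup and cache fill a single atomic operation. Without it, a newly added category could be ignored indefinitely, because the cache flushed on its behalf might be refilled with the old value.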
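The race that the comment describes can be replayed deterministically, without real threads, by writing out the interleavings by hand. The sketch below is my own simplification: a class has one method and one cache slot, toy_add_category stands in for loading a category, and the "threads" A and B are just statements executed in a chosen order.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*imp_t)(void);

static int old_imp(void)      { return 1; }
static int category_imp(void) { return 2; }

struct toy_class {
    imp_t method_imp;   /* the single entry of a toy method list            */
    imp_t cached_imp;   /* the single slot of a toy cache; NULL when flushed */
};

/* What loading a category does here: replace the method, flush the cache. */
static void toy_add_category(struct toy_class *cls) {
    cls->method_imp = category_imp;
    cls->cached_imp = NULL;
}

/* An interleaving that is possible when lookup and fill are NOT atomic. */
static imp_t toy_unlocked_interleaving(void) {
    struct toy_class cls = { old_imp, NULL };
    imp_t found = cls.method_imp;   /* thread A: method-list lookup          */
    toy_add_category(&cls);         /* thread B: sneaks in between           */
    cls.cached_imp = found;         /* thread A: fills cache with old value  */
    return cls.cached_imp;          /* stale entry survives the flush        */
}

/* With the lock held, lookup + fill run as one unit, so B can only run
 * entirely before or entirely after A. */
static imp_t toy_locked_interleaving(int category_first) {
    struct toy_class cls = { old_imp, NULL };
    if (category_first) toy_add_category(&cls);
    cls.cached_imp = cls.method_imp;            /* A: lookup + fill together */
    if (!category_first) toy_add_category(&cls);
    /* The next lookup hits the cache or refills it from the method list. */
    imp_t next = cls.cached_imp ? cls.cached_imp : cls.method_imp;
    assert(next == category_imp);
    return next;
}
```

In the unlocked interleaving the cache ends up holding old_imp even though the category is installed, which is exactly the "added but ignored indefinitely" case; in both locked orderings the next lookup sees category_imp.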

typedef struct {
    SEL name;     // same layout as struct old_method
    void *unused;
    IMP imp;  // same layout as struct old_method
} cache_entry;

// before Objective-C 2.0
struct old_method {
    SEL method_name;
    char *method_types;
    IMP method_imp;
};

typedef struct old_method *Method;

// Objective-C 2.0
struct method_t {
    SEL name;
    const char *types;
    IMP imp;

    struct SortBySELAddress :
        public std::binary_function<const method_t&,
                                    const method_t&, bool>
    {
        bool operator() (const method_t& lhs,
                         const method_t& rhs)
        { return lhs.name < rhs.name; }
    };
};
typedef struct method_t *Method;

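1. Check whether the selector is a garbage-collection method that should be ignored. If so, fill the cache with _cache_fill(cls, (Method)entryp, sel);, point methodPC at its implementation, entryp->imp (which is actually the assembly entry point _objc_ignored_method), and jump to the done label. Otherwise go on to the next step. Here entryp is a cache_entry struct that gets cast to Method. As the code above shows, before Objective-C 2.0 cache_entry and the method struct were laid out almost identically; 2.0 added something I can't read at all (T_T), which looks like a nested comparator, SortBySELAddress, for ordering methods by selector address.
2. Search this class's cache, again through the assembly entry point _cache_getImp. If the IMP is found, jump to done; otherwise go on to the next step.
3. Nothing was found in the cache, so search this class's method list. If the method is found, fill the cache, point methodPC at its implementation and jump to done; otherwise go on to the next step.
4. Walk up the superclass chain, searching each superclass's cache and method list. The cache comes first, through yet another assembly entry point, _cache_getMethod (oddly, I only found this entry point in objc-msg-i386.s). Unlike before, the assembly entry point _objc_msgForward_impcache is passed in as the marker for forwarding entries in the cache. If a cached method is found, fill this class's cache, point methodPC at its implementation and jump to done. If a Method is found but its IMP turns out to be the forwarding entry _objc_msgForward_impcache, break out of the loop at once without caching anything and call the method resolver, that is, go to step 5. If the cache has no Method, search the method list; again jump to done if it is found, otherwise move on to the next superclass, and to the next step once the chain is exhausted.
5. When resolver is YES and triedResolver is NO (so this step runs at most once, since triedResolver is set to YES afterwards), enter the method resolver (dynamic method resolution). methodListLock is unlocked first, then _class_resolveMethod is called, which goes through _class_resolveInstanceMethod or _class_resolveClassMethod to send +resolveInstanceMethod: or +resolveClassMethod: to the class. At this point the programmer can dynamically add an IMP for the selector. When that is done, jump back to retry and start again from step 1. This is the last chance before message forwarding.
6. No implementation was found and the method resolver didn't help, so the message will be forwarded. A forwarding entry is added to the cache, methodPC is pointed at the assembly entry point _objc_msgForward_impcache, and control falls through to done.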
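The steps above can be condensed into a toy model in C. Please read it as a sketch under my own assumptions, not as the runtime's implementation: selectors are plain integers, the cache and the method list are small fixed arrays, forward_marker stands in for _objc_msgForward_impcache, the resolver is a single stored imp, and both the ignored-selector check (step 1) and the locking are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_METHODS 4

typedef int (*imp_t)(void);

struct toy_method { int sel; imp_t imp; };      /* sel 0 = unused slot */

struct toy_class {
    struct toy_class *superclass;
    struct toy_method methods[MAX_METHODS];     /* toy method list */
    struct toy_method cache[MAX_METHODS];       /* toy cache       */
    int   resolve_sel;      /* selector the toy resolver can provide */
    imp_t resolve_imp;      /* imp it adds for resolve_sel, or NULL  */
    int   resolver_calls;
};

static int forward_marker(void) { return -1; }  /* forwarding stand-in */
static int base_imp(void)       { return 10; }
static int dynamic_imp(void)    { return 20; }

static imp_t find_in(const struct toy_method *list, int sel) {
    for (int i = 0; i < MAX_METHODS; i++)
        if (list[i].sel == sel) return list[i].imp;
    return NULL;
}

static void add_to(struct toy_method *list, int sel, imp_t imp) {
    int slot = 0;                               /* overwrite slot 0 if full */
    for (int i = 0; i < MAX_METHODS; i++)
        if (list[i].sel == 0 || list[i].sel == sel) { slot = i; break; }
    list[slot].sel = sel;
    list[slot].imp = imp;
}

/* Toy method resolver: may add a method to the class dynamically. */
static void toy_resolve(struct toy_class *cls, int sel) {
    cls->resolver_calls++;
    if (cls->resolve_imp && cls->resolve_sel == sel)
        add_to(cls->methods, sel, cls->resolve_imp);
}

static imp_t toy_lookup_imp_or_forward(struct toy_class *cls, int sel,
                                       bool resolver) {
    assert(cls != NULL && sel != 0);
    bool tried_resolver = false;
    imp_t imp;
retry:
    /* Steps 2 and 3: this class's cache, then its method list. */
    if ((imp = find_in(cls->cache, sel))) return imp;
    if ((imp = find_in(cls->methods, sel))) goto fill;
    /* Step 4: each superclass's cache, then its method list. */
    for (struct toy_class *cur = cls->superclass; cur; cur = cur->superclass) {
        imp = find_in(cur->cache, sel);
        if (imp == forward_marker) break;   /* superclass forwards: stop */
        if (imp) goto fill;
        if ((imp = find_in(cur->methods, sel))) goto fill;
    }
    /* Step 5: give the resolver one chance, then start over. */
    if (resolver && !tried_resolver) {
        toy_resolve(cls, sel);
        tried_resolver = true;
        goto retry;
    }
    /* Step 6: nothing helped, fall back to forwarding. */
    imp = forward_marker;
fill:
    add_to(cls->cache, sel, imp);           /* always cache in the receiver's class */
    return imp;
}
```

Note that a hit in a superclass is cached in the class that received the message, not in the superclass, which matches the log_and_fill_cache(cls, curClass, meth, sel) calls in the real code.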

The done label

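First methodListLock is unlocked. Then an assert checks that there is no ignored selector whose implementation is not the ignored one (Apple itself calls this paranoia, a sanity check for a case that should never happen, which I find rather amusing):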

paranoia: look for ignored selectors with non-ignored implementations

Finally methodPC is returned.

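After that comes message forwarding. The handler installed through objc_setForwardHandler is not implemented in the Objective-C runtime (libobjc.dylib) but in CoreFoundation (CoreFoundation.framework), so I'll leave it out for now. Once I have studied that part I will write a separate post on message forwarding.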

As for the second approach mentioned at the start of the main text, method_getImplementation(): it first requires a call to class_getInstanceMethod(), and that function contains a warning:

#warning fixme build and search caches

    // Search method lists, try method resolver, etc.
    lookUpImpOrNil(cls, sel, nil, 
                   NO/*initialize*/, NO/*cache*/, YES/*resolver*/);

#warning fixme build and search caches

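Here lookUpImpOrNil is called but its return value is not used, and the code is marked "fixme build and search caches". My guess is that for some reason the cache cannot be searched at this point. The return _class_getMethod(cls, sel); that follows essentially just searches the method lists, and no message forwarding takes place.
This also matches Apple's comment on the function: class_getMethodImplementation may be faster than method_getImplementation(class_getInstanceMethod(cls, name)), because the former searches the cache, and a cache hit is much more efficient.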

The question from last time

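In my previous post, "How does the runtime find the IMP for a selector? (class methods and instance methods considered separately)", I left one question unanswered: why do class_getMethodImplementation() and method_getImplementation() return different values for an IMP that cannot be found?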

IMP method_getImplementation(Method m)
{
    return m ? m->imp : nil;
}

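After reading the source it is clear: if the method does not exist, method_getImplementation() simply returns nil. class_getMethodImplementation(), on the other hand, never returns nil for a valid class and selector. When the lookup fails it returns _objc_msgForward, the entry point into message forwarding, as its source shown earlier makes explicit. That also explains why every run gave me the same fixed address: it is most likely the address of _objc_msgForward itself, not something related to the memory address of an NSInvocation object as I first guessed. What happens behind that entry point (forwardInvocation and so on) is not open source, so I will look into it when I get the chance.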

Closing

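If any of the analysis or guesses above are wrong, please point it out so we can all improve. The posts I referred to while writing this are "How Objective-C Message Sending and Forwarding Work" and "Objective-C Source (2): +load and +initialize". I list them here with thanks to their authors. Compared with those, my post is pretty rough, but I'm only getting started, and I believe that one day I will write posts that good too.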

The runtime source version used in this post is objc4-680.
