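Linux C: resolving symbol conflicts caused by third-party .so libraries that bundle different OpenSSL versions

1. A strange symptom

One feature (sending messages to a DingTalk group through a DingTalk group robot) uses libcurl, so the application links against libcurl. A very odd problem then appeared:

The program compiled and started normally, but it hung as soon as it sent an HTTPS POST request. With libcurl in VERBOSE mode, the output stopped at "ALPN, offering http/1.1" and one CPU core stayed at 100%. A separate test project exercising the same libcurl functionality worked fine; the problem appeared only inside the application.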

* TCP_NODELAY set
* Connected to oapi.dingtalk.com (203.119.206.75) port 443 (#0)
* ALPN, offering http/1.1

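I also tried capturing packets with Wireshark, to no avail: once the TCP connection to the peer was established, the hung program did nothing more. The encrypted exchange that should have followed never started.

In short, libcurl worked when built into a standalone program, but as soon as it was integrated into the existing application it hung with one CPU core fully loaded.

Running under a debugger and interrupting the hung program showed the same call stack every time:

[Figure: debugger call stack at the point of the hang]

2. Finding the cause

2.1. Analysis

Judging from the symptom, libcurl was ending up in the wrong code when it called SSL-related functions. I was confident the encryption code was involved because another of our applications also integrates libcurl and works correctly, and the only difference is that it connects over http rather than https.

A search turned up the article Shared Library Symbol Conflicts (on Linux). One of its figures describes our situation:

[Figure: symbol conflict diagram from that article]

Most likely, symbols in another third-party library used by the application were conflicting with symbols needed by libcurl. The prime suspects were the ones seen when interrupting the program in the debugger: OPENSSL_sk_insert and the functions prefixed with X509_STORE.

2.2. Confirming which library causes the conflict

Following the article, I inspected the libraries with nm. One library used by the application contains a large number of symbols prefixed with OPENSSL_ and X509_STORE_, all of them defined (type T) and exported globally. These are OpenSSL symbols: the third-party library has statically linked the whole of OpenSSL and exported every one of its symbols, needed or not. As a result, when libcurl calls OpenSSL functions that it expects to come from the system's shared OpenSSL library, the calls bind to the copy inside the third-party library instead. That copy is presumably a different version from the OpenSSL shared library installed on the system, which leads to unpredictable failures.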

X509_STORE_ symbols in the third-party library

faund@Sirius:/usr/lib$ nm -CDa libthostmduserapi_se_v6.3.15.so | grep X509_STORE_
00000000002f3720 T X509_STORE_add_cert
00000000002f35f0 T X509_STORE_add_crl
00000000002f2c20 T X509_STORE_add_lookup
00000000002dbf20 T X509_STORE_CTX_cleanup
00000000002dc3b0 T X509_STORE_CTX_free
00000000002dad70 T X509_STORE_CTX_get0_cert
00000000002dacf0 T X509_STORE_CTX_get0_chain
00000000002dad10 T X509_STORE_CTX_get0_current_crl
00000000002dad00 T X509_STORE_CTX_get0_current_issuer
00000000002daeb0 T X509_STORE_CTX_get0_param
00000000002dad20 T X509_STORE_CTX_get0_parent_ctx
00000000002dae80 T X509_STORE_CTX_get0_policy_tree
00000000002f2810 T X509_STORE_CTX_get0_store
00000000002dad80 T X509_STORE_CTX_get0_untrusted
00000000002f3310 T X509_STORE_CTX_get1_certs
00000000002db210 T X509_STORE_CTX_get1_chain
00000000002f31a0 T X509_STORE_CTX_get1_crls
00000000002f2fb0 T X509_STORE_CTX_get1_issuer
00000000002f2ed0 T X509_STORE_CTX_get_by_subject
00000000002dae30 T X509_STORE_CTX_get_cert_crl
00000000002dae20 T X509_STORE_CTX_get_check_crl
00000000002dadf0 T X509_STORE_CTX_get_check_issued
00000000002dae40 T X509_STORE_CTX_get_check_policy
00000000002dae00 T X509_STORE_CTX_get_check_revocation
00000000002dae70 T X509_STORE_CTX_get_cleanup
00000000002dacd0 T X509_STORE_CTX_get_current_cert
00000000002dac90 T X509_STORE_CTX_get_error
00000000002dacb0 T X509_STORE_CTX_get_error_depth
00000000002db230 T X509_STORE_CTX_get_ex_data
00000000002dae90 T X509_STORE_CTX_get_explicit_policy
00000000002dae10 T X509_STORE_CTX_get_get_crl
00000000002dade0 T X509_STORE_CTX_get_get_issuer
00000000002dae50 T X509_STORE_CTX_get_lookup_certs
00000000002dae60 T X509_STORE_CTX_get_lookup_crls
00000000002daea0 T X509_STORE_CTX_get_num_untrusted
00000000002f3490 T X509_STORE_CTX_get_obj_by_subject
00000000002dadd0 T X509_STORE_CTX_get_verify
00000000002dadb0 T X509_STORE_CTX_get_verify_cb
00000000002dbfc0 T X509_STORE_CTX_init
00000000002db060 T X509_STORE_CTX_new
00000000002db0b0 T X509_STORE_CTX_purpose_inherit
00000000002dad40 T X509_STORE_CTX_set0_crls
00000000002daec0 T X509_STORE_CTX_set0_dane
00000000002daed0 T X509_STORE_CTX_set0_param
00000000002dad50 T X509_STORE_CTX_set0_trusted_stack
00000000002dad90 T X509_STORE_CTX_set0_untrusted
00000000002dbee0 T X509_STORE_CTX_set0_verified_chain
00000000002dad30 T X509_STORE_CTX_set_cert
00000000002dace0 T X509_STORE_CTX_set_current_cert
00000000002daf00 T X509_STORE_CTX_set_default
00000000002daf50 T X509_STORE_CTX_set_depth
00000000002daca0 T X509_STORE_CTX_set_error
00000000002dacc0 T X509_STORE_CTX_set_error_depth
00000000002db240 T X509_STORE_CTX_set_ex_data
00000000002daf40 T X509_STORE_CTX_set_flags
00000000002db200 T X509_STORE_CTX_set_purpose
00000000002daf30 T X509_STORE_CTX_set_time
00000000002db1f0 T X509_STORE_CTX_set_trust
00000000002dadc0 T X509_STORE_CTX_set_verify
00000000002dada0 T X509_STORE_CTX_set_verify_cb
00000000002f2ca0 T X509_STORE_free
00000000002f2670 T X509_STORE_get0_objects
00000000002f2680 T X509_STORE_get0_param
00000000002f2780 T X509_STORE_get_cert_crl
00000000002f2760 T X509_STORE_get_check_crl
00000000002f2700 T X509_STORE_get_check_issued
00000000002f27a0 T X509_STORE_get_check_policy
00000000002f2720 T X509_STORE_get_check_revocation
00000000002f2800 T X509_STORE_get_cleanup
00000000002f2820 T X509_STORE_get_ex_data
00000000002f2740 T X509_STORE_get_get_crl
00000000002f26e0 T X509_STORE_get_get_issuer
00000000002f27c0 T X509_STORE_get_lookup_certs
00000000002f27e0 T X509_STORE_get_lookup_crls
00000000002f26a0 T X509_STORE_get_verify
00000000002f26c0 T X509_STORE_get_verify_cb
00000000002f28a0 T X509_STORE_lock
00000000002f2b40 T X509_STORE_new
00000000002f2840 T X509_STORE_set1_param
00000000002f2770 T X509_STORE_set_cert_crl
00000000002f2750 T X509_STORE_set_check_crl
00000000002f26f0 T X509_STORE_set_check_issued
00000000002f2790 T X509_STORE_set_check_policy
00000000002f2710 T X509_STORE_set_check_revocation
00000000002f27f0 T X509_STORE_set_cleanup
00000000002f2870 T X509_STORE_set_depth
00000000002f2830 T X509_STORE_set_ex_data
00000000002f2890 T X509_STORE_set_flags
00000000002f2730 T X509_STORE_set_get_crl
00000000002f26d0 T X509_STORE_set_get_issuer
00000000002f27b0 T X509_STORE_set_lookup_certs
00000000002f27d0 T X509_STORE_set_lookup_crls
00000000002f2860 T X509_STORE_set_purpose
00000000002f2850 T X509_STORE_set_trust
00000000002f2690 T X509_STORE_set_verify
00000000002f26b0 T X509_STORE_set_verify_cb
00000000002f28b0 T X509_STORE_unlock
00000000002f2b00 T X509_STORE_up_ref

X509_STORE_ symbols in libcurl

faund@Sirius:/usr/lib/x86_64-linux-gnu$ nm -CDa libcurl.so | grep X509_STORE_
                 U X509_STORE_add_lookup
                 U X509_STORE_set_flags

OPENSSL_ symbols in the third-party library

faund@Sirius:/usr/lib$ nm -CDa libthostmduserapi_se_v6.3.15.so | grep OPENSSL_
0000000000271490 T OPENSSL_asc2uni
00000000001d44e0 T OPENSSL_atexit
00000000001db850 T OPENSSL_atomic_add
00000000001d70c0 T OPENSSL_buf2hexstr
00000000001dba10 T OPENSSL_cleanse
00000000001d4b90 T OPENSSL_cleanup
00000000001e9ec0 T OPENSSL_config
00000000001e9fb0 T OPENSSL_die
00000000002d4c90 T OPENSSL_gmtime
00000000002d4ac0 T OPENSSL_gmtime_adj
00000000002d49f0 T OPENSSL_gmtime_diff
00000000001d6ff0 T OPENSSL_hexchar2int
00000000001d7240 T OPENSSL_hexstr2buf
00000000001db880 T OPENSSL_ia32_cpuid
00000000001dbbd0 T OPENSSL_ia32_rdrand
00000000001dbbf0 T OPENSSL_ia32_rdrand_bytes
00000000001dbc50 T OPENSSL_ia32_rdseed
00000000001dbc70 T OPENSSL_ia32_rdseed_bytes
00000000001d4520 T OPENSSL_init_crypto
0000000000244d80 T OPENSSL_INIT_free
0000000000244df0 T OPENSSL_INIT_new
0000000000244da0 T OPENSSL_INIT_set_config_appname
00000000001dbb10 T OPENSSL_instrument_bus
00000000001dbb60 T OPENSSL_instrument_bus2
00000000001e9ef0 T OPENSSL_isservice
00000000001d4f50 T OPENSSL_LH_delete
00000000001d51f0 T OPENSSL_LH_doall
00000000001d4da0 T OPENSSL_LH_doall_arg
00000000001d4ec0 T OPENSSL_LH_error
00000000001d4ed0 T OPENSSL_LH_free
00000000001d4ea0 T OPENSSL_LH_get_down_load
00000000001d5250 T OPENSSL_LH_insert
00000000001d5120 T OPENSSL_LH_new
00000000001d4e90 T OPENSSL_LH_num_items
00000000001d5480 T OPENSSL_LH_retrieve
00000000001d4eb0 T OPENSSL_LH_set_down_load
00000000001d4e10 T OPENSSL_LH_strhash
00000000002455b0 T OPENSSL_load_builtin_modules
00000000001d6f70 T OPENSSL_memcmp
0000000000867cfc B OPENSSL_NONPIC_relocated
00000000001db870 T OPENSSL_rdtsc
00000000001e9f00 T OPENSSL_showfatal
00000000002295e0 T OPENSSL_sk_deep_copy
0000000000229330 T OPENSSL_sk_delete
00000000002293f0 T OPENSSL_sk_delete_ptr
0000000000229760 T OPENSSL_sk_dup
0000000000229820 T OPENSSL_sk_find
0000000000229290 T OPENSSL_sk_find_ex
00000000002291c0 T OPENSSL_sk_free
0000000000229430 T OPENSSL_sk_insert
0000000000229170 T OPENSSL_sk_is_sorted
0000000000229550 T OPENSSL_sk_new
00000000002295d0 T OPENSSL_sk_new_null
0000000000229100 T OPENSSL_sk_num
00000000002293b0 T OPENSSL_sk_pop
0000000000229200 T OPENSSL_sk_pop_free
0000000000229540 T OPENSSL_sk_push
0000000000229140 T OPENSSL_sk_set
00000000002290e0 T OPENSSL_sk_set_cmp_func
00000000002293d0 T OPENSSL_sk_shift
0000000000229180 T OPENSSL_sk_sort
0000000000229530 T OPENSSL_sk_unshift
0000000000229110 T OPENSSL_sk_value
0000000000229260 T OPENSSL_sk_zero
00000000001d7200 T OPENSSL_strlcat
00000000001d71a0 T OPENSSL_strlcpy
00000000001d6fc0 T OPENSSL_strnlen
00000000001d4b20 T OPENSSL_thread_stop
00000000002713e0 T OPENSSL_uni2asc
0000000000271710 T OPENSSL_uni2utf8
0000000000271540 T OPENSSL_utf82uni
00000000001dbaa0 T OPENSSL_wipe_cpu

OPENSSL_ symbols in libcurl

faund@Sirius:/usr/lib/x86_64-linux-gnu$ nm -CDa libcurl.so | grep OPENSSL_
0000000000000000 A CURL_OPENSSL_4
                 U OPENSSL_load_builtin_modules
                 U OPENSSL_sk_num
                 U OPENSSL_sk_pop
                 U OPENSSL_sk_pop_free
                 U OPENSSL_sk_value
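
The nm listings show the conflict statically. The same question can also be asked at run time: which loaded object does a symbol name resolve to in the global scope? The following is a minimal sketch, assuming glibc (dladdr() and RTLD_DEFAULT are GNU extensions there). The helper name symbol_owner is hypothetical and not part of any library.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* dladdr() and RTLD_DEFAULT are GNU extensions in glibc */
#endif
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: return the path of the loaded object that the
 * default (global) lookup order resolves `symbol` to, or NULL when the
 * symbol cannot be found. */
static const char *symbol_owner(const char *symbol)
{
    assert(symbol != NULL);

    /* RTLD_DEFAULT searches the global scope in load order */
    void *addr = dlsym(RTLD_DEFAULT, symbol);
    if (addr == NULL)
        return NULL;

    /* Map the resolved address back to the object that contains it */
    Dl_info info;
    if (dladdr(addr, &info) == 0 || info.dli_fname == NULL)
        return NULL;

    return info.dli_fname;
}
```

Called inside the affected application as symbol_owner("OPENSSL_sk_num"), this would be expected to report the third-party library rather than the system OpenSSL library if the conflict described above is present. On older glibc versions the program has to be linked with -ldl.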

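3. Many suggested solutions

With the cause identified, the next step was to find a fix.

We do not have the source code of the third-party library, so the approach from the article above is not available to us: rebuilding the shared library so that it exports only the necessary symbols (compile with -fvisibility=hidden and mark the public functions in the source with __attribute__ ((visibility ("default")))). Another way was needed.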
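
For readers who do control the library source, the visibility approach looks roughly like the sketch below. The names mylib_checksum and mylib_internal_mix are hypothetical and only illustrate the pattern: one function is explicitly exported and one helper is explicitly hidden.

```c
#include <assert.h>
#include <stddef.h>

/* Internal helper: hidden, so it does not appear in the dynamic symbol
 * table of the shared library and cannot clash with other libraries. */
__attribute__((visibility("hidden")))
unsigned mylib_internal_mix(unsigned acc, unsigned char byte)
{
    return acc * 31u + byte;
}

/* Public API: explicitly exported, even when the library is compiled
 * with -fvisibility=hidden. */
__attribute__((visibility("default")))
unsigned mylib_checksum(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);

    unsigned acc = 0;
    for (size_t i = 0; i < len; i++)
        acc = mylib_internal_mix(acc, data[i]);
    return acc;
}
```

Built with something like gcc -shared -fPIC -fvisibility=hidden, nm -D on the resulting library should list mylib_checksum as a defined symbol and omit the helper. The third-party library in this post did the opposite and exported everything.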

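I found many suggestions:

Static and shared library symbol conflicts?

This suggests changing the link options so that only the required symbols are exported, and points to the GNU manual for the options. But we cannot control the third-party library that over-exports symbols. libcurl could be rebuilt, but it is already well behaved: it exports only the symbols it must, and it links dynamically against the system's OpenSSL shared library. So this option does not help us.

Conflict between two libraries containing the same symbol table on Linux (original title: Linux下包含相同符號表的兩個庫的衝突問題(鬱悶))

How to resolve same-name symbol conflicts on Linux (original title: linux 下同名符號衝突問題解決方法)

These two were found on CSDN. Their method is to add link options when building the .so: -Wl,-Bsymbolic,--version-script,version, where the script in the file named version specifies which functions are exported. Since we have no source code for the third-party library, this is useless to us for the same reason as the previous suggestion. Linking two shared libraries with some of the same symbols says much the same: besides the options above, it also mentions the __attribute__ ((visibility ("default"))) approach.
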

What should I do if two libraries provide a function with the same name generating a conflict?

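The answer by mouviciel suggests loading the two conflicting libraries separately and dynamically with dlopen(), dlsym() and dlclose(), closing each one with dlclose() as soon as it has been used. In our application, however, the third-party library must stay loaded the whole time, so this does not work either.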
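
The load, use, close pattern from that answer can be sketched as follows. This is an illustration only: the helper call_once is hypothetical, and libm's cos stands in for a function from one of the conflicting libraries.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: load `soname`, call its double(double) function
 * `func` once with `arg`, store the result in `out`, then unload.
 * Returns 0 on success and -1 on any failure. */
static int call_once(const char *soname, const char *func, double arg, double *out)
{
    assert(soname != NULL && func != NULL && out != NULL);

    void *handle = dlopen(soname, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
        return -1;

    dlerror();                                  /* clear any stale error */
    double (*fn)(double) = NULL;
    *(void **)(&fn) = dlsym(handle, func);      /* POSIX-style assignment */
    if (dlerror() != NULL || fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *out = fn(arg);
    dlclose(handle);                            /* release the library immediately */
    return 0;
}
```

For example, call_once("libm.so.6", "cos", 0.0, &result) loads the math library, calls cos and unloads it again. The pattern only helps when neither library has to stay resident, which is exactly what rules it out here.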

 

How can I link with (or work around) two third-party static libraries that define the same symbols?

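This person has real perseverance: he renamed the conflicting symbols in the OpenSSL source one by one, to the point of dreaming about editing OpenSSL source code.

Solving a symbol conflict problem (original title: 符號衝突問題解決)

This suggests symbol renaming: use objcopy --redefine-sym to rename the symbols in the third-party library that we cannot control. Unfortunately, objcopy's --redefine-sym option had no effect on the .so file, so this did not work either.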

 

linking 2 conflicting versions of a libraries

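This one says that dlopen(..., RTLD_LOCAL); lets the third-party library work properly, and it also suggests building libcurl with OpenSSL and its other dependencies linked statically and their symbols hidden. I tried the RTLD_LOCAL approach: everything compiled, but at run time I got the error: invalid mode for dlopen(). This was in fact very close to the solution, but at the time I did not understand why the answer as given failed for me.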
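
One plausible explanation for that error, which the original post does not confirm: on glibc, RTLD_LOCAL is defined as 0, and dlopen() rejects any mode that contains neither RTLD_LAZY nor RTLD_NOW with exactly this message. The sketch below, which assumes glibc and uses libm.so.6 as a stand-in library, shows the difference. The helper try_dlopen is hypothetical.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: try to dlopen `soname` with `flags`.
 * Returns 1 on success, 0 on failure. On failure the dlerror() text is
 * copied into `errbuf` (always NUL-terminated). */
static int try_dlopen(const char *soname, int flags, char *errbuf, size_t errlen)
{
    assert(soname != NULL && errbuf != NULL && errlen > 0);
    errbuf[0] = '\0';

    void *handle = dlopen(soname, flags);
    if (handle == NULL) {
        const char *msg = dlerror();
        if (msg != NULL) {
            strncpy(errbuf, msg, errlen - 1);
            errbuf[errlen - 1] = '\0';
        }
        return 0;
    }

    dlclose(handle);
    return 1;
}
```

Under these assumptions, try_dlopen("libm.so.6", RTLD_LOCAL, ...) fails because no binding mode is given, while try_dlopen("libm.so.6", RTLD_LAZY | RTLD_LOCAL, ...) succeeds. RTLD_LOCAL on its own is therefore not a complete mode argument.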

 

How can I remove a symbol from a shared object?

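This one considers deleting the unwanted symbols with objcopy. I tried it: objcopy -N reported no error and printed nothing, and I was briefly pleased, since UNIX has always held that no news is good news. But when I checked again with nm, the symbol I wanted to delete was still there.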

 

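4. A sudden solution

I was ready to change course: as a last resort, move the libcurl-dependent feature into a separate program that communicates with the original application, or fall back on one of the alternatives recorded under the failed attempts at the end of this post. Just then, while browsing more or less at random, I came across this article:

Dynamic loading of shared library with RTLD_DEEPBIND

As soon as I passed RTLD_LAZY | RTLD_LOCAL | RTLD_DEEPBIND to dlopen, the program worked. It worked!

The problem that had plagued me for days was solved, just like that.

When we are supposed to use RTLD_DEEPBIND? explains why: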

You should use RTLD_DEEPBIND when you want to ensure that symbols looked up in the loaded library start within the library, and its dependencies before looking up the symbol in the global namespace.

This allows you to have the same named symbol being used in the library as might be available in the global namespace because of another library carrying the same definition; which may be wrong, or cause problems.

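In other words, use RTLD_DEEPBIND when a library loaded with dlopen should look up symbols in itself and its dependencies first, and only then in the global scope. The loaded library can then use symbols whose names also exist in the global scope, where another library may have introduced a definition that is wrong for it or causes problems.

libcurl is now loaded at run time with dlopen and the RTLD_DEEPBIND flag, so it looks up symbols in libcurl.so and the libraries it depends on first, and no longer binds to the problematic global symbols.

Relevant code fragment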

// Send a RESTful POST request with libcurl, loaded at run time through dlopen()
#include <dlfcn.h>       // dlopen, dlsym, dlerror, RTLD_DEEPBIND
#include <cstdlib>       // exit
#include <string>
#include <curl/curl.h>   // types and constants only; libcurl is not linked at build time

// Resolve one symbol from libcurl; log and exit if it is missing.
// dlerror() clears its message once read, so it is read exactly once per lookup.
// LOGINFO is the application's own logging macro.
static void *load_curl_symbol(void *handle, const char *name)
{
    dlerror();    /* Clear any existing error */
    void *sym = dlsym(handle, name);
    const char *error = dlerror();
    if (error != NULL) {
        LOGINFO << "Failed to load symbol " << name << " from libcurl: " << error;
        exit(1);
    }
    return sym;
}

int DingDing::post_request_curl(const std::string& url, const std::string& jsonBody, std::string& strReturn)
{
    // libcurl is loaded once and stays open for the lifetime of the process
    // (this lazy initialization is not thread-safe as written)
    static void *handle = NULL;
    static CURLcode (*f_global_init)(long) = NULL;
    static CURL *(*f_easy_init)(void) = NULL;
    static struct curl_slist *(*f_slist_append)(struct curl_slist *, const char *) = NULL;
    static void (*f_slist_free_all)(struct curl_slist *) = NULL;
    static CURLcode (*f_easy_setopt)(CURL *, CURLoption, ...) = NULL;
    static CURLcode (*f_easy_perform)(CURL *) = NULL;
    static void (*f_easy_cleanup)(CURL *) = NULL;
    static void (*f_global_cleanup)(void) = NULL;

    if (!handle) {
        // RTLD_DEEPBIND: libcurl resolves symbols from itself and its own
        // dependencies before the global scope. RTLD_LOCAL is the default.
        handle = dlopen("libcurl.so", RTLD_LAZY | RTLD_DEEPBIND);
        if (!handle) {
            LOGINFO << "Failed to load libcurl: " << dlerror();
            exit(1);
        }
        f_global_init = (CURLcode (*)(long)) load_curl_symbol(handle, "curl_global_init");
        f_easy_init = (CURL *(*)(void)) load_curl_symbol(handle, "curl_easy_init");
        f_slist_append = (struct curl_slist *(*)(struct curl_slist *, const char *))
                         load_curl_symbol(handle, "curl_slist_append");
        f_slist_free_all = (void (*)(struct curl_slist *)) load_curl_symbol(handle, "curl_slist_free_all");
        f_easy_setopt = (CURLcode (*)(CURL *, CURLoption, ...)) load_curl_symbol(handle, "curl_easy_setopt");
        f_easy_perform = (CURLcode (*)(CURL *)) load_curl_symbol(handle, "curl_easy_perform");
        f_easy_cleanup = (void (*)(CURL *)) load_curl_symbol(handle, "curl_easy_cleanup");
        f_global_cleanup = (void (*)(void)) load_curl_symbol(handle, "curl_global_cleanup");
    }

    f_global_init(CURL_GLOBAL_ALL);
    CURL *ch = f_easy_init();
    if (!ch) {
        LOGINFO << "curl_easy_init failed";
        f_global_cleanup();
        return CURLE_FAILED_INIT;
    }

    struct curl_slist *chunk = NULL;
    chunk = f_slist_append(chunk, "Content-Type: application/json");

    f_easy_setopt(ch, CURLOPT_VERBOSE, 1L);
    f_easy_setopt(ch, CURLOPT_HTTPHEADER, chunk);
    f_easy_setopt(ch, CURLOPT_POSTFIELDS, jsonBody.c_str());
    // Peer certificate verification is switched off, as in the original code;
    // this is not advisable for production use
    f_easy_setopt(ch, CURLOPT_SSL_VERIFYPEER, 0L);
    f_easy_setopt(ch, CURLOPT_URL, url.c_str());

    // No write callback is set, so libcurl prints the response body to stdout
    // and strReturn is left untouched
    CURLcode rv = f_easy_perform(ch);
    if (rv == CURLE_OK)
        LOGINFO << "libcurl request sent successfully";
    else
        LOGINFO << "libcurl request failed";

    f_slist_free_all(chunk);    // the header list was never freed in the first version
    f_easy_cleanup(ch);
    f_global_cleanup();
    return rv;
}

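Failed attempts

1. Linking against libcurl.a. Even so, libcurl still uses the shared OpenSSL library, so the program still ends up calling the wrong OpenSSL version and still hangs.

2. Statically linking OpenSSL into the libcurl .so. libcurl has no build option for this. I read many suggestions from other people, which involve quite a few tricks, but I did not follow through (mainly because the "sudden solution" happened while I was trying this :)). If you have tried it, please tell me whether it works.

3. Replacing libcurl with an HTTP library that does not use OpenSSL (curl is generous enough to publish a list of all its competitors). This proved next to impossible: OpenSSL is the de facto standard, and libraries with HTTPS support, such as beast, all list OpenSSL among their requirements.

Further reading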

https://cseweb.ucsd.edu/~gbournou/CSE131/the_inside_story_on_shared_libraries_and_dynamic_loading.pdf

 
