A “Firefighting Journey” Triggered by a Single Space - Troubleshooting a SOFA RPC Issue

{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Background"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Note: even if you are not familiar with "},{"type":"link","attrs":{"href":"https://help.aliyun.com/product/144917.html","title":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"SOFA RPC"}]},{"type":"text","marks":[{"type":"italic"}],"text":", this article offers practical troubleshooting techniques that I hope you will find useful."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A bank recently deployed a set of SOFA RPC applications to its test environment, consisting of a SOFA RPC Service and a SOFA RPC Reference. However, calls from the business application to the SOFA RPC service failed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The problem was strange, strange enough that frontline support and the customer spent a full day on it without finding the cause."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because the customer's project schedule was tight, the issue was escalated to me, and the firefighting began."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"How SOFA RPC works"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before getting to the investigation, here is a brief overview of how SOFA RPC works."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e2/e22d386122d9474d6a08676688aee27b.png","alt":"image.png","title":"image.png","style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When a SOFA RPC Service application starts, it obtains the address of SOFA Registry (the registry center) through ACVIP (called AntVIP on the public cloud). The ACVIP address must be set in the configuration files of both the RPC Service and the RPC Reference. The RPC Service then registers its RPC services with SOFA Registry."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When a SOFA RPC Reference application that references the service starts, it subscribes to the service's metadata (the IP, port, and so on of the SOFA RPC Service) from SOFA Registry. After receiving the subscription request, SOFA Registry pushes the publisher's metadata to the SOFA RPC Reference in real time."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once the SOFA RPC Reference has the metadata, it can extract the service address and make the call, shown in the diagram as the arrow from Reference to Service."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Symptoms"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Let us review the symptoms. When the SOFA RPC Reference application tried to call the SOFA RPC Service, it reported the error “Cannot get service address of service XXXXXXXXX”."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/d2/d2549e945de1f4d85b08eea42771d37b.png","alt":"image.png","title":"image.png","style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The customer also mentioned that the same code failed in the test environment but worked in the development environment."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The code contains two configuration files, application-test.properties and application-dev.properties, for the test and development environments respectively."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Note: SOFA applications can automatically load the configuration file that matches the environment (the -test file in the test environment, the -dev file in the development environment). For details on this feature, see "},{"type":"link","attrs":{"href":"https://help.aliyun.com/document_detail/133155.html","title":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"the documentation"}]},{"type":"text","marks":[{"type":"italic"}],"text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/eb/ebd3479a9ceb85fe46e0365a27cb10db.png","alt":"image.png","title":"image.png","style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Troubleshooting process"}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Technique 1: Narrow the scope using how the product works"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When troubleshooting a product issue, the most effective approach is to narrow the scope based on how the product works."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The error message tells us that the RPC Reference did not obtain the service address. What could cause that? From the diagram above, the possible causes are:"}]},
{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The RPC Reference is not connected to SOFA Registry"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The RPC Reference did not obtain the SOFA Registry address from ACVIP"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The connection on port 9600 between the RPC Reference and SOFA Registry is abnormal"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"SOFA Registry itself is abnormal"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The RPC Service did not publish its service successfully"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The connection on port 9600 between the RPC Service and SOFA Registry is abnormal"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The RPC Service did not obtain the SOFA Registry address from ACVIP"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The RPC Service application failed to start properly"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Checking each cause against how the product works narrowed the scope as follows."}]},
{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The RPC Reference is not connected to SOFA Registry. Result: ruled out ❎."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The RPC Reference did not obtain the SOFA Registry address from ACVIP. Result: ruled out ❎."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Running \""},{"type":"text","marks":[{"type":"italic"}],"text":"netstat -an | grep 9600"},{"type":"text","text":"\" showed a long-lived connection on port 9600 between the RPC Reference and SOFA Registry, so this cause is ruled out."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The connection on port 9600 between the RPC Reference and SOFA Registry is abnormal. Result: ruled out ❎."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The netstat output showed the connection in the "},{"type":"text","marks":[{"type":"italic"}],"text":"Established"},{"type":"text","text":" state, so this cause is ruled out."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"SOFA Registry itself is abnormal. Result: ruled out ❎."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The test and development environments share the same SOFA Registry, and the development environment works, so SOFA Registry is not the problem."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The RPC Service did not publish its service successfully."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The connection on port 9600 between the RPC Service and SOFA Registry is abnormal. Result: abnormal ✅"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Running \""},{"type":"text","marks":[{"type":"italic"}],"text":"netstat -an | grep 9600"},{"type":"text","text":"\" showed no long-lived connection on port 9600 between the RPC Service and SOFA Registry!"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The RPC Service did not obtain the SOFA Registry address from ACVIP. Result: abnormal ✅"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#404040","name":"user"}}],"text":"The ACVIP logs (/home/admin/logs/acvip-java-client) had not been generated, which indicates that the ACVIP framework was not loaded."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The RPC Service application failed to start properly. Result: ruled out ❎."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The application startup log, "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#404040","name":"user"}}],"text":"/home/admin/logs/"},{"type":"text","text":"stderr.log, was empty."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At this point we had established that the ACVIP framework was not being loaded in the RPC Service application, so the later steps could not proceed. We suspected the application code. However, "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#404040","name":"user"}}],"text":"/home/admin/logs/"},{"type":"text","text":"stderr.log and "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#404040","name":"user"}}],"text":"/home/admin/logs/"},{"type":"text","text":"stdout.log showed no exceptions, and the application started normally; it just did not load the ACVIP framework."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Technique 2: Bring in a third party to narrow the scope"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At this point the investigation stalled."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The customer believed the SOFA RPC framework was at fault, while our findings pointed to the customer's code or configuration. The customer stressed again that the development and test environments ran the same code and that the development environment worked."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To get the customer looking in the same direction as us (for development issues like this, disagreement on direction seriously hurts troubleshooting efficiency), we needed a third party to show indirectly that the problem was on the code side."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We sent the customer a minimal SOFA RPC demo that we had written and ran it with the test environment's configuration. The demo ran normally, which showed that the problem was on the customer's code side. Now the customer was on the same page and also helped check the code for anything that had not yet been examined."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Unfortunately, the customer still could not spot the problem in the code."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Technique 3: Compare to find a breakthrough"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The investigation stalled once more."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So we went through the problem again from the beginning. The development and test environments used the same code and differed only in their configuration files. Would the development configuration run successfully on the same RPC provider machine? To find out, we deleted application-dev.properties, kept only application-test.properties, and changed just the following three configuration properties."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/d8/d8136855d024a4bfd46ace0bcfc1d5b1.png","alt":"image.png","title":"image.png","style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With the three properties set to the development environment's values, the application ran successfully. With them changed back to the test environment's values, it failed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because both runs were on the same machine, many environmental differences could be ruled out, so we could start comparing logs."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We collected all the logs from both runs."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The successful run loaded ACVIP, so it created the "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#404040","name":"user"}}],"text":"/home/admin/logs/acvip-java-client directory and its files."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The failed run did not load ACVIP, so it did not create the "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#404040","name":"user"}}],"text":"/home/admin/logs/acvip-java-client directory."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#404040","name":"user"}}],"text":"Comparing the two stdout.log files revealed no differences either."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#404040","name":"user"}}],"text":"However, in the successful run's /home/admin/logs/acvip-java-client directory we found a STARTUP.log file that records the ACVIP initialization process. From this log we could read the time at which ACVIP was initialized."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#404040","name":"user"}}],"text":"Technique 4: Trace back in time to close in on the truth"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#404040","name":"user"}}],"text":"From the previous step, we had the ACVIP initialization time: "},{"type":"text","text":"2020-07-01T18:35:12,086."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/8b/8b0bedae20046d5869e0f18f3c1e860e.png","alt":"image.png","title":"image.png","style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#404040","name":"user"}}],"text":"What use is the ACVIP initialization time?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#404040","name":"user"}}],"text":"When information is too fragmented, we often fail to see its value. Once we find a thread that ties the fragments together, we may well end up with a very valuable story."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#404040","name":"user"}}],"text":"Now was the time to tie the fragments together, and the thread was the timestamp."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Since we knew when ACVIP was initialized, we only needed to care about what happened just before that moment."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Going back to stdout.log with this timestamp in mind, we found that at that moment the SOFA framework was initializing DRM (the dynamic configuration component), and that an error had been reported. 2020-07-01 18:35:12:at com.alipay.drm.client.remoting.client.ClientManager."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/f1/f14c708925a62d42b674a44f6d70c336.png","alt":"image.png","title":"image.png","style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This error appeared in both the working and the failing runs, with an identical call stack, so the error itself was not important."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"What mattered was that the SOFA framework was initializing the DRM component when the error occurred."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next, we turned to the DRM component's initialization and looked at /home/admin/logs/drm-boot.log from the working run. It contained the entry “Query zdrmdata url pool from antvip”, which means the DRM component obtained the DRM server's IP from ACVIP. We also noticed that DRM initialization started at 2020-07-01 18:35:11,947, which is earlier than the ACVIP initialization time (18:35:12,086)! This suggests that DRM triggered the ACVIP initialization."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"bgcolor","attrs":{"color":"#FADB14","name":"user"}}],"text":"2020-07-01 18:35:11,947 INFO  Start building distributed resource ..."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"..."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"2020-07-01 18:35:11,991 INFO  Init access key from system param: middleWarexxxx"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"2020-07-01 18:35:11,991 INFO  Init secretKey key from system param."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"2020-07-01 18:35:11,991 INFO  Init instance id name from system param: xxxInstanceID"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"..."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"bgcolor","attrs":{"color":"#FADB14","name":"user"}}],"text":"2020-07-01 18:35:13,205 INFO  Query zdrmdata url pool from antvip:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next, let us look at the DRM log of the failing run. It contains an error: "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"user"}}],"text":"“ERROR Query confreg http url failed!”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"bgcolor","attrs":{"color":"#FADB14","name":"user"}}],"text":"2020-07-01 19:38:44,511 INFO  Start building distributed resource ..."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"..."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"2020-07-01 19:38:44,557 INFO  Init access key from system param: middleWarexxxx"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"2020-07-01 19:38:44,557 INFO  Init secretKey key from system param."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"bgcolor","attrs":{"color":"#FADB14","name":"user"}}],"text":"2020-07-01 19:38:44,605 ERROR Query confreg http url failed!"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If we had fixated on this error, we might have been led in the wrong direction."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Comparing the two DRM logs, we found that the working run's log contains “"},{"type":"text","marks":[{"type":"italic"}],"text":"Init instance id name from system"},{"type":"text","text":"”, while the failing run's log does not. And the instanceID is required for ACVIP addressing."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Technique 5: Study the code logic and reproduce the problem"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The DRM logs showed that the DRM initialization took different paths in the two runs."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Log messages are written by Logger calls in the code, so we can search for a log message to find the corresponding logic (this is client-side code that both the customer and we can read)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Searching the code for “"},{"type":"text","marks":[{"type":"italic"}],"text":"Init secretKey key from system param"},{"type":"text","text":"”, we found a check just before the instance id is loaded: the instance id is loaded only if antCloud is true."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Whether antCloud is true depends on whether the value of com.alipay.env in the configuration file is "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"user"}}],"text":"shared"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/79/79910dbace792b0d1c36107d9525ad1b.png","alt":"image.png","title":"image.png","style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Our code uses"},{"type":"text","marks":[{"type":"strong"}],"text":" equalsIgnoreCase"},{"type":"text","text":" for the comparison, "},{"type":"text","marks":[{"type":"strong"}],"text":"but it does not handle whitespace around the value (there is no trim). So when the value of com.alipay.env in the configuration file is "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"user"}},{"type":"strong"}],"text":"shared followed by a space"},{"type":"text","marks":[{"type":"strong"}],"text":", rather than shared, ACVIP initialization is never triggered."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To verify this, we checked stdout.log again."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The failing run logged:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Not find key com.alipay.env in Java -D argument, put value shared  into System"},{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#F5222D","name":"user"}}],"text":"(there is an extra space after shared)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The working run logged:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Not find key com.alipay.env in Java -D argument, put value shared into System"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A screenshot of the configuration file cannot show whether there is a space after shared."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We immediately ran an experiment locally and, sure enough, reproduced the problem."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Solution"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After the customer removed the trailing space after shared in application-test.properties, the problem was resolved."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Summary"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Finally, I have summarized these techniques in the flowchart below."}]},{"type":"image","attrs":{"src":"https://static001.infoq.cn/static/write/img/img-copy-disabled.4f2g7h.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
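The root cause described above can be reproduced with a few lines of plain Java. This is only a sketch, not the DRM client's actual code, which the article shows solely as a screenshot: the class and method names are hypothetical, and the only details taken from the article are the property key com.alipay.env, the expected value shared, and the use of equalsIgnoreCase without trim. It also assumes the value is read with java.util.Properties, which keeps trailing whitespace in a value; the real loading path in a SOFA application may differ.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical names throughout. Only the key, the value "shared", and the
// equalsIgnoreCase-without-trim comparison come from the article.
final class EnvFlagCheck {

    // Comparison as described in the article: no trim before equalsIgnoreCase.
    static boolean isSharedStrict(String env) {
        return "shared".equalsIgnoreCase(env);
    }

    // Defensive variant: trim the configured value before comparing.
    static boolean isSharedTrimmed(String env) {
        return env != null && "shared".equalsIgnoreCase(env.trim());
    }

    // java.util.Properties skips whitespace before a value but keeps
    // whitespace at the end of the line as part of the value.
    static String loadEnv(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("com.alipay.env");
    }

    public static void main(String[] args) {
        String clean = loadEnv("com.alipay.env=shared\n");
        String padded = loadEnv("com.alipay.env=shared \n");

        // Brackets make the otherwise invisible trailing space visible.
        System.out.println("[" + clean + "] strict=" + isSharedStrict(clean));
        // prints: [shared] strict=true
        System.out.println("[" + padded + "] strict=" + isSharedStrict(padded)
                + " trimmed=" + isSharedTrimmed(padded));
        // prints: [shared ] strict=false trimmed=true
    }
}
```

Printing a configuration value between delimiters, as main does here, is a cheap way to expose trailing whitespace that neither a screenshot nor an ordinary log line reveals.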