Bypassing Docker: Killing Containers at Scale

{"type":"doc","content":[{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Introduction"},{"type":"text","text":": Replit is a browser-based integrated development environment (IDE) for cross-platform collaborative coding, and it has raised $20 million in a Series A round with A.Capital Ventures. In this article, Replit's engineers explain how they gave Replit users a smoother experience: by killing containers at scale."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To let anyone write code in a web browser on Replit, our backend infrastructure runs on preemptible virtual machines. In other words, the computer running your code can shut down at any time! When that happens, we reconnect the REPL (Read-Eval-Print Loop) quickly. Despite our best efforts, people were still seeing REPLs stuck connecting for a long time. After profiling and digging into the Docker source code, we found and fixed the problem. Our session connection error rate dropped from 3% to under 0.5%, and our 99th-percentile session boot time dropped from 2 minutes to 15 seconds."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Stuck REPLs had several causes, including unhealthy machines, race conditions that led to deadlocks, and slow container shutdown. This article focuses on how we fixed the last cause, slow container shutdown. Slow container shutdown affected nearly everyone using the platform and could leave a REPL inaccessible for up to a minute."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Replit Architecture"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before digging into how we fixed slow container shutdown, you need to know a little about Replit's architecture."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When you open a REPL, the browser opens a WebSocket that connects it to a Docker container running on a preemptible virtual machine. Each virtual machine runs something we call "},{"type":"codeinline","content":[{"type":"text","text":"conman"}]},{"type":"text","text":", which is short for container manager."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We must ensure that each REPL 
has only a single container at any time. Containers are designed to support multiplayer features, so it is important that every user in a REPL connects to the same container."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When a machine hosting these Docker containers shuts down, we have to wait for every container to be destroyed before we can start them again on other machines. This happens often because we use preemptible instances."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Below is the typical flow when trying to access a REPL on a mid-shutdown instance."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/bd\/bd7270cb6510f3f7808ce4995e7db177.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A user opens their REPL, which opens the IDE and then tries to connect to the backend evaluation server over a WebSocket."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The request hits a load balancer, which picks a conman instance to act as a proxy based on CPU usage."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A healthy, running conman receives the request. It notices that the request is for a container that lives on a different conman and proxies the request there."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Unfortunately, that conman is shutting down and rejects the WebSocket 
connection!"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The request keeps failing until one of the following happens:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Docker container is shut down and the REPL's container entry in the global store is removed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The conman finishes shutting down and is no longer reachable. In that case, the first conman removes the old REPL container entry and starts a new container."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Slow Container Shutdown"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A preemptible virtual machine gets 30 seconds to shut down cleanly before it is forcibly terminated. After some investigation, we found that we rarely finished shutting down within those 30 seconds. That pushed us to investigate further and instrument the machine shutdown routine."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After we added logging and metrics around machine shutdown, it became clear that calls to "},{"type":"codeinline","content":[{"type":"text","text":"docker kill"}]},{"type":"text","text":" were taking much longer than expected. During normal operation, "},{"type":"codeinline","content":[{"type":"text","text":"docker kill"}]},{"type":"text","text":" usually takes only a few milliseconds to kill a REPL container, but during shutdown we spent more than 20 seconds killing 100 to 200 containers at the same time."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Docker offers two ways to stop a container: "},{"type":"codeinline","content":[{"type":"text","text":"docker stop"}]},{"type":"text","text":" and "},{"type":"codeinline","content":[{"type":"text","text":"docker kill"}]},{"type":"text","text":". Docker stop 
sends the container a "},{"type":"codeinline","content":[{"type":"text","text":"SIGTERM"}]},{"type":"text","text":" signal and gives it a grace period to shut down gracefully. If the container has not shut down within the grace period, it is sent "},{"type":"codeinline","content":[{"type":"text","text":"SIGKILL"}]},{"type":"text","text":". We do not care about shutting containers down gracefully and would rather have "},{"type":"codeinline","content":[{"type":"text","text":"docker kill"}]},{"type":"text","text":" send "},{"type":"codeinline","content":[{"type":"text","text":"SIGKILL"}]},{"type":"text","text":", which should kill the container immediately. For some reason, "},{"type":"codeinline","content":[{"type":"text","text":"docker kill"}]},{"type":"text","text":" took far longer than a few seconds to deliver "},{"type":"codeinline","content":[{"type":"text","text":"SIGKILL"}]},{"type":"text","text":" to the containers. The theory did not match reality, so something else had to be going on."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To dig into this, here is a script that creates 200 Docker containers and measures how long it takes to kill them all at the same time."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"bash"},"content":[{"type":"text","text":"#!\/bin\/bash\nCOUNT=200\n# Start COUNT detached nginx containers named test-1 .. test-COUNT\necho \"Starting $COUNT containers...\"\nfor i in $(seq 1 $COUNT); do\n    printf .\n    docker run -d --name test-$i nginx > \/dev\/null 2>&1\ndone\n# Kill all of them with a single docker kill invocation and time it\necho -e \"\\nKilling $COUNT containers...\"\ntime $(docker kill $(docker container ls -a --filter \"name=test\" --format \"{{.ID}}\") > \/dev\/null 2>&1)\n# Remove the stopped containers\necho -e \"\\nCleaning up...\"\ndocker rm $(docker container ls -a --filter \"name=test\" --format \"{{.ID}}\") > \/dev\/null 2>&1\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On the same kind of virtual machine we run in production, a GCE n1-highmem-4 instance, it produces the following result:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"bash"},"content":[{"type":"text","text":"Starting 200 
containers...\n................................\nKilling 200 containers...\nreal 0m37.732s\nuser 0m0.135s\nsys 0m0.081s\nCleaning up...\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This confirmed our suspicion that something inside the Docker runtime was making shutdown so slow. It was time to dig into Docker itself."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Docker daemon has an option to "},{"type":"link","attrs":{"href":"https:\/\/docs.docker.com\/config\/daemon\/#enable-debugging?fileGuid=8eRR44o7rCEb1xBs","title":"","type":null},"content":[{"type":"text","text":"enable debug logging"}]},{"type":"text","text":". These logs let us see what is happening inside dockerd, and every entry carries a timestamp, so they can give some insight into where the time is being spent."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With debug logging enabled, let's rerun the script and look at dockerd's logs. Because there are 200 containers to handle, it emits a large number of log messages, so I hand-picked some of the meaningful entries."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"bash"},"content":[{"type":"text","text":"2020-12-04T04:30:53.084Z dockerd Calling GET \/v1.40\/containers\/json?all=1&filters=%7B%22name%22%3A%7B%22test%22%3Atrue%7D%7D\n2020-12-04T04:30:53.084Z dockerd Calling HEAD \/_ping\n2020-12-04T04:30:53.468Z dockerd Calling POST \/v1.40\/containers\/33f7bdc9a123\/kill?signal=KILL\n2020-12-04T04:30:53.468Z dockerd Sending kill signal 9 to container 33f7bdc9a1239a3e1625ddb607a7d39ae00ea9f0fba84fc2cbca239d73c7b85c\n2020-12-04T04:30:53.468Z dockerd Calling POST \/v1.40\/containers\/2bfc4bf27ce9\/kill?signal=KILL\n2020-12-04T04:30:53.468Z dockerd Sending kill signal 9 to container 
2bfc4bf27ce93b1cd690d010df329c505d51e0ae3e8d55c888b199ce0585056b\n2020-12-04T04:30:53.468Z dockerd Calling POST \/v1.40\/containers\/bef1570e5655\/kill?signal=KILL\n2020-12-04T04:30:53.468Z dockerd Sending kill signal 9 to container bef1570e5655f902cb262ab4cac4a873a27915639e96fe44a4381df9c11575d0\n...\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here we can see the requests to kill each container, and that "},{"type":"codeinline","content":[{"type":"text","text":"SIGKILL"}]},{"type":"text","text":" is sent to each container almost immediately."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Below are some log entries seen about 30 seconds after running "},{"type":"codeinline","content":[{"type":"text","text":"docker kill"}]},{"type":"text","text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"shell"},"content":[{"type":"text","text":"...\n2020-12-04T04:31:32.308Z dockerd Releasing addresses for endpoint test-1's interface on network bridge\n2020-12-04T04:31:32.308Z dockerd ReleaseAddress(LocalDefault\/172.17.0.0\/16, 172.17.0.2)\n2020-12-04T04:31:32.308Z dockerd Released address PoolID:LocalDefault\/172.17.0.0\/16, Address:172.17.0.2 Sequence:App: ipam\/default\/data, ID: LocalDefault\/172.17.0.0\/16, DBIndex: 0x0, Bits: 65536, Unselected: 65529, Sequence: (0xfa000000, 1)->(0x0, 2046)->(0x1, 1)->end Curr:202\n2020-12-04T04:31:32.308Z dockerd Releasing addresses for endpoint test-5's interface on network bridge\n2020-12-04T04:31:32.308Z dockerd ReleaseAddress(LocalDefault\/172.17.0.0\/16, 172.17.0.6)\n2020-12-04T04:31:32.308Z dockerd Released address PoolID:LocalDefault\/172.17.0.0\/16, Address:172.17.0.6 Sequence:App: ipam\/default\/data, ID: LocalDefault\/172.17.0.0\/16, DBIndex: 0x0, Bits: 65536, 
Unselected: 65530, Sequence: (0xda000000, 1)->(0x0, 2046)->(0x1, 1)->end Curr:202\n2020-12-04T04:31:32.308Z dockerd Releasing addresses for endpoint test-3's interface on network bridge\n2020-12-04T04:31:32.308Z dockerd ReleaseAddress(LocalDefault\/172.17.0.0\/16, 172.17.0.4)\n2020-12-04T04:31:32.308Z dockerd Released address PoolID:LocalDefault\/172.17.0.0\/16, Address:172.17.0.4 Sequence:App: ipam\/default\/data, ID: LocalDefault\/172.17.0.0\/16, DBIndex: 0x0, Bits: 65536, Unselected: 65531, Sequence: (0xd8000000, 1)->(0x0, 2046)->(0x1, 1)->end Curr:202\n2020-12-04T04:31:32.308Z dockerd Releasing addresses for endpoint test-2's interface on network bridge\n2020-12-04T04:31:32.308Z dockerd ReleaseAddress(LocalDefault\/172.17.0.0\/16, 172.17.0.3)\n2020-12-04T04:31:32.308Z dockerd Released address PoolID:LocalDefault\/172.17.0.0\/16, Address:172.17.0.3 Sequence:App: ipam\/default\/data, ID: LocalDefault\/172.17.0.0\/16, DBIndex: 0x0, Bits: 65536, Unselected: 65532, Sequence: (0xd0000000, 1)->(0x0, 2046)->(0x1, 1)->end Curr:202\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These logs do not give a full picture of everything dockerd is doing, but they suggest that dockerd may be spending a lot of time releasing network addresses."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At this point, I decided it was time to start digging into the Docker engine's source code and to build my own 
version of dockerd with some extra logging."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"I started by finding the code path that handles container kill requests. I added extra log messages that recorded how long different spans took, and eventually I found where all the time was going:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The engine sends "},{"type":"codeinline","content":[{"type":"text","text":"SIGKILL"}]},{"type":"text","text":" to the container and then waits for the container to stop running before it responds to the HTTP request ("},{"type":"link","attrs":{"href":"https:\/\/github.com\/docker\/engine\/blob\/ab373df1125b6002603456fd7f554ef370389ad9\/daemon\/kill.go#L174?fileGuid=8eRR44o7rCEb1xBs","title":"","type":null},"content":[{"type":"text","text":"source"}]},{"type":"text","text":")."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"bash"},"content":[{"type":"text","text":"