Superpack: Pushing the limits of compression in Facebook's mobile apps

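{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Managing app size at Facebook is a unique challenge: every day, developers check in large volumes of code, and each line of code translates into additional bits in the apps that people ultimately download onto their phones. Left unchecked, this added code would make the apps bigger and bigger until the time it takes to download them became unacceptable."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Compression is one of the methods we use to keep app size to a minimum. Compressed files take up less space, which means smaller apps download faster and use less bandwidth for the billions of people who use them around the world. Such savings are especially important in regions where mobile bandwidth is limited, because limited bandwidth makes large apps costly to download. But compression alone is not enough to keep pace with all the updates we make and the features we add to our apps. So we developed a technique called Superpack, which combines compiler analysis with data compression to unlock optimizations beyond the reach of traditional compression tools. Superpack pushes the limits of compression to achieve better compression ratios than existing compression tools."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Over the past two years, Superpack has kept developer-driven app size growth in check and kept our Android apps small. Superpack compression has reduced the size of our fleet of Android apps substantially compared with regular Android APK compression, saving more than 20 percent relative to Android's default Zip compression. Apps that use Superpack include Facebook, Instagram, WhatsApp, and Messenger. The size reductions that Superpack achieves for these apps are shown in the table below."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 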
"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/b3\/b3688c725122072d5bb62d7522e1e736.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/35\/35a1aa2e1d7e9694ca6c848f54608d40.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Superpack:編譯器+數據壓縮"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"雖然現有的壓縮算法,例如Zip的Deflat和Xz的LZMA,能夠很好地處理大型單體數據,但它們不足以抵消我們在應用程序中看到的增長速度,因此我們開始開發自己的解決方案。壓縮是一個成熟的領域,我們開發的技術跨越了整個壓縮領域,從數據壓縮和Lempel-Ziv(LZ)解析到統計編碼。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Superpack的優勢在於壓縮編碼,如機器碼和字節碼,以及其它類型的結構化數據。Superpack的底層方法是基於"},{"type":"link","attrs":{"href":"https:\/\/www.tandfonline.com\/doi\/abs\/10.1080\/00207166808803030","title":null,"type":null},"content":[{"type":"text","text":"Kolmogorov的算法複雜性度量"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"的想法,將一段數據的信息內容定義爲生成該數據的最短程序的長度。換句話說,可以通過將數據表示成能夠生成這段數據的程序來壓縮數據。當數據是代碼時,可以將其轉換成更小的壓縮後的表示。生成斐波那契數列及其索引列表的程序,是包含這些數列的文件的高度壓縮表示。降低Kolmogorov複雜性本身的想法對於壓縮領域並不新鮮。Superpack的新方法涉及將編譯器方法與現代壓縮技術相結合來實現這一目標。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"將生成小型程序的生成過程作爲壓縮的形式有很大的好處。這爲數據壓縮工程師提供了一系列成熟的編譯工具和技術,這些工具和技術可以更改用途進行壓縮。Superpack壓縮利用常見的編譯器技術,例如解析和代碼生成,以及最近的創新,例如 "},{"type":"link","attrs":{"href":"https:\/\/dl.acm.org\/doi\/10.1145\/1995376.1995394","title":null,"type":null},"content":[{"type":"text","text":"Satisfiability modulo theories (SMT) 求解器"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",來找到最小的程序。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"能夠將這些編譯器技術與主流數據壓縮中使用的技術結合起來,這是Superpack有效性的一個重要組成部分。來自編譯器的語義知識佔了Superpack的一半,造就增強的LZ解析(消除冗餘的壓縮步驟),以及改進的熵編碼(爲頻繁的信息片段生成短編碼的步驟)。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"改進的LZ解析"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"壓縮器通常使用從LZ家族中選擇的算法來識別重複的字節序列。大體上,每一個這樣的算法都試圖用指向以前出現的數據的指針來替換重複出現的數據序列。這個指針由上一次出現的距離(字節數)和序列長度組成。如果這個指針可以用比實際數據更少的位表示,則可以用之替換作爲壓縮大小。Superpack通過發現更長重複序列,同時減少表示指針的位數,從而改進了LZ解析過程。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在被壓縮的程序中,Superpack通過基於AST對數據進行分組來實現這些改進。例如,在以下指令序列中,最長重複序列的長度爲2。然而,當根據AST類型(即操作碼和寄存器,小表中的組1)和立即數(下表中的組2)進行分組時,長度增加到4。在原始數據的原始解析中,重複序列之間的距離是2條指令。但在分組後的版本中,距離爲0。更小的距離通常使用更少的位數,更長的序列匹配通過在給定指針中捕獲更多的輸入數據來節省空間。因此,Superpack生成的指針比簡單計算的指針要小。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/58\/5898e1d3f2794b959de26c0c75fcef8c.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"但是,我們如何決定何時分
隔代碼流以及何時保持原封不動?Superpack最近的工作中引入了分層壓縮,這將此決策合併到LZ解析的優化組件中,稱爲最優解析。在下面編輯過的代碼中,最好將代碼段的最後一段保留爲原始形式,並生成一個指向前五條指令的指針的匹配項,同時拆分代碼段的其餘部分。在拆分餘數中,利用寄存器組合的稀疏性來生成更長的匹配。以這種方式對代碼進行分組還可以通過計算重複序列之間的邏輯單元數量(沿AST測量)而不是測量字節數,來進一步縮短距離。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ed\/ed3217de2eb376604c0514dea14fd84e.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"改進的熵編碼"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"重複的字節序列被指向上一次出現的序列的指針有效地替換。但是壓縮器對非重複序列或比指針表示更短的短序列能做些什麼呢?在這種情況下,壓縮器通過對數據中的值進行編碼來表示數據。用來表示序列的位數,利用了序列可以假定的值的分佈。熵編碼是用數據中的值的熵的位數來表示一個值的過程。壓縮器爲此目的使用的一些衆所周知的技術包括哈夫曼編碼、算術編碼、距離編碼和非對稱數字系統(asymmetircal numberal systems,ANS)。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Superpack有一個內置的ANS編碼器,還有一個可插拔的架構,支持多個這樣的編碼後端。Superpack通過識別上下文(其中要表示的序列由較低的熵)來改進熵編碼。與LZ解析類似,這些上下文是從Superpack對通過編譯器分析提取的數據結構的瞭解中派生出來的。在下面簡化的指令序列中,有七個不同的地址,每個地址都有前綴0x。在該編碼的大量不同排列中,常規編碼器用於表示地址字段的位數將接近3。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"然而,我們注意到,七個地址中有三個與BL操作碼配對,而另外三個與B操作碼關聯。只有一個地址與兩者都耦合。如果這個模式在整個代碼體中都成立,那麼操作碼可以用作編碼上下文。在這種情況下,表示這七個地址的位數接近2而不是3。下表顯示了帶上下文和不帶上下文的編碼。在第三列中的Superpack壓縮情況下,可以將操作碼視爲預測丟失的位。這個簡單的示例旨在說明如何使用編譯器上下文來改進編碼。在實際數據中,獲得的位數通常是分數,上下文和數據之間的映射很少像本例中那樣直接。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/d5\/d54d5a800b597c8489a7953a64a42df5.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"作爲壓縮表示的程序"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們解釋了當被壓縮的數據由代碼組成時,Superpack如何改進LZ解析和熵編碼。但當數據包含非結構化值時會發生什麼?在這種情況下,Superpack試圖通過在壓縮時將值轉換爲程序來添加值結構。然後,在解壓時,將程序進行解析來恢復原始數據。這種技術的一個例子是Dex索引的壓縮,Dex索引是Dex編碼中已知值的標籤。Dex索引具有高度的局部性。爲了利用這種局部性,我們將索引轉換爲一種將最近的值存儲在邏輯寄存器中的語言,並將即將出現的值作爲固定值的增量發佈。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/34\/34cd27f240387d0e2e5db17c9364f3bb.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"爲這種表示編寫一個高效的壓縮器會導致編譯器中常見的寄存器分配問題,該
問題決定何時從寄存器中收回舊值來加載新值。雖然這種減少是針對索引字節碼的,但一個通用的想法適用於任何字節碼錶示,即,生成的代碼符合前兩節中概述的優化。在本例中,LZ解析通過將操作碼、MOV和PIN放在一個組中、在第二個組中收集增量、以及在第三個組中收集最近的索引而得到改進。"}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"基於真實數據的Superpack"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Superpack有3個主要的有效載荷目標。第一個是Dex字節碼,在Android應用程序中Java被編譯成的格式。第二個是ARM機器碼,這是針對ARM處理器編譯的代碼。第三個是Hermes字節碼,它是在Facebook創建的JavaScript的專用高性能字節碼錶示。所有這三種表示都使用了全方位的Superpack技術,這些技術由基於代碼語法和語法知識的編譯器分析提供支持。在這三種情況下,有一組壓縮轉換應用於指令流,另一組壓縮轉換應用於元數據。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"應用於代碼的轉換都是相似的。元數據轉換有兩部分。第一部分通過按類型進行分組來利用數據的結構。第二部分利用元數據規範中的組織規則,例如導致對數據進行排序或公開可用於上下文距離和序列項之間的相關性的規則。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Zip、Xz和Superpack針對這三種格式的壓縮率如下表所示。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/5b\/5b397847980f550d39f5f1e77970d9a0.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Superpack架構和實現"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Superpack是壓縮領域的一個獨特玩家,因爲它包含有關壓縮數據的類型知識。爲了在Facebook推廣Superpack的開發和使用,我們開發了一個模塊化設計,其中的抽象可以跨不同的壓縮格式使用。Superpack的架構類似於操作系統,其內核實現分頁內存分配、文檔和歸檔抽象、用於轉換和操作指令的抽象,以及可插拔模塊的接口。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"面向編譯器的機制屬於專門的編譯器層。每種格式都實現爲一個可插拔的驅動程序。驅動程序利用被壓縮數據的屬性,並在代碼中標記相關性,最終被壓縮層利用。解析輸入代碼的機器使用基於SMT解析器的自動推理。我們如何使用SMT求解器來幫助壓縮超出了本文的範圍,將成爲未來一篇博文的有趣話題。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"壓縮層還包括可插拔模塊。其中一個模塊是Superpack自己的壓縮器,包括一個定製的LZ引擎和一個熵編碼後端。在我們構建這個壓縮器的過程中,我們插入了利用現有壓縮工具進行壓縮工作的模塊。在該裝置中,Superpack的角色簡化爲將數據重新組織爲不相關的流。隨後,現有工具會盡最大努力進行壓縮,這是有效的,但在識別和使用編譯器信息的粒度上受到限制。Superpack的定製壓縮後端通過數據的內部表示的細粒度視圖解決了這個問題,這使它能夠利用單個位的細粒度的邏輯相關性。將用於執行壓縮工作的機制抽象爲一個模塊,可以讓我們在壓縮率和解壓速度之間選擇一些平衡。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/5e\/5e40bf5d3992048749ff57a9dba04a4e.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Superpack的實現包含用OCaml編程語言編寫的代碼和C語言代碼的混合。OCaml在壓縮端用於操作複雜的面向編譯器的數據結構,並與SMT求解器進行接口對接。C語言是解壓邏輯的自然選擇,因爲它往往很簡單,同時對解壓代碼運行的處理器的參數高度敏感,例如一級緩存的大小。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"限制和相關工作"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Superpack是一個非對稱壓縮器,這意味着解壓速度很快,而壓縮速度可以很慢。流式壓縮,即數據以其傳輸速率進行壓縮,一直不是Superpack的目標。Superpack無法滿足流式壓縮的約束條件,因爲其當前的壓縮速度跟不上現代數據傳輸速率。Superpack已經應用於結構化數據、代碼、整數和字符串數據。Superpack當前不以圖像、視頻或音頻文件爲目標。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在Android平臺上,在使用壓縮來減少下載時間和可能增加磁盤佔用和更新大小之間存在一種平衡。這種平衡不是Superpack的限制,而是Facebook使用的打包工具和Android上使用的分發工具之間尚未建立互操作性。例如,在Android上,應用程序更新是作爲應用程序連續版本內容之間的增量發佈的。但這種增量只能由能夠解壓和重新壓縮應用程序內容的工具生成。由於當前工具中實現的差異對比過程無法解析Superpack文檔,因此對於包含此類文件的應用程序,增量會變得更大。我們相信,這類問題可以通過Superpack和Android工具之間更細粒度的接口、Android分發機制中更高的可定製性以及Superpack文件格式和壓縮方法的公開文檔來解決。Facebook的應用程序主要由Superpack擅長壓縮的代碼組成,其壓縮方式遠遠超過了Android上Google Play實現的現有壓縮方式。因此,就目前來說,我們的壓縮對我們的用戶來說是有益的,儘管存在權衡。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Superpack利用Jarek 
Duda在"},{"type":"link","attrs":{"href":"https:\/\/arxiv.org\/abs\/0902.0271","title":null,"type":null},"content":[{"type":"text","text":"非對稱數字化系統"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"上的工作作爲其熵編碼後端。Superpack借鑑了其“"},{"type":"link","attrs":{"href":"https:\/\/dl.acm.org\/doi\/abs\/10.1145\/36177.36194","title":null,"type":null},"content":[{"type":"text","text":"超優化(superoptimization)"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"”思想,以及過去在"},{"type":"link","attrs":{"href":"https:\/\/dl.acm.org\/doi\/abs\/10.1145\/258916.258947","title":null,"type":null},"content":[{"type":"text","text":"代碼壓縮"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"方面的工作。它利用"},{"type":"link","attrs":{"href":"https:\/\/tukaani.org\/xz\/","title":null,"type":null},"content":[{"type":"text","text":"Xz"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"、"},{"type":"link","attrs":{"href":"https:\/\/facebook.github.io\/zstd\/","title":null,"type":null},"content":[{"type":"text","text":"Zstd"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"和"},{"type":"link","attrs":{"href":"https:\/\/github.com\/google\/brotli","title":null,"type":null},"content":[{"type":"text","text":"Brotli"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"壓縮器作爲可選後端來完成壓縮工作。最後,Superpack使用微軟的"},{"type":"link","attrs":{"href":"https:\/\/githu
b.com\/Z3Prover\/z3","title":null,"type":null},"content":[{"type":"text","text":"Z3 SMT求解器"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"來自動解析和重構各種代碼格式。"}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"下一步計劃"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Superpack結合了編譯器和數據壓縮技術,以一種特別適用於代碼(例如,Dex字節碼和ARM機器碼)的方式增加打包數據的密度。Superpack大幅縮減了Android應用程序的大小,從而爲全球數十億用戶節省了下載時間。我們已經描述了Superpack背後的一些核心思想,但只觸及了我們在不對稱壓縮方面的工作的表面。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們的旅程纔剛剛開始。Superpack通過對其編譯器和壓縮組件的增強來不斷改進。Superpack最初是作爲一種工具來減少移動應用程序的大小,但我們在提高各種數據類型的壓縮率方面的成功,使我們將目標對準了非對稱壓縮的其它用例。我們正在開發一種新的按需可執行文件格式,通過在加載時保留壓縮和解壓共享的庫來節省磁盤空間。我們正在評估使用Superpack對代碼進行增量壓縮來減少軟件更新的大小。我們還在研究將Superpack用作冷存儲壓縮器,以壓縮很少使用的日誌數據和文件。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"到目前爲止,我們的移動部署僅限於Android應用程序。然而,我們的工作也同樣適用於其它平臺,例如iOS,而且我們也正在考慮將我們的實現移植到這些平臺。目前,只有我們的工程師可以使用Superpack,但我們渴望將Superpack的好處帶給每一個人。爲此,我們正在探索提高我們的壓縮工作與Android生態系統兼容性的方法。這篇博文是朝着這個方向邁出的一步。有朝一日我們可能考慮將Superpack開源。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們要特別感謝Alfredo Altaminaro、Nikhil Prakash、Mauricio Nunes和所有爲Superpack做出貢獻的人。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"原文鏈接:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/engineering.fb.com\/2021\/09\/13\/core-data\/superpack\/","title":null,"type":null},"content":[{"type":"text","text":"Superpack: Pushing the limits of compression in Facebook’s mobile apps"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]}]}