HDiffPatch compared with BSDiff4.3 & xdelta3.1: benchmark tests
Author: [email protected] 2013.06.06
tag: HDiffPatch,HDiff,HPatch,diff,patch,bsdiff,bspatch,xdelta,upgrade,difference
Updated: 2017.08.27
Test environment: OS: macOS 10.12.6; compiler: Xcode 8.3.3, x64; CPU: i7 2.5GHz (turbo 3.7GHz, 6MB shared L3 cache); memory: 8GB*2 DDR3 1600MHz
(HDiffPatch description and source code download: http://blog.csdn.net/housisong/article/details/9003013 )
Comparison with BsDiff: (all file data loaded into memory)
HDiff2.0 diff used create_compressed_diff() + bzip2 | lzma | zlib, with all data in memory;
patch used patch_decompress() with the oldFile data loaded into memory and all other data read as file streams.
BsDiff4.3 used bzip2 with all data in memory.
(When compiling BsDiff4.3 for x64, the suffix array index type was manually changed from int64 to int32,
which runs somewhat faster and halves the memory usage.)
=============================================================================================================
Program                            Uncompressed Compressed Compressed BsDiff4.3  HDiff2.0  HDiff2.0  HDiff2.0
(newVersion<--oldVersion)                 (tar)    (bzip2)     (lzma)   (bzip2)   (bzip2)    (lzma)    (zlib)
-------------------------------------------------------------------------------------------------------------
apache-maven-2.2.1-src <--2.0.11        5150720    1213258    1175464    115723     83935     80997     91921
httpd_2.4.4-netware-bin <--2.2.24      22612480    4035904    3459747   2192308   1809555   1616435   1938953
httpd-2.4.4-src <--2.2.24              31809536    4775534    4141266   2492534   1882555   1717468   2084843
Firefox-21.0-mac-en-US.app <--20.0     98740736   39731352   33027837  16454403  15749937  14018095  15417854
emacs-24.3 <--23.4                    185528320   42044895   33707445  12892536   9574423   8403235  10964939
eclipse-java-juno-SR2-macosx
  -cocoa-x86_64 <--x86_32             178595840  156054144  151542885   1595465   1587747   1561773   1567700
gcc-src-4.8.0 <--4.7.0                552775680   86438193   64532384  11759496   8433260   7288783   9445004
-------------------------------------------------------------------------------------------------------------
Average Compression                     100.00%     31.76%     28.47%     6.64%     5.58%     5.01%     5.86%
=============================================================================================================
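The "Average Compression" row appears to be the unweighted mean of each file's size ratio against its uncompressed tar size, rather than a ratio of column totals. That reading is an assumption, but it reproduces the bzip2 and HDiff2.0 (bzip2) figures in the table. A minimal Python sketch of the calculation, with sizes copied from the table (the variable and function names are illustrative only):

```python
# Sizes in bytes copied from the table above, one entry per test file,
# in table order (apache-maven ... gcc-src).
uncompressed = [5150720, 22612480, 31809536, 98740736,
                185528320, 178595840, 552775680]
bzip2_sizes = [1213258, 4035904, 4775534, 39731352,
               42044895, 156054144, 86438193]
hdiff_bzip2 = [83935, 1809555, 1882555, 15749937,
               9574423, 1587747, 8433260]


def average_ratio_percent(sizes, baselines):
    """Unweighted mean of per-file size ratios, as a percentage.

    Assumption: this is how the "Average" rows were derived.
    """
    ratios = [s / b for s, b in zip(sizes, baselines)]
    return 100.0 * sum(ratios) / len(ratios)


bzip2_avg = average_ratio_percent(bzip2_sizes, uncompressed)
hdiff_avg = average_ratio_percent(hdiff_bzip2, uncompressed)
# Prints: bzip2: 31.76%  HDiff2.0(bzip2): 5.58%
print(f"bzip2: {bzip2_avg:.2f}%  HDiff2.0(bzip2): {hdiff_avg:.2f}%")
```

With this kind of average, a small package such as apache-maven counts exactly as much as gcc-src.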
================================================================================================================
Program           diff time(s)    diff mem(MB)            patch time(s)                   patch mem(MB)
                  BsDiff   HDiff  BsDiff   HDiff BsPatch        HPatch2.0        BsPatch        HPatch2.0
                 (bzip2) (bzip2) (bzip2) (bzip2) (bzip2) (bzip2)  (lzma)  (zlib) (bzip2) (bzip2)  (lzma)  (zlib)
----------------------------------------------------------------------------------------------------------------
apache-maven...      1.3     0.4      42      28    0.09    0.04    0.03    0.02      14       8       7       6
httpd bin...         8.6     3.0     148     124    0.72    0.36    0.18    0.13      50      24      23      18
httpd src...          20     5.1     322     233    0.99    0.46    0.24    0.17      78      44      42      37
Firefox...            94      28     829     582     3.0     2.2     1.2    0.57     198     106     106      94
emacs...             109      32    1400    1010     4.9     2.3     1.1    0.78     348     174     168     161
eclipse              100      33    1500    1000     1.5    0.56    0.57    0.50     350     176     174     172
gcc-src...           366      69    4420    3030     7.9     3.5     2.1    1.85    1020     518     517     504
----------------------------------------------------------------------------------------------------------------
Average             100%   28.9%    100%   71.5%    100%   52.3%   29.9%   21.3%    100%   52.3%   50.3%   45.5%
================================================================================================================
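For very large files measured in GB, the memory usage and running time of both the hdiff and bsdiff algorithms become hard to accept. What is needed is an algorithm whose memory usage can be controlled and whose running time stays acceptable. For this purpose the hdiff library provides a new function, create_compressed_diff_stream() (its output is compatible with that of create_compressed_diff()), as a high-performance diff solution for very large files.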
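The text does not say how create_compressed_diff_stream() keeps memory bounded. One common approach in stream-oriented diff tools, sketched here purely as an illustration and not as HDiffPatch's actual algorithm or patch format, is to index the old file only at fixed block boundaries, so the index grows with len(old)/block_size rather than with every byte. The names block_diff, block_patch and block_size are hypothetical; block_size is only loosely modeled on the kMatchBlockSize setting that appears in the xdelta test descriptions.

```python
import zlib
from typing import Dict, List, Tuple

# An op is ("copy", old_pos, length) or ("add", literal_bytes).
Op = Tuple


def block_diff(old: bytes, new: bytes, block_size: int = 128) -> List[Op]:
    """Toy block-matching diff; NOT HDiffPatch's algorithm or format."""
    # Index old only at block-aligned offsets: one checksum per block.
    index: Dict[int, int] = {}
    for pos in range(0, len(old) - block_size + 1, block_size):
        index.setdefault(zlib.adler32(old[pos:pos + block_size]), pos)

    ops: List[Op] = []
    pending = bytearray()  # bytes of new that found no match in old
    i = 0
    while i < len(new):
        # A real tool would use a rolling hash; recomputing is simpler here.
        window = new[i:i + block_size]
        old_pos = index.get(zlib.adler32(window)) if len(window) == block_size else None
        # Confirm the candidate byte for byte, since checksums can collide.
        if old_pos is None or old[old_pos:old_pos + block_size] != window:
            pending.append(new[i])
            i += 1
            continue
        # Extend the confirmed match forward as far as both inputs agree.
        length = block_size
        while (i + length < len(new) and old_pos + length < len(old)
               and new[i + length] == old[old_pos + length]):
            length += 1
        if pending:
            ops.append(("add", bytes(pending)))
            pending.clear()
        ops.append(("copy", old_pos, length))
        i += length
    if pending:
        ops.append(("add", bytes(pending)))
    return ops


def block_patch(old: bytes, ops: List[Op]) -> bytes:
    """Rebuild new from old plus the op list produced by block_diff."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[1] + op[2]]
        else:
            out += op[1]
    return bytes(out)


# Round trip on synthetic data: an insertion in the middle of old.
old = bytes(range(256)) * 8
new = old[:700] + b"INSERTED" + old[700:]
ops = block_diff(old, new)
assert block_patch(old, ops) == new
```

In a scheme like this, a larger block size means fewer index entries and less memory, but matches shorter than one block are missed, so the diff tends to grow.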
Comparison with xdelta3.1:
HDiff2.1.3 diff used create_compressed_diff_stream() + bzip2 with kMatchBlockSize=128, all data accessed as
file streams; patch used patch_decompress(), all data accessed as file streams.
xdelta3.1 diff run with: -e -s old_file new_file delta_file
patch run with: -d -s old_file delta_file decoded_new_file
(Note: with these options xdelta3.1 does not diff "gcc-src..." properly. After adding -B 530000000 the result is
normal: 14173073 bytes of output, with 1070MB of memory used. This is the "fix" value in the table.)
========================================================================================================
Program            diff size(Bytes)     diff time(s)    diff mem(MB)     patch time(s)    patch mem(MB)
                    xdelta3      HDiff xdelta3   HDiff xdelta3   HDiff xdelta3 HPatch2.0 xdelta3  HPatch
--------------------------------------------------------------------------------------------------------
apache-maven...      116265      83408    0.16    0.13      65      11    0.07      0.06      12       6
httpd bin...        2174098    2077625     1.1     1.2     157      15    0.25      0.65      30       8
httpd src...        2312990    2034666     1.3     1.7     185      15    0.30      0.91      50       8
Firefox...         28451567   27504156      16      11     225      16     2.0       4.1     100       8
emacs...           31655323   12033450      19     9.4     220      33     3.2       4.0      97      10
eclipse             1590860    1636221     1.5     1.2     207      34    0.46      0.49      77       8
gcc-src...        107003829   12305741      56      19     224      79     9.7       9.5     102      11
             (fix 14173073)
--------------------------------------------------------------------------------------------------------
Average              12.18%      7.81%    100%   79.0%    100%   15.5%    100%    169.1%    100%   18.9%
                (fix 9.78%)
========================================================================================================
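The "(fix ...)" figures can be reproduced if the Average row is taken to be the unweighted mean of per-file ratios against the uncompressed tar sizes listed in the BsDiff comparison table; that derivation is an assumption, not something the text states. The Python sketch below recomputes the xdelta3 size average twice, once with the original gcc-src result and once with the -B 530000000 result substituted (names are illustrative):

```python
# Uncompressed tar sizes (bytes), in table order (apache-maven ... gcc-src).
uncompressed = [5150720, 22612480, 31809536, 98740736,
                185528320, 178595840, 552775680]
# xdelta3 "-e -s" diff sizes (bytes) from the table above.
xdelta3_sizes = [116265, 2174098, 2312990, 28451567,
                 31655323, 1590860, 107003829]
# gcc-src diff size after adding -B 530000000 (the "fix" value).
GCC_FIXED_SIZE = 14173073


def mean_ratio_percent(sizes, baselines):
    """Unweighted mean of size/baseline ratios, as a percentage."""
    return 100.0 * sum(s / b for s, b in zip(sizes, baselines)) / len(sizes)


raw_avg = mean_ratio_percent(xdelta3_sizes, uncompressed)
# gcc-src is the last row, so replace only the last entry.
fixed_sizes = xdelta3_sizes[:-1] + [GCC_FIXED_SIZE]
fixed_avg = mean_ratio_percent(fixed_sizes, uncompressed)
# Prints: xdelta3: 12.18%  with fix: 9.78%
print(f"xdelta3: {raw_avg:.2f}%  with fix: {fixed_avg:.2f}%")
```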
HDiff2.1.3 diff used create_compressed_diff_stream() + lzma with kMatchBlockSize=64, all data accessed as
file streams; patch used patch_decompress(), all data accessed as file streams.
xdelta3.1 diff run with: -S lzma -9 -s old_file new_file delta_file
patch run with: -d -s old_file delta_file decoded_new_file
(Note: with these options xdelta3.1 does not diff "gcc-src..." properly. After adding -B 530000000 the result is
normal: 11787978 bytes of output, with 2639MB of memory used. This is the "fix" value in the table.)
========================================================================================================
Program            diff size(Bytes)     diff time(s)    diff mem(MB)     patch time(s)    patch mem(MB)
                    xdelta3      HDiff xdelta3   HDiff xdelta3   HDiff xdelta3 HPatch2.0 xdelta3  HPatch
--------------------------------------------------------------------------------------------------------
apache-maven...       98434      83668    0.37    0.29     220      24    0.04      0.06      12       5
httpd bin...        1986880    1776553     2.5     2.9     356      59    0.24      0.52      30       8
httpd src...        2057118    1794029     3.3     4.2     375      62    0.28      0.78      50       8
Firefox...         27046727   21882343      27      32     416      76     1.8       2.2     100       9
emacs...           29392254    9698236      38      32     413      97     3.1       2.9      97       9
eclipse             1580342    1589045     3.0     1.9     399      76    0.48      0.48      77       6
gcc-src...         95991977    9118368     128      44     417     148     8.9       8.6     102      11
             (fix 11787978)
--------------------------------------------------------------------------------------------------------
Average              11.24%      6.44%    100%   88.9%    100%   20.0%    100%    151.1%    100%   17.3%
                (fix 9.06%)
========================================================================================================
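Compared with BsDiff, HDiff produces smaller diff data (typically more than 15% smaller), runs faster (typically taking a little over 1/4 of the time), and uses less memory (typically about 2/3 as much);
and in the patch stage, where resource limits are more likely to matter, HPatch runs faster than BsPatch (a fraction of the time) and in most cases uses far less memory! (hpatch also supports a call mode that limits memory usage, at some cost in speed.)
Comparing HDiff's new function with xdelta, both parameter sets give similar results:
the diff output is much smaller (about 25% or more), running time is comparable, and memory usage is 1/5 or less;
when patching, it is slower than xdelta (by about 50% or more), but memory usage is 1/5 or less. (hpatch also supports a call mode that uses more memory, as in the comparison with bspatch; in that mode it is much faster than xdelta.)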