Benchmark comparison of HDiffPatch against BsDiff4.3 and xdelta3.1

Author: [email protected]   2013.06.06

 

tag: HDiffPatch, HDiff, HPatch, diff, patch, bsdiff, bspatch, xdelta, patching, upgrade, delta

 

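Updated: 2017.08.27

Test environment:  OS: macOS 10.12.6; compiler: Xcode 8.3.3, x64; CPU: i7 2.5 GHz (turbo 3.7 GHz, 6 MB shared L3 cache); memory: 2 x 8 GB DDR3 1600 MHz

  (When compiling BsDiff4.3 in x64 mode, the suffix array index type was manually changed from int64 to int32; this runs somewhat faster and halves the memory usage.)

(HDiffPatch description and source code download: http://blog.csdn.net/housisong/article/details/9003013 )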

Comparison with BsDiff:  (all file data loaded into memory)

HDiff2.0 diff used create_compressed_diff() + bzip2 | lzma | zlib , all data in memory;
         patch used patch_decompress() + load oldFile data into memory, other data use file stream.
BsDiff4.3 with bzip2 and all data in memory;
(when compiling BsDiff4.3-x64, the suffix array index type int64 was changed to int32:
   faster, and the memory requirement is halved.)
===================================================================================================
         Program               Uncompressed Compressed Compressed BsDiff4.3        HDiff2.0
(newVersion<--oldVersion)           (tar)     (bzip2)   (lzma)    (bzip2)   (bzip2   lzma     zlib)
---------------------------------------------------------------------------------------------------
apache-maven-2.2.1-src <--2.0.11    5150720   1213258   1175464   115723    83935    80997    91921
httpd_2.4.4-netware-bin <--2.2.24  22612480   4035904   3459747  2192308  1809555  1616435  1938953
httpd-2.4.4-src <-- 2.2.24         31809536   4775534   4141266  2492534  1882555  1717468  2084843
Firefox-21.0-mac-en-US.app<--20.0  98740736  39731352  33027837 16454403 15749937 14018095 15417854
emacs-24.3 <-- 23.4               185528320  42044895  33707445 12892536  9574423  8403235 10964939
eclipse-java-juno-SR2-macosx
  -cocoa-x86_64 <--x86_32         178595840 156054144 151542885  1595465  1587747  1561773  1567700
gcc-src-4.8.0 <--4.7.0            552775680  86438193  64532384 11759496  8433260  7288783  9445004
---------------------------------------------------------------------------------------------------
Average Compression                 100.00%    31.76%    28.47%    6.64%    5.58%    5.01%    5.86%
===================================================================================================

===================================================================================================
   Program   run time(Second)   memory(MB)       run time(Second)              memory(MB)
               BsDiff HDiff   BsDiff  HDiff   BsPatch      HPatch2.0       BsPatch    HPatch2.0
              (bzip2)(bzip2)  (bzip2)(bzip2)  (bzip2) (bzip2  lzma  zlib)  (bzip2)(bzip2 lzma zlib)
---------------------------------------------------------------------------------------------------
apache-maven...  1.3   0.4       42     28      0.09    0.04  0.03  0.02      14      8     7     6
httpd bin...     8.6   3.0      148    124      0.72    0.36  0.18  0.13      50     24    23    18
httpd src...    20     5.1      322    233      0.99    0.46  0.24  0.17      78     44    42    37
Firefox...      94    28        829    582      3.0     2.2   1.2   0.57     198    106   106    94
emacs...       109    32       1400   1010      4.9     2.3   1.1   0.78     348    174   168   161
eclipse        100    33       1500   1000      1.5     0.56  0.57  0.50     350    176   174   172
gcc-src...     366    69       4420   3030      7.9     3.5   2.1   1.85    1020    518   517   504
---------------------------------------------------------------------------------------------------
Average        100%   28.9%    100%   71.5%     100%   52.3% 29.9% 21.3%    100%  52.3% 50.3% 45.5%
===================================================================================================
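
The "Average" rows in the two tables above appear to be the arithmetic mean of the per-file ratios, not the ratio of the column totals. The sketch below checks that reading against three of the published averages. The figures are copied from the tables; the variable and function names are illustrative only and are not part of HDiffPatch.

```python
# Rows copied from the BsDiff comparison tables above.
# (uncompressed tar size, BsDiff4.3 bzip2 diff size, HDiff2.0 bzip2 diff size), in bytes
SIZES = [
    (5150720, 115723, 83935),        # apache-maven
    (22612480, 2192308, 1809555),    # httpd bin
    (31809536, 2492534, 1882555),    # httpd src
    (98740736, 16454403, 15749937),  # Firefox
    (185528320, 12892536, 9574423),  # emacs
    (178595840, 1595465, 1587747),   # eclipse
    (552775680, 11759496, 8433260),  # gcc-src
]

# (BsDiff diff time, HDiff diff time), in seconds
DIFF_TIMES = [(1.3, 0.4), (8.6, 3.0), (20, 5.1), (94, 28),
              (109, 32), (100, 33), (366, 69)]


def mean_ratio_percent(pairs):
    """Arithmetic mean of numerator/denominator over all rows, as a percentage."""
    return 100.0 * sum(num / den for num, den in pairs) / len(pairs)


# Diff size relative to the uncompressed tar, averaged over the seven programs.
bsdiff_avg = mean_ratio_percent([(bs, tar) for tar, bs, _ in SIZES])
hdiff_avg = mean_ratio_percent([(hd, tar) for tar, _, hd in SIZES])

# HDiff diff time relative to BsDiff diff time, averaged the same way.
hdiff_time_avg = mean_ratio_percent([(hd, bs) for bs, hd in DIFF_TIMES])

print(f"BsDiff4.3 (bzip2) average diff size: {bsdiff_avg:.2f}%")
print(f"HDiff2.0  (bzip2) average diff size: {hdiff_avg:.2f}%")
print(f"HDiff diff time relative to BsDiff:  {hdiff_time_avg:.1f}%")
```

Under this reading every program carries equal weight, so the small apache-maven archive counts as much as gcc-src. The computed values land within rounding distance of the published 6.64%, 5.58% and 28.9%.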

 

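For very large files, on the order of gigabytes, the memory usage and run time of both the hdiff and bsdiff algorithms become hard to accept. Such files need an algorithm that keeps memory usage under control and still finishes in acceptable time. For this purpose the hdiff library provides a new function, create_compressed_diff_stream() (its output is compatible with that of create_compressed_diff()), as a high-performance diff solution for very large files.

Comparison with xdelta3.1:

(xdelta3.1 did not work properly when diffing "gcc-src..."; only after adding the -B 530000000 option was the diff result normal. In the -S lzma -9 run it then output 11787978 bytes, but used 2639 MB of memory!)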

HDiff2.1.3 diff used create_compressed_diff_stream()+bzip2,kMatchBlockSize=128, all data use 
         file stream;  patch used patch_decompress(), all data use file stream.
xdelta3.1 diff run by: -e -s old_file new_file delta_file   
         patch run by: -d -s old_file delta_file decoded_new_file
(note on "fix": the xdelta3.1 diff of "gcc-src..." failed; with -B 530000000 added the diff was ok,
    output 14173073 B and used 1070 MB memory.)
===================================================================================================
   Program              diff       run time(Second)  memory(MB)   patch run time(Second) memory(MB)
                  xdelta3   HDiff   xdelta3 HDiff  xdelta3 HDiff  xdelta3 HPatch2.0  xdelta3 HPatch
---------------------------------------------------------------------------------------------------
apache-maven...   116265     83408    0.16   0.13     65    11      0.07    0.06         12     6
httpd bin...     2174098   2077625    1.1    1.2     157    15      0.25    0.65         30     8
httpd src...     2312990   2034666    1.3    1.7     185    15      0.30    0.91         50     8
Firefox...      28451567  27504156   16     11       225    16      2.0     4.1         100     8
emacs...        31655323  12033450   19      9.4     220    33      3.2     4.0          97    10
eclipse          1590860   1636221    1.5    1.2     207    34      0.46    0.49         77     8 
gcc-src...     107003829  12305741   56     19       224    79      9.7     9.5         102    11 
           (fix 14173073)
---------------------------------------------------------------------------------------------------
Average           12.18%    7.81%    100%  79.0%     100%  15.5%    100%  169.1%       100%  18.9%
              (fix 9.78%)
===================================================================================================

HDiff2.1.3 diff used create_compressed_diff_stream()+lzma,kMatchBlockSize=64, all data use 
         file stream;   patch used patch_decompress(), all data use file stream.
xdelta3.1 diff run by: -S lzma -9 -s old_file new_file delta_file   
         patch run by: -d -s old_file delta_file decoded_new_file
(note on "fix": the xdelta3.1 diff of "gcc-src..." failed; with -B 530000000 added the diff was ok,
    output 11787978 B and used 2639 MB memory.)
===================================================================================================
   Program              diff       run time(Second)  memory(MB)   patch run time(Second) memory(MB)
                  xdelta3   HDiff    xdelta3 HDiff  xdelta3 HDiff  xdelta3 HPatch2.0 xdelta3 HPatch
---------------------------------------------------------------------------------------------------
apache-maven...    98434     83668     0.37   0.29    220    24      0.04    0.06        12     5
httpd bin...     1986880   1776553     2.5    2.9     356    59      0.24    0.52        30     8
httpd src...     2057118   1794029     3.3    4.2     375    62      0.28    0.78        50     8
Firefox...      27046727  21882343    27     32       416    76      1.8     2.2        100     9
emacs...        29392254   9698236    38     32       413    97      3.1     2.9         97     9
eclipse          1580342   1589045     3.0    1.9     399    76      0.48    0.48        77     6 
gcc-src...      95991977   9118368   128     44       417   148      8.9     8.6        102    11 
           (fix 11787978)
---------------------------------------------------------------------------------------------------
Average           11.24%    6.44%     100%  88.9%     100%  20.0%    100%  151.1%      100%  17.3%
              (fix 9.06%)
===================================================================================================


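Compared with BsDiff, HDiff produces smaller diff data (typically more than 15% smaller), runs faster (typically needing only a little over 1/4 of the time), and uses less memory (typically about 2/3 of the memory).
   In the patch stage, where resource limits are more likely, HPatch runs faster than BsPatch (a fraction of the time) and in most cases uses far less memory. (hpatch also supports a calling mode that limits memory usage, at some cost in speed.)

Comparing HDiff's new function with xdelta, both parameter settings give similar results:
   the diff output is much smaller (roughly 25% or more), the run time is comparable, and memory usage is about 1/5 or less;
   patching is slower than xdelta (roughly 50% or more), but memory usage is under 1/5. (hpatch also supports a calling mode that uses more memory, as in the comparison with bspatch above; in that mode it is much faster than xdelta.)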

 
