Benchmark comparison of HDiffPatch against BsDiff4.3 and xdelta3.1

Author: [email protected]   2013.06.06

tag: HDiffPatch,HDiff,HPatch,diff,patch,bsdiff,bspatch,xdelta,patching,upgrade,difference

Updated: 2017.08.27

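Test environment:  OS: macOS 10.12.6; compiler: Xcode 8.3.3, x64; CPU: i7 2.5 GHz (turbo 3.7 GHz, 6 MB shared L3 cache); memory: 2 x 8 GB DDR3 1600 MHz

  (When compiling BsDiff4.3 in x64 mode, the suffix array index type was manually changed from int64 to int32; this runs somewhat faster and halves the memory usage!)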
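
The effect of the narrower index type can be checked with simple arithmetic. The sketch below assumes an index array with one entry per input byte (an assumed layout for illustration; `suffix_index_bytes` is a hypothetical helper, not part of BsDiff), and also shows the trade-off: a signed 32-bit index can only address inputs smaller than 2 GiB.

```python
import ctypes

# Largest offset a signed 32-bit index can address.
INT32_MAX = 2**31 - 1


def suffix_index_bytes(input_size: int, index_type) -> int:
    """Bytes needed for one index entry per input byte (assumed layout)."""
    return input_size * ctypes.sizeof(index_type)


# Example input size: the uncompressed gcc-src-4.8.0 tar from the results table.
size = 552775680

as_int64 = suffix_index_bytes(size, ctypes.c_int64)
as_int32 = suffix_index_bytes(size, ctypes.c_int32)

print("int64 index bytes:", as_int64)
print("int32 index bytes:", as_int32)
print("fits in int32 offsets:", size <= INT32_MAX)
```

Every file in this test set is far below the int32 limit, so the narrower index is safe here.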

(HDiffPatch description and source code download: http://blog.csdn.net/housisong/article/details/9003013 )

Comparison with BsDiff:  (all file data loaded into memory)

HDiff2.0 diff used create_compressed_diff() + bzip2 | lzma | zlib, with all data in memory;
         patch used patch_decompress() with oldFile data loaded into memory and all other data as file streams.
BsDiff4.3 used bzip2, with all data in memory;
(when compiling BsDiff4.3-x64, the suffix array index type was changed from int64 to int32,
   which is faster and halves the memory requirement.)
===================================================================================================
         Program               Uncompressed Compressed Compressed BsDiff4.3        HDiff2.0
(newVersion<--oldVersion)           (tar)     (bzip2)   (lzma)    (bzip2)   (bzip2   lzma     zlib)
---------------------------------------------------------------------------------------------------
apache-maven-2.2.1-src <--2.0.11    5150720   1213258   1175464   115723    83935    80997    91921
httpd_2.4.4-netware-bin <--2.2.24  22612480   4035904   3459747  2192308  1809555  1616435  1938953
httpd-2.4.4-src <-- 2.2.24         31809536   4775534   4141266  2492534  1882555  1717468  2084843
Firefox-21.0-mac-en-US.app<--20.0  98740736  39731352  33027837 16454403 15749937 14018095 15417854
emacs-24.3 <-- 23.4               185528320  42044895  33707445 12892536  9574423  8403235 10964939
eclipse-java-juno-SR2-macosx
  -cocoa-x86_64 <--x86_32         178595840 156054144 151542885  1595465  1587747  1561773  1567700
gcc-src-4.8.0 <--4.7.0            552775680  86438193  64532384 11759496  8433260  7288783  9445004
---------------------------------------------------------------------------------------------------
Average Compression                 100.00%    31.76%    28.47%    6.64%    5.58%    5.01%    5.86%
===================================================================================================
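
The "Average Compression" row appears to be the unweighted mean of the per-file size ratios, not total output divided by total input. This is inferred from the numbers, not stated by the author. The sketch below recomputes two of the columns from the table above under that assumption; `mean_ratio_percent` is a hypothetical helper name.

```python
# (uncompressed tar size, bzip2 size, HDiff2.0 + lzma diff size),
# copied from the table above, one tuple per test program.
rows = [
    (5150720, 1213258, 80997),          # apache-maven
    (22612480, 4035904, 1616435),       # httpd netware-bin
    (31809536, 4775534, 1717468),       # httpd src
    (98740736, 39731352, 14018095),     # Firefox
    (185528320, 42044895, 8403235),     # emacs
    (178595840, 156054144, 1561773),    # eclipse
    (552775680, 86438193, 7288783),     # gcc-src
]


def mean_ratio_percent(sizes, totals):
    """Unweighted mean of size/total, as a percentage."""
    ratios = [s / t for s, t in zip(sizes, totals)]
    return 100.0 * sum(ratios) / len(ratios)


tars = [r[0] for r in rows]
bzip2_avg = mean_ratio_percent([r[1] for r in rows], tars)
hdiff_lzma_avg = mean_ratio_percent([r[2] for r in rows], tars)

print(f"bzip2: {bzip2_avg:.2f}%  HDiff2.0+lzma: {hdiff_lzma_avg:.2f}%")
```

With this definition the two values come out at 31.76% and 5.01%, matching the table. Because the mean is unweighted, a small package such as apache-maven counts as much as gcc-src.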

===================================================================================================
   Program   run time(Second)   memory(MB)       run time(Second)              memory(MB)
               BsDiff HDiff   BsDiff  HDiff   BsPatch      HPatch2.0       BsPatch    HPatch2.0
              (bzip2)(bzip2)  (bzip2)(bzip2)  (bzip2) (bzip2  lzma  zlib)  (bzip2)(bzip2 lzma zlib)
---------------------------------------------------------------------------------------------------
apache-maven...  1.3   0.4       42     28      0.09    0.04  0.03  0.02      14      8     7     6
httpd bin...     8.6   3.0      148    124      0.72    0.36  0.18  0.13      50     24    23    18
httpd src...    20     5.1      322    233      0.99    0.46  0.24  0.17      78     44    42    37
Firefox...      94    28        829    582      3.0     2.2   1.2   0.57     198    106   106    94
emacs...       109    32       1400   1010      4.9     2.3   1.1   0.78     348    174   168   161
eclipse        100    33       1500   1000      1.5     0.56  0.57  0.50     350    176   174   172
gcc-src...     366    69       4420   3030      7.9     3.5   2.1   1.85    1020    518   517   504
---------------------------------------------------------------------------------------------------
Average        100%   28.9%    100%   71.5%     100%   52.3% 29.9% 21.3%    100%  52.3% 50.3% 45.5%
===================================================================================================

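For very large files measured in gigabytes, the memory usage and running time of both the hdiff and bsdiff algorithms become hard to accept. Such files need an algorithm with bounded memory usage and acceptable running time. For this, the hdiff library provides a new function, create_compressed_diff_stream(), whose output is compatible with that of create_compressed_diff(). It offers a high-performance diff solution for very large files.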
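
One general way to bound diff memory is to index the old data once per fixed-size block instead of once per byte, so the index shrinks by roughly the block size. The sketch below illustrates that idea only. It is not HDiffPatch's implementation or API: `block_diff`, `block_patch`, and the op format are hypothetical names, and `block_size` is merely assumed to play a role similar to the kMatchBlockSize setting quoted in the xdelta comparison below. For brevity it works on in-memory bytes and recomputes the checksum at every position, where a real streaming tool would read from files and use a rolling checksum.

```python
import zlib


def block_diff(old: bytes, new: bytes, block_size: int = 128):
    """Return ops: ("copy", old_offset, length) or ("add", literal_bytes)."""
    # Index one checksum per aligned block of the old data, so the index
    # size grows with len(old) / block_size rather than len(old).
    index = {}
    for off in range(0, len(old) - block_size + 1, block_size):
        index.setdefault(zlib.adler32(old[off:off + block_size]), off)

    ops = []
    literal = bytearray()
    pos = 0
    while pos < len(new):
        window = new[pos:pos + block_size]
        off = None
        if len(window) == block_size:
            off = index.get(zlib.adler32(window))
        # A checksum hit is only a candidate; confirm it byte for byte.
        if off is not None and old[off:off + block_size] == window:
            length = block_size
            # Extend the verified match while both inputs keep agreeing.
            while (off + length < len(old) and pos + length < len(new)
                   and old[off + length] == new[pos + length]):
                length += 1
            if literal:
                ops.append(("add", bytes(literal)))
                literal.clear()
            ops.append(("copy", off, length))
            pos += length
        else:
            # No block match here: emit this byte as literal data.
            literal.append(new[pos])
            pos += 1
    if literal:
        ops.append(("add", bytes(literal)))
    return ops


def block_patch(old: bytes, ops) -> bytes:
    """Rebuild the new data from the old data and the op list."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, off, length = op
            out += old[off:off + length]
        else:
            out += op[1]
    return bytes(out)
```

In this sketch a larger `block_size` means a smaller index and less memory, but matches shorter than one block are missed and end up as literal data, which makes the diff larger.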

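Comparison with xdelta3.1:

(xdelta3.1 was ineffective when diffing "gcc-src..."; the diff result only became normal after adding the -B 530000000 parameter. With -S lzma -9 it then output 11787978 bytes, but used 2639 MB of memory!)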

HDiff2.1.3 diff used create_compressed_diff_stream()+bzip2, kMatchBlockSize=128, with all data as
         file streams;  patch used patch_decompress(), with all data as file streams.
xdelta3.1 diff run by: -e -s old_file new_file delta_file   
         patch run by: -d -s old_file delta_file decoded_new_file
(note on "fix": the xdelta3.1 diff of "gcc-src..." was ineffective; after adding -B 530000000 the diff was OK,
    with an output of 14173073 B, using 1070 MB of memory.)
===================================================================================================
   Program              diff       run time(Second)  memory(MB)   patch run time(Second) memory(MB)
                  xdelta3   HDiff   xdelta3 HDiff  xdelta3 HDiff  xdelta3 HPatch2.0  xdelta3 HPatch
---------------------------------------------------------------------------------------------------
apache-maven...   116265     83408    0.16   0.13     65    11      0.07    0.06         12     6
httpd bin...     2174098   2077625    1.1    1.2     157    15      0.25    0.65         30     8
httpd src...     2312990   2034666    1.3    1.7     185    15      0.30    0.91         50     8
Firefox...      28451567  27504156   16     11       225    16      2.0     4.1         100     8
emacs...        31655323  12033450   19      9.4     220    33      3.2     4.0          97    10
eclipse          1590860   1636221    1.5    1.2     207    34      0.46    0.49         77     8 
gcc-src...     107003829  12305741   56     19       224    79      9.7     9.5         102    11 
           (fix 14173073)
---------------------------------------------------------------------------------------------------
Average           12.18%    7.81%    100%  79.0%     100%  15.5%    100%  169.1%       100%  18.9%
              (fix 9.78%)
===================================================================================================

HDiff2.1.3 diff used create_compressed_diff_stream()+lzma, kMatchBlockSize=64, with all data as
         file streams;   patch used patch_decompress(), with all data as file streams.
xdelta3.1 diff run by: -S lzma -9 -s old_file new_file delta_file   
         patch run by: -d -s old_file delta_file decoded_new_file
(note on "fix": the xdelta3.1 diff of "gcc-src..." was ineffective; after adding -B 530000000 the diff was OK,
    with an output of 11787978 B, using 2639 MB of memory.)
===================================================================================================
   Program              diff       run time(Second)  memory(MB)   patch run time(Second) memory(MB)
                  xdelta3   HDiff    xdelta3 HDiff  xdelta3 HDiff  xdelta3 HPatch2.0 xdelta3 HPatch
---------------------------------------------------------------------------------------------------
apache-maven...    98434     83668     0.37   0.29    220    24      0.04    0.06        12     5
httpd bin...     1986880   1776553     2.5    2.9     356    59      0.24    0.52        30     8
httpd src...     2057118   1794029     3.3    4.2     375    62      0.28    0.78        50     8
Firefox...      27046727  21882343    27     32       416    76      1.8     2.2        100     9
emacs...        29392254   9698236    38     32       413    97      3.1     2.9         97     9
eclipse          1580342   1589045     3.0    1.9     399    76      0.48    0.48        77     6 
gcc-src...      95991977   9118368   128     44       417   148      8.9     8.6        102    11 
           (fix 11787978)
---------------------------------------------------------------------------------------------------
Average           11.24%    6.44%     100%  88.9%     100%  20.0%    100%  151.1%      100%  17.3%
              (fix 9.06%)
===================================================================================================


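HDiff produces smaller diff data than BsDiff (typically more than 15% smaller), runs faster (typically needing a little over 1/4 of the time), and uses less memory (typically about 2/3 of the memory).
   In the patch stage, where resource limits are more likely to matter, HPatch also runs faster than BsPatch (a fraction of the time), and its memory usage is usually much smaller as well.  (hpatch also supports a calling mode that limits memory usage, at some cost in speed.)

Comparing HDiff's new function with xdelta, both parameter sets give similar results:
   the diff output is much smaller (about 25% or more), the running time is comparable, and memory usage is only 1/5 or less;
   patching is slower than xdelta (about 50% or more), but memory usage is only 1/5 or less. (hpatch also supports a calling mode that uses more memory, as in the bspatch comparison above; in that mode it is much faster than xdelta.)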
