LZ77 algorithm example

Excerpted from https://en.wikipedia.org/wiki/LZ77_and_LZ78

Example

Illustration of the calculation of the LZ77-based factorization of the string aacaacabcabaaac.

The table shows the calculation of the LZ77 factorization using a dictionary buffer of size 12 and a preview (lookahead) buffer of size 9. Read from top to bottom, the rightmost column gives the output of the algorithm: (0, 0, "a"), (1, 1, "c"), (3, 4, "b"), (3, 3, "a"), (12, 3, "$"). Each triple is (position, match length, next character). The position is counted from the right edge of the dictionary buffer, which must be taken into account when decoding.
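The decoding rule can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the function name lz77_decode is hypothetical, and "$" is assumed to serve purely as the end marker and never to occur in the text itself.

```python
def lz77_decode(triples, end_marker="$"):
    """Rebuild the text from (position, length, next_char) triples.

    `position` is counted backwards from the right edge of the text
    decoded so far, as in the example above.
    """
    out = []
    for position, length, next_char in triples:
        start = len(out) - position
        # Copy one character at a time so that a match may run into
        # characters that were appended earlier in this same loop.
        for k in range(length):
            out.append(out[start + k])
        if next_char != end_marker:  # assumption: "$" only marks the end
            out.append(next_char)
    return "".join(out)


triples = [(0, 0, "a"), (1, 1, "c"), (3, 4, "b"), (3, 3, "a"), (12, 3, "$")]
print(lz77_decode(triples))  # aacaacabcabaaac
```

Applied to the five triples of the example, the sketch returns the original string.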

The buffers operate on the principle of a sliding window, i.e. the data stream to be compressed is pushed into the buffer from the right. As noted in the algorithm, the window is shifted by the length of the match found in the dictionary plus one further position. This avoids redundant triples, because new characters would otherwise always have to be entered into the dictionary one at a time. In the example, (0, 0, "c") would then have to be emitted as the third triple, which would noticeably worsen the compression ratio. In the original figure, matches are marked in green and the string to be shifted in red. It is important to note that the window is always shifted by one character more than the length of the match, so that new characters do not have to be encoded twice.
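The shift rule (match length plus one) can be sketched as a small Python encoder. The function name lz77_encode and its parameters are hypothetical. Two details are assumptions chosen so that the output agrees with the example: ties between equally long matches go to the match farthest from the right edge of the dictionary, and a match is limited to one character less than the preview buffer so that the next character still fits. "$" is assumed not to occur in the input.

```python
def lz77_encode(data, dict_size=12, lookahead_size=9, end_marker="$"):
    """Return a list of (position, length, next_char) triples."""
    triples = []
    i = 0  # index of the first character not yet encoded
    n = len(data)
    while i < n:
        best_pos, best_len = 0, 0
        # Assumption: leave room for the next character in the preview buffer.
        max_len = min(lookahead_size - 1, n - i)
        # Scan the dictionary from its left edge; strict ">" keeps the
        # leftmost (farthest) of several equally long matches.
        for start in range(max(0, i - dict_size), i):
            length = 0
            while length < max_len and data[start + length] == data[i + length]:
                length += 1  # the match may run past i, into the preview buffer
            if length > best_len:
                best_pos, best_len = i - start, length
        end = i + best_len
        next_char = data[end] if end < n else end_marker
        triples.append((best_pos, best_len, next_char))
        i += best_len + 1  # shift by the match length plus one position
    return triples


print(lz77_encode("aacaacabcabaaac"))
# [(0, 0, 'a'), (1, 1, 'c'), (3, 4, 'b'), (3, 3, 'a'), (12, 3, '$')]
```

Without the extra position in the shift, the "c" in the second step would need its own triple (0, 0, "c").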

Example of an LZ77 compression sliding window

Line | 12 11 10  9  8  7  6  5  4  3  2  1 |  0  1  2  3  4  5  6  7  8  9 | Output
1    |                                     |  a  a  c  a  a  c  a  b  c  a | (0,0,"a")
2    |                                   a |  a  c  a  a  c  a  b  c  a  b | (1,1,"c")
3    |                             a  a  c |  a  a  c  a  b  c  a  b  a  a | (3,4,"b")
4    |              a  a  c  a  a  c  a  b |  c  a  b  a  a  a  c          | (3,3,"a")
5    |  a  a  c  a  a  c  a  b  c  a  b  a |  a  a  c                      | (12,3,"$")
Finished.

The first character is unknown, so the first "a" is recorded as (0, 0, "a"). In line 2, "a" can already be read from the dictionary buffer (marked in green in the original figure), so "c" is recorded as the new character. Line 3 shows a special case of the LZ77 algorithm: the matching string extends into the preview buffer, shown in the original figure as green text on a red background. Lines 4 and 5 are handled in the same way as the first two, except that the last triple carries "$" as its next character, since the text is fully compressed and there is no next character.
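The special case in line 3 can be reproduced with a few lines of Python. The helper copy_match is a hypothetical name used only for this illustration; it copies one character at a time, so the copy can read characters it has just written.

```python
def copy_match(history, position, length):
    """Append `length` characters, starting `position` places before the end."""
    out = list(history)
    start = len(out) - position
    for k in range(length):
        out.append(out[start + k])  # may read a character appended in this loop
    return "".join(out)


# Line 3 of the table: the dictionary holds "aac" and the triple is (3, 4, "b").
print(copy_match("aac", 3, 4))  # aacaaca

# A plain slice sees only the 3 characters that exist before the copy starts,
# so it comes up one character short of the requested length 4.
print("aac"[-3:][:4])  # aac
```

The fourth copied character is the "a" that the same copy produced as its first character, which is why the match length (4) can exceed the position (3).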
