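Redis Basic Data Structures: The Ziplist (Compressed List)

The ziplist (compressed list) is one of the underlying implementations of list keys and hash keys. Redis developed it to save memory: it is a sequential data structure made up of a series of specially encoded, contiguous memory blocks. A ziplist can contain any number of entries, and each entry holds either a byte array or an integer value.

The concrete structure of a ziplist is as follows:


The table shows how an empty ziplist and a non-empty ziplist are each stored.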
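As a rough illustration of the two cases, the sketch below lays out an empty ziplist and a two-entry ziplist byte by byte, following the layout described in the source comment further down (`<zlbytes><zltail><zllen><entry>...<zlend>`). The `demo_*` names and the sample contents (the string "hi" and the integer 5) are assumptions made for this example; none of them come from `ziplist.c`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of ziplist.c.
 * Header fields are decoded byte by byte as little endian, so the sketch
 * behaves the same on any host byte order. */
uint32_t demo_read_u32le(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

uint16_t demo_read_u16le(const unsigned char *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* An empty ziplist: the 10-byte header followed directly by zlend. */
const unsigned char demo_empty[11] = {
    11, 0, 0, 0,   /* zlbytes = 11 */
    10, 0, 0, 0,   /* zltail  = 10, the offset just past the header */
    0, 0,          /* zllen   = 0 */
    0xFF           /* zlend */
};

/* A ziplist holding the string "hi" followed by the integer 5. */
const unsigned char demo_two[17] = {
    17, 0, 0, 0,          /* zlbytes = 17 */
    14, 0, 0, 0,          /* zltail  = 14, the offset of the last entry */
    2, 0,                 /* zllen   = 2 */
    0x00, 0x02, 'h', 'i', /* entry 1: prevlen 0, |00pppppp| length 2, payload */
    0x04, 0xF6,           /* entry 2: prevlen 4, immediate integer (0xF1 + 5) */
    0xFF                  /* zlend */
};

/* Offset of the entry that precedes the tail entry, found by subtracting the
 * tail's 1-byte prevlen from zltail (valid here because prevlen < 254). */
uint32_t demo_prev_of_tail(const unsigned char *zl) {
    uint32_t tail = demo_read_u32le(zl + 4);
    return tail - zl[tail];
}
```

Calling `demo_prev_of_tail(demo_two)` steps from the tail entry back to the first entry using only `zltail` and the stored previous-entry length, which is the backward traversal the layout is designed for.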

Next, let's look at how Redis defines the ziplist:


/* The ziplist is a specially encoded dually linked list that is designed
 * to be very memory efficient. 
 * 
 * It stores both strings and integer values,
 * where integers are encoded as actual integers instead of a series of
 * characters.
 *
 * It allows push and pop operations on either side of the list
 * in O(1) time. However, because every operation requires a reallocation of
 * the memory used by the ziplist, the actual complexity is related to the
 * amount of memory used by the ziplist.
 *
 * ----------------------------------------------------------------------------
 *
 * ZIPLIST OVERALL LAYOUT:
 *
 * The general layout of the ziplist is as follows:
 * <zlbytes><zltail><zllen><entry><entry><zlend>
 *
 * <zlbytes> is an unsigned integer to hold the number of bytes that the
 * ziplist occupies. This value needs to be stored to be able to resize the
 * entire structure without the need to traverse it first.
 *
 * <zltail> is the offset to the last entry in the list. This allows a pop
 * operation on the far side of the list without the need for full traversal.
 *
 * <zllen> is the number of entries. When this value is larger than 2**16-2,
 * we need to traverse the entire list to know how many items it holds.
 *
 * <zlend> is a single byte special value, equal to 255, which indicates the
 * end of the list.
 *
 * ZIPLIST ENTRIES:
 *
 * Every entry in the ziplist is prefixed by a header that contains two pieces
 * of information. First, the length of the previous entry is stored to be
 * able to traverse the list from back to front. Second, the encoding with an
 * optional string length of the entry itself is stored.
 *
 * The length of the previous entry is encoded in the following way:
 * If this length is smaller than 254 bytes, it will only consume a single
 * byte that takes the length as value. When the length is greater than or
 * equal to 254, it will consume 5 bytes. The first byte is set to 254 to
 * indicate a larger value is following. The remaining 4 bytes take the
 * length of the previous entry as value.
 *
 * The other header field of the entry itself depends on the contents of the
 * entry. When the entry is a string, the first 2 bits of this header will hold
 * the type of encoding used to store the length of the string, followed by the
 * actual length of the string. When the entry is an integer the first 2 bits
 * are both set to 1. The following 2 bits are used to specify what kind of
 * integer will be stored after this header. An overview of the different
 * types and encodings is as follows:
 *
 * |00pppppp| - 1 byte
 *      String value with length less than or equal to 63 bytes (6 bits).
 * |01pppppp|qqqqqqqq| - 2 bytes
 *      String value with length less than or equal to 16383 bytes (14 bits).
 * |10______|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes
 *      String value with length greater than or equal to 16384 bytes.
 * |11000000| - 1 byte
 *      Integer encoded as int16_t (2 bytes).
 * |11010000| - 1 byte
 *      Integer encoded as int32_t (4 bytes).
 * |11100000| - 1 byte
 *      Integer encoded as int64_t (8 bytes).
 * |11110000| - 1 byte
 *      Integer encoded as 24 bit signed (3 bytes).
 * |11111110| - 1 byte
 *      Integer encoded as 8 bit signed (1 byte).
 * |1111xxxx| - (with xxxx between 0000 and 1101) immediate 4 bit integer.
 *      Unsigned integer from 0 to 12. The encoded value is actually from
 *      1 to 13 because 0000 and 1111 can not be used, so 1 should be
 *      subtracted from the encoded 4 bit value to obtain the right value.
 *      For example, the encoded bits 0001 (1) decode to the value 1 - 1 = 0.
 * |11111111| - End of ziplist.
 *
 * All the integers are represented in little endian byte order.
 * ----------------------------------------------------------------------------
 *
 * Copyright (c) 2009-2012, Pieter Noordhuis <pcnoordhuis at gmail dot com>
 * Copyright (c) 2009-2012, Salvatore Sanfilippo <antirez at gmail dot com>
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *
 *   * Redistributions of source code must retain the above copyright notice,
 *     this list of conditions and the following disclaimer.
 *   * Redistributions in binary form must reproduce the above copyright
 *     notice, this list of conditions and the following disclaimer in the
 *     documentation and/or other materials provided with the distribution.
 *   * Neither the name of Redis nor the names of its contributors may be used
 *     to endorse or promote products derived from this software without
 *     specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <limits.h>
#include "zmalloc.h"
#include "util.h"
#include "ziplist.h"
#include "endianconv.h"
#include "redisassert.h"

/* End-of-ziplist marker, and the marker for a 5-byte previous-length field. */
#define ZIP_END 255
#define ZIP_BIGLEN 254

/* Different encoding/length possibilities */
/* Masks for the string and integer encodings. */
#define ZIP_STR_MASK 0xc0
#define ZIP_INT_MASK 0x30

/* String encoding types. */
// 1-byte header: byte array of at most 63 bytes
#define ZIP_STR_06B (0 << 6)
// 2-byte header: byte array of at most 16383 bytes
#define ZIP_STR_14B (1 << 6)
// 5-byte header: byte array of at most 4294967295 bytes
#define ZIP_STR_32B (2 << 6)

/* Integer encoding types. */
// 1-byte header: integer of type int16_t
#define ZIP_INT_16B (0xc0 | 0<<4)
// 1-byte header: integer of type int32_t
#define ZIP_INT_32B (0xc0 | 1<<4)
// 1-byte header: integer of type int64_t
#define ZIP_INT_64B (0xc0 | 2<<4)
// 1-byte header: 24-bit signed integer
#define ZIP_INT_24B (0xc0 | 3<<4)
// 1-byte header: 8-bit signed integer
#define ZIP_INT_8B 0xfe



/* 4 bit integer immediate encoding */
/* Mask and value range of the 4-bit immediate integer encoding. */
#define ZIP_INT_IMM_MASK 0x0f
#define ZIP_INT_IMM_MIN 0xf1    /* 11110001 */
#define ZIP_INT_IMM_MAX 0xfd    /* 11111101 */
#define ZIP_INT_IMM_VAL(v) (v & ZIP_INT_IMM_MASK)

/* Maximum and minimum values of a 24-bit signed integer. */
#define INT24_MAX 0x7fffff
#define INT24_MIN (-INT24_MAX - 1)

/* Macro to determine type */
/* Checks whether the given encoding 'enc' is a string encoding. */
#define ZIP_IS_STR(enc) (((enc) & ZIP_STR_MASK) < ZIP_STR_MASK)

/* Utility macros */
/* Macros for accessing the ziplist header fields. */
// Locates the zlbytes field, which records the total number of bytes the ziplist occupies.
// Used to read the current value of the field or to assign a new one.
#define ZIPLIST_BYTES(zl)       (*((uint32_t*)(zl)))
// Locates the zltail field, which records the offset to the tail entry.
// Used to read the current value of the field or to assign a new one.
#define ZIPLIST_TAIL_OFFSET(zl) (*((uint32_t*)((zl)+sizeof(uint32_t))))
// Locates the zllen field, which records the number of entries in the ziplist.
// Used to read the current value of the field or to assign a new one.
#define ZIPLIST_LENGTH(zl)      (*((uint16_t*)((zl)+sizeof(uint32_t)*2)))
// Size of the ziplist header in bytes
#define ZIPLIST_HEADER_SIZE     (sizeof(uint32_t)*2+sizeof(uint16_t))
// Pointer to the start of the first entry
#define ZIPLIST_ENTRY_HEAD(zl)  ((zl)+ZIPLIST_HEADER_SIZE)
// Pointer to the start of the last entry
#define ZIPLIST_ENTRY_TAIL(zl)  ((zl)+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)))
// Pointer to the ZIP_END byte that terminates the ziplist
#define ZIPLIST_ENTRY_END(zl)   ((zl)+intrev32ifbe(ZIPLIST_BYTES(zl))-1)


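The code above documents the fields of the ziplist and the encodings used for strings and integers, and defines the macros Redis uses to access the ziplist header fields.

Next, let's look at the definition of a ziplist entry: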

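typedef struct zlentry {
    // prevrawlen: length of the previous entry
    // prevrawlensize: number of bytes needed to encode prevrawlen
    unsigned int prevrawlensize, prevrawlen;
    // len: length of the current entry's value
    // lensize: number of bytes needed to encode len
    unsigned int lensize, len;
    // Size of the current entry's header,
    // equal to prevrawlensize + lensize
    unsigned int headersize;
    // Encoding used by the current entry
    unsigned char encoding;
    // Pointer to the start of the current entry
    unsigned char *p;
} zlentry;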

/* Extract the encoding from the byte pointed by 'ptr' and set it into
 * 'encoding'. */
#define ZIP_ENTRY_ENCODING(ptr, encoding) do {  \
    (encoding) = (ptr[0]); \
    if ((encoding) < ZIP_STR_MASK) (encoding) &= ZIP_STR_MASK; \
} while(0)
/* Increments the entry count of the ziplist. The count stops changing once it
 * has reached UINT16_MAX. */
#define ZIPLIST_INCR_LENGTH(zl,incr) { \
    if (ZIPLIST_LENGTH(zl) < UINT16_MAX) \
        ZIPLIST_LENGTH(zl) = intrev16ifbe(intrev16ifbe(ZIPLIST_LENGTH(zl))+incr); \
}
That is how Redis defines a ziplist entry.
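The saturating behaviour of `ZIPLIST_INCR_LENGTH` is easy to miss, so here is a minimal sketch of the same rule on a plain `uint16_t`. The function name `demo_incr_length` is hypothetical, and the sketch ignores the byte-order conversion that the real macro performs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the counter update in ZIPLIST_INCR_LENGTH:
 * the count is only touched while it is below UINT16_MAX. Once it reaches
 * UINT16_MAX it stays there, and the real entry count then has to be
 * obtained by walking the list. */
uint16_t demo_incr_length(uint16_t len, int incr) {
    if (len < UINT16_MAX)
        return (uint16_t)(len + incr);
    return len;
}
```

For example, `demo_incr_length(65534, 1)` yields 65535, and any later increment or decrement leaves 65535 unchanged.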

Next, let's look at how Redis implements some of the helper functions:

/* Return bytes needed to store integer encoded by 'encoding' */
static unsigned int zipIntSize(unsigned char encoding) {
    switch(encoding) {
    case ZIP_INT_8B:  return 1;
    case ZIP_INT_16B: return 2;
    case ZIP_INT_24B: return 3;
    case ZIP_INT_32B: return 4;
    case ZIP_INT_64B: return 8;
    default: return 0; /* 4 bit immediate */
    }
    assert(NULL);
    return 0;
}

/* Encode the length 'rawlen' writing it in 'p'. If p is NULL it just returns
 * the amount of bytes required to encode such a length. */
static unsigned int zipEncodeLength(unsigned char *p, unsigned char encoding, unsigned int rawlen) {
    unsigned char len = 1, buf[5];
    // Check whether 'encoding' is a string encoding.
    if (ZIP_IS_STR(encoding)) {
        /* Although encoding is given it may not be set for strings,
         * so we determine it here using the raw length. */
        if (rawlen <= 0x3f) {
            // Byte array of at most 63 bytes:
            // the length header takes 1 byte.
            if (!p) return len;
            // p is not NULL here, so the encoded length must be written out.
            // ZIP_STR_06B == 00pppppp
            buf[0] = ZIP_STR_06B | rawlen;
        } else if (rawlen <= 0x3fff) {
            // Byte array of at most 16383 bytes:
            // the length header takes 2 bytes.
            len += 1;
            if (!p) return len;
            buf[0] = ZIP_STR_14B | ((rawlen >> 8) & 0x3f);
            buf[1] = rawlen & 0xff;
        } else {
            // Otherwise, a byte array of at most 4294967295 bytes:
            // the length header takes 5 bytes.
            len += 4;
            if (!p) return len;
            buf[0] = ZIP_STR_32B;
            buf[1] = (rawlen >> 24) & 0xff;
            buf[2] = (rawlen >> 16) & 0xff;
            buf[3] = (rawlen >> 8) & 0xff;
            buf[4] = rawlen & 0xff;
        }
    } else {
        // Integer encoding.
        /* Implies integer encoding, so length is always 1. */
        if (!p) return len;
        buf[0] = encoding;
    }

    /* Store this length at p */
    memcpy(p,buf,len);
    // Return the number of bytes the header takes.
    return len;
}
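To make the three string-length header forms concrete, the following sketch re-states the rule from `zipEncodeLength` and the matching decode step as two small stand-alone functions and checks that they round-trip. The `demo_*` names are hypothetical; the bit patterns and the most-significant-byte-first order of the 5-byte form follow the code shown above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified re-statement of the string-length header rules.
 * Writes the header into 'out' (at least 5 bytes) and returns its size. */
size_t demo_encode_strlen(unsigned char *out, unsigned int rawlen) {
    if (rawlen <= 0x3f) {                 /* |00pppppp| */
        out[0] = (unsigned char)rawlen;
        return 1;
    }
    if (rawlen <= 0x3fff) {               /* |01pppppp|qqqqqqqq| */
        out[0] = (unsigned char)(0x40 | ((rawlen >> 8) & 0x3f));
        out[1] = (unsigned char)(rawlen & 0xff);
        return 2;
    }
    out[0] = 0x80;                        /* |10______| then 4 length bytes */
    out[1] = (unsigned char)((rawlen >> 24) & 0xff);
    out[2] = (unsigned char)((rawlen >> 16) & 0xff);
    out[3] = (unsigned char)((rawlen >> 8) & 0xff);
    out[4] = (unsigned char)(rawlen & 0xff);
    return 5;
}

/* Reads a header back; stores its size in *lensize and returns the length.
 * A first byte of the form 11xxxxxx is an integer encoding, reported here
 * as a header size of 0. */
unsigned int demo_decode_strlen(const unsigned char *in, size_t *lensize) {
    switch (in[0] & 0xc0) {
    case 0x00: *lensize = 1; return in[0] & 0x3f;
    case 0x40: *lensize = 2; return ((unsigned int)(in[0] & 0x3f) << 8) | in[1];
    case 0x80: *lensize = 5;
        return ((unsigned int)in[1] << 24) | ((unsigned int)in[2] << 16) |
               ((unsigned int)in[3] << 8) | in[4];
    default:   *lensize = 0; return 0;
    }
}

/* Returns 1 when 'rawlen' encodes to 'expect_size' bytes and decodes back. */
int demo_strlen_roundtrip(unsigned int rawlen, size_t expect_size) {
    unsigned char buf[5];
    size_t lensize = 0;
    size_t written = demo_encode_strlen(buf, rawlen);
    unsigned int decoded = demo_decode_strlen(buf, &lensize);
    return written == expect_size && lensize == expect_size && decoded == rawlen;
}
```

The boundaries sit at 63/64 and 16383/16384: `demo_strlen_roundtrip(63, 1)` and `demo_strlen_roundtrip(64, 2)` both hold.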

/* Decode the length encoded in 'ptr'. The 'encoding' variable will hold the
 * entries encoding, the 'lensize' variable will hold the number of bytes
 * required to encode the entries length, and the 'len' variable will hold the
 * entries length. */
#define ZIP_DECODE_LENGTH(ptr, encoding, lensize, len) do {                    \
    /* Read the encoding type of the entry value into 'encoding'. */           \
    ZIP_ENTRY_ENCODING((ptr), (encoding));                                     \
    if ((encoding) < ZIP_STR_MASK) {                                           \
        /* String encoding. */                                                 \
        if ((encoding) == ZIP_STR_06B) {                                       \
            /* 00pppppp: 1-byte header, length in the low 6 bits. */           \
            (lensize) = 1;                                                     \
            (len) = (ptr)[0] & 0x3f;                                           \
        } else if ((encoding) == ZIP_STR_14B) {                                \
            /* 01pppppp: 2-byte header, 14-bit length. */                      \
            (lensize) = 2;                                                     \
            (len) = (((ptr)[0] & 0x3f) << 8) | (ptr)[1];                       \
        } else if (encoding == ZIP_STR_32B) {                                  \
            /* 10______: 5-byte header, length in the 4 following bytes. */    \
            (lensize) = 5;                                                     \
            (len) = ((ptr)[1] << 24) |                                         \
                    ((ptr)[2] << 16) |                                         \
                    ((ptr)[3] <<  8) |                                         \
                    ((ptr)[4]);                                                \
        } else {                                                               \
            /* Unknown string encoding: abort. */                              \
            assert(NULL);                                                      \
        }                                                                      \
    } else {                                                                   \
        /* Integer encoding: 1-byte header; the length is the number of */     \
        /* bytes needed to store a value of this encoding. */                  \
        (lensize) = 1;                                                         \
        (len) = zipIntSize(encoding);                                          \
    }                                                                          \
} while(0);

/* Encode the length of the previous entry and write it to "p". Return the
 * number of bytes needed to encode this length if "p" is NULL. */
static unsigned int zipPrevEncodeLength(unsigned char *p, unsigned int len) {
    if (p == NULL) {
        // p is NULL: only return the number of bytes needed to encode len.
        // A len below 254 needs a single byte; otherwise 5 bytes are needed.
        return (len < ZIP_BIGLEN) ? 1 : sizeof(len)+1;
    } else {
        // p is not NULL: write the encoded length.
        if (len < ZIP_BIGLEN) {
            // (1) len < 254: store it in one byte.
            p[0] = len;
            return 1;
        } else {
            // (2) len >= 254: the first byte is set to 254 as a marker.
            p[0] = ZIP_BIGLEN;
            // Copy len into the 4 bytes starting at p+1.
            memcpy(p+1,&len,sizeof(len));
            // Convert to little endian on big-endian hosts.
            memrev32ifbe(p+1);
            // Return the encoded size.
            return 1+sizeof(len);
        }
    }
}

/* Encode the length of the previous entry and write it to "p". This only
 * uses the larger encoding (required in __ziplistCascadeUpdate).
 *
 * In other words, a length that would fit in 1 byte is still written into
 * a 5-byte field. */
static void zipPrevEncodeLengthForceLarge(unsigned char *p, unsigned int len) {
    // Nothing to do if p is NULL.
    if (p == NULL) return;
    // Set p[0] to 254 to mark a 5-byte length field.
    p[0] = ZIP_BIGLEN;
    // Copy len into the 4 bytes starting at p+1.
    memcpy(p+1,&len,sizeof(len));
    // Convert to little endian on big-endian hosts.
    memrev32ifbe(p+1);
}

/* Decode the number of bytes required to store the length of the previous
 * element, from the perspective of the entry pointed to by 'ptr'. */
#define ZIP_DECODE_PREVLENSIZE(ptr, prevlensize) do {                          \
    if ((ptr)[0] < ZIP_BIGLEN) {                                               \
        /* First byte below 254: the prevlen field takes 1 byte. */            \
        (prevlensize) = 1;                                                     \
    } else {                                                                   \
        /* Otherwise the prevlen field takes 5 bytes. */                       \
        (prevlensize) = 5;                                                     \
    }                                                                          \
} while(0);

/* Decode the length of the previous element, from the perspective of the entry
 * pointed to by 'ptr'.
 *
 * The number of bytes used to encode the previous entry's length is stored
 * in 'prevlensize'; based on it, the length itself is then read from 'ptr'
 * and stored in 'prevlen'. */
#define ZIP_DECODE_PREVLEN(ptr, prevlensize, prevlen) do {                     \
    /* Get the size of the prevlen field. */                                   \
    ZIP_DECODE_PREVLENSIZE(ptr, prevlensize);                                  \
    if ((prevlensize) == 1) {                                                  \
        /* 1 byte: the length is below 254 and is stored directly. */          \
        (prevlen) = (ptr)[0];                                                  \
    } else if ((prevlensize) == 5) {                                           \
        /* 5 bytes: the length is >= 254 and is held in bytes 1..4 of ptr. */  \
        assert(sizeof((prevlensize)) == 4);                                    \
        memcpy(&(prevlen), ((char*)(ptr)) + 1, 4);                             \
        /* Convert from little endian on big-endian hosts. */                  \
        memrev32ifbe(&prevlen);                                                \
    }                                                                          \
} while(0);
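The previous-length field switches from 1 byte to 5 bytes exactly at 254. The sketch below re-implements the encode and decode steps as ordinary functions to show that boundary; the `demo_*` names are hypothetical, and the 4 length bytes are written explicitly in little-endian order instead of using `memcpy` plus `memrev32ifbe`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical re-statement of the prevlen encoding. Writes the field into
 * 'out' (at least 5 bytes) and returns the number of bytes used. */
size_t demo_encode_prevlen(unsigned char *out, uint32_t len) {
    if (len < 254) {
        out[0] = (unsigned char)len;      /* the length itself */
        return 1;
    }
    out[0] = 254;                         /* marker: 4 length bytes follow */
    out[1] = (unsigned char)(len & 0xff);
    out[2] = (unsigned char)((len >> 8) & 0xff);
    out[3] = (unsigned char)((len >> 16) & 0xff);
    out[4] = (unsigned char)((len >> 24) & 0xff);
    return 5;
}

/* Reads the field back; stores its size in *size and returns the length. */
uint32_t demo_decode_prevlen(const unsigned char *in, size_t *size) {
    if (in[0] < 254) {
        *size = 1;
        return in[0];
    }
    *size = 5;
    return (uint32_t)in[1] | ((uint32_t)in[2] << 8) |
           ((uint32_t)in[3] << 16) | ((uint32_t)in[4] << 24);
}

/* Returns 1 when 'len' encodes to 'expect_size' bytes and decodes back. */
int demo_prevlen_roundtrip(uint32_t len, size_t expect_size) {
    unsigned char buf[5];
    size_t size = 0;
    size_t written = demo_encode_prevlen(buf, len);
    uint32_t decoded = demo_decode_prevlen(buf, &size);
    return written == expect_size && size == expect_size && decoded == len;
}
```

A previous entry of 253 bytes still fits in one byte, while 254 bytes already costs five. That 4-byte jump is what drives the cascade update discussed later.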

/* Return the difference in number of bytes needed to store the length of the
 * previous element 'len', in the entry pointed to by 'p'.
 *
 * That is: the bytes needed to encode the new previous-entry length 'len'
 * minus the bytes currently used by p's prevlen field. */
static int zipPrevLenByteDiff(unsigned char *p, unsigned int len) {
    unsigned int prevlensize;
    // Read the size of p's current prevlen field into prevlensize.
    ZIP_DECODE_PREVLENSIZE(p, prevlensize);
    // Compute the bytes needed to encode len, then subtract.
    return zipPrevEncodeLength(NULL, len) - prevlensize;
}

/* Return the total number of bytes used by the entry pointed to by 'p'. */
static unsigned int zipRawEntryLength(unsigned char *p) {
    unsigned int prevlensize, encoding, lensize, len;
    // Read the size of the prevlen field into prevlensize.
    ZIP_DECODE_PREVLENSIZE(p, prevlensize);
    // encoding: encoding type of the current entry
    // lensize: number of bytes used to encode the current entry's length
    // len: length of the current entry's value
    ZIP_DECODE_LENGTH(p + prevlensize, encoding, lensize, len);
    // Return the total number of bytes occupied.
    return prevlensize + lensize + len;
}

/* Check if string pointed to by 'entry' can be encoded as an integer.
 * Stores the integer value in 'v' and its encoding in 'encoding'. */
static int zipTryEncoding(unsigned char *entry, unsigned int entrylen, long long *v, unsigned char *encoding) {
    long long value;
    // Skip strings that are too long or empty.
    if (entrylen >= 32 || entrylen == 0) return 0;
    // Try to convert the string to a long long.
    if (string2ll((char*)entry,entrylen,&value)) {
        /* Great, the string can be encoded. Check what's the smallest
         * of our encoding types that can hold this value. */
        // Conversion succeeded: check the encodings from smallest to
        // largest and pick the first one that can hold value.
        if (value >= 0 && value <= 12) {
            *encoding = ZIP_INT_IMM_MIN+value;
        } else if (value >= INT8_MIN && value <= INT8_MAX) {
            *encoding = ZIP_INT_8B;
        } else if (value >= INT16_MIN && value <= INT16_MAX) {
            *encoding = ZIP_INT_16B;
        } else if (value >= INT24_MIN && value <= INT24_MAX) {
            *encoding = ZIP_INT_24B;
        } else if (value >= INT32_MIN && value <= INT32_MAX) {
            *encoding = ZIP_INT_32B;
        } else {
            *encoding = ZIP_INT_64B;
        }
        // Store the value through the pointer v.
        *v = value;
        // Report success.
        return 1;
    }
    // Conversion failed.
    return 0;
}
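The selection order in `zipTryEncoding` can be tried out in isolation. The sketch below copies only the range checks into a hypothetical function, `demo_pick_int_encoding`, that returns the encoding byte for a given value; the numeric constants mirror the `ZIP_INT_*` definitions shown earlier, and the string-to-integer conversion step is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-alone version of the range checks in zipTryEncoding.
 * Returns the encoding byte that would be chosen for 'value'. */
unsigned char demo_pick_int_encoding(long long value) {
    if (value >= 0 && value <= 12)
        return (unsigned char)(0xf1 + value);      /* 4-bit immediate */
    if (value >= INT8_MIN && value <= INT8_MAX)
        return 0xfe;                               /* ZIP_INT_8B */
    if (value >= INT16_MIN && value <= INT16_MAX)
        return 0xc0;                               /* ZIP_INT_16B */
    if (value >= -8388608LL && value <= 8388607LL)
        return 0xf0;                               /* ZIP_INT_24B */
    if (value >= INT32_MIN && value <= INT32_MAX)
        return 0xd0;                               /* ZIP_INT_32B */
    return 0xe0;                                   /* ZIP_INT_64B */
}
```

Small non-negative values cost no payload at all: 12 is stored inside the encoding byte `0xfd`, while 13 falls through to the 8-bit encoding `0xfe` with a 1-byte payload.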

/* Store integer 'value' at 'p', encoded as 'encoding' */
static void zipSaveInteger(unsigned char *p, int64_t value, unsigned char encoding) {
    int16_t i16;
    int32_t i32;
    int64_t i64;
    if (encoding == ZIP_INT_8B) {
        // 8-bit signed integer:
        // the value takes a single byte.
        ((int8_t*)p)[0] = (int8_t)value;
    } else if (encoding == ZIP_INT_16B) {
        // int16_t integer:
        // the value takes 2 bytes; i16 holds it as an int16_t.
        i16 = value;
        // Copy the bytes of i16 to p.
        memcpy(p,&i16,sizeof(i16));
        // Convert to little endian on big-endian hosts.
        memrev16ifbe(p);
    } else if (encoding == ZIP_INT_24B) {
        // 24-bit signed integer: shift the value into the upper 3 bytes
        // of an int32_t and copy only those 3 bytes.
        i32 = value<<8;
        memrev32ifbe(&i32);
        memcpy(p,((uint8_t*)&i32)+1,sizeof(i32)-sizeof(uint8_t));
    } else if (encoding == ZIP_INT_32B) {
        // int32_t integer.
        i32 = value;
        memcpy(p,&i32,sizeof(i32));
        // Convert to little endian on big-endian hosts.
        memrev32ifbe(p);
    } else if (encoding == ZIP_INT_64B) {
        // int64_t integer.
        i64 = value;
        memcpy(p,&i64,sizeof(i64));
        // Convert to little endian on big-endian hosts.
        memrev64ifbe(p);
    } else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) {
        /* Nothing to do, the value is stored in the encoding itself. */
    } else {
        assert(NULL);
    }
}

/* Read integer encoded as 'encoding' from 'p' */
static int64_t zipLoadInteger(unsigned char *p, unsigned char encoding) {
    int16_t i16;
    int32_t i32;
    int64_t i64, ret = 0;
    if (encoding == ZIP_INT_8B) {
        // 8-bit signed integer stored in a single byte:
        // view p as int8_t and read the first byte into ret.
        ret = ((int8_t*)p)[0];
    } else if (encoding == ZIP_INT_16B) {
        // int16_t integer: copy the bytes out,
        memcpy(&i16,p,sizeof(i16));
        // then convert from little endian on big-endian hosts.
        memrev16ifbe(&i16);
        ret = i16;
    } else if (encoding == ZIP_INT_32B) {
        memcpy(&i32,p,sizeof(i32));
        memrev32ifbe(&i32);
        ret = i32;
    } else if (encoding == ZIP_INT_24B) {
        // 24-bit signed integer: load the 3 bytes into the upper part of
        // an int32_t, then shift right to restore the value and its sign.
        i32 = 0;
        memcpy(((uint8_t*)&i32)+1,p,sizeof(i32)-sizeof(uint8_t));
        memrev32ifbe(&i32);
        ret = i32>>8;
    } else if (encoding == ZIP_INT_64B) {
        memcpy(&i64,p,sizeof(i64));
        memrev64ifbe(&i64);
        ret = i64;
    } else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) {
        // 4-bit immediate: the value is the low 4 bits minus 1.
        ret = (encoding & ZIP_INT_IMM_MASK)-1;
    } else {
        assert(NULL);
    }
    return ret;
}
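The 24-bit case is the least obvious of the integer encodings, since C has no 3-byte integer type. The following sketch shows one portable way to get the same on-disk result: the three low bytes in little-endian order, with the sign restored on load. It does not reproduce the shift-and-`memcpy` trick used by `zipSaveInteger` and `zipLoadInteger`, and the `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable writer: stores the low 24 bits of 'value' as
 * 3 little-endian bytes. The caller must keep 'value' within the
 * 24-bit signed range (-8388608 .. 8388607). */
void demo_save_int24(unsigned char *p, int32_t value) {
    uint32_t u = (uint32_t)value;
    p[0] = (unsigned char)(u & 0xff);
    p[1] = (unsigned char)((u >> 8) & 0xff);
    p[2] = (unsigned char)((u >> 16) & 0xff);
}

/* Hypothetical portable reader: rebuilds the value and sign-extends it
 * from bit 23. */
int32_t demo_load_int24(const unsigned char *p) {
    uint32_t u = (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
    if (u & 0x800000u)
        return (int32_t)u - 0x1000000;   /* negative: subtract 2^24 */
    return (int32_t)u;
}

/* Saves and reloads 'value' through a 3-byte buffer. */
int32_t demo_int24_roundtrip(int32_t value) {
    unsigned char buf[3];
    demo_save_int24(buf, value);
    return demo_load_int24(buf);
}
```

Both ends of the range survive the round trip, for instance `demo_int24_roundtrip(-8388608)` and `demo_int24_roundtrip(8388607)`.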

/* Return a struct with all information about an entry. */
static zlentry zipEntry(unsigned char *p) {
    zlentry e;
    // e.prevrawlensize: number of bytes used to encode the previous entry's length
    // e.prevrawlen: length of the previous entry
    // ZIP_DECODE_PREVLEN decodes p and fills in both fields.
    ZIP_DECODE_PREVLEN(p, e.prevrawlensize, e.prevrawlen);

    // ZIP_DECODE_LENGTH decodes the rest of the header:
    // e.encoding: encoding type of the entry value
    // e.lensize: number of bytes used to encode the value's length
    // e.len: length of the entry value
    ZIP_DECODE_LENGTH(p + e.prevrawlensize, e.encoding, e.lensize, e.len);
    // Compute the size of the entry header.
    e.headersize = e.prevrawlensize + e.lensize;
    // Remember the pointer to the entry.
    e.p = p;
    return e;
}

The helpers above are mostly static functions, so they can only be used inside this file.

Next, let's look at several functions that operate on a ziplist:

(1) Creating a new, empty ziplist:

/* Create a new empty ziplist. */
unsigned char *ziplistNew(void) {
    // ZIPLIST_HEADER_SIZE is the size of the ziplist header;
    // the extra 1 byte is for the ZIP_END marker.
    unsigned int bytes = ZIPLIST_HEADER_SIZE+1;

    // Allocate space for the header and the end marker.
    unsigned char *zl = zmalloc(bytes);

    // Initialize the header fields.
    ZIPLIST_BYTES(zl) = intrev32ifbe(bytes);
    ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(ZIPLIST_HEADER_SIZE);
    ZIPLIST_LENGTH(zl) = 0;

    // Set the end marker.
    zl[bytes-1] = ZIP_END;

    // Return the new, empty ziplist.
    return zl;
}

(2) Deleting entries from a ziplist:

/* Resize the ziplist.
 *
 * Changes the size of the ziplist to 'len' bytes. When the ziplist is
 * smaller than 'len', growing it leaves the existing contents unchanged. */
static unsigned char *ziplistResize(unsigned char *zl, unsigned int len) {
    // zrealloc preserves the existing contents when growing.
    zl = zrealloc(zl,len);

    // Update the zlbytes field.
    ZIPLIST_BYTES(zl) = intrev32ifbe(len);

    // Set the end marker at the new end.
    zl[len-1] = ZIP_END;
    return zl;
}

/* When an entry is inserted, we need to set the prevlen field of the next
 * entry to equal the length of the inserted entry. It can occur that this
 * length cannot be encoded in 1 byte and the next entry needs to be grow
 * a bit larger to hold the 5-byte encoded prevlen. This can be done for free,
 * because this only happens when an entry is already being inserted (which
 * causes a realloc and memmove). However, encoding the prevlen may require
 * that this entry is grown as well. This effect may cascade throughout
 * the ziplist when there are consecutive entries with a size close to
 * ZIP_BIGLEN, so we need to check that the prevlen can be encoded in every
 * consecutive entry.
 * 
 * Note that this effect can also happen in reverse, where the bytes required
 * to encode the prevlen field can shrink. This effect is deliberately ignored,
 * because it can cause a "flapping" effect where a chain prevlen fields is
 * first grown and then shrunk again after consecutive inserts. Rather, the
 * field is allowed to stay larger than necessary, because a large prevlen
 * field implies the ziplist is holding large entries anyway.
 *
 * The pointer "p" points to the first entry that does NOT need to be
 * updated, i.e. consecutive fields MAY need an update. */
static unsigned char *__ziplistCascadeUpdate(unsigned char *zl, unsigned char *p) {

    // curlen records the number of bytes the ziplist currently occupies.
    size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), rawlen, rawlensize;
    size_t offset, noffset, extra;
    unsigned char *np;
    // zlentry structs describing the current entry and the one after it.
    zlentry cur, next;

    while (p[0] != ZIP_END) {

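        // Load all information about the entry at p into cur.
        cur = zipEntry(p);
        // Total number of bytes occupied by the entry at p.
        rawlen = cur.headersize + cur.len;
        // With NULL as the first argument, zipPrevEncodeLength only returns
        // the number of bytes needed to encode rawlen.
        rawlensize = zipPrevEncodeLength(NULL,rawlen);

        /* Abort if there is no next entry. */
        // The end of the list has been reached: no further entries to update.
        if (p[rawlen] == ZIP_END) break;

        // Load the information about the following entry into next.
        next = zipEntry(p+rawlen);

        /* Abort when "prevlen" has not changed. */
        // next already records the correct length for cur, so nothing more
        // needs to be done. Once one entry needs no change, the entries
        // after it need none either, because their predecessors keep
        // their sizes.
        if (next.prevrawlen == rawlen) break;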

        if (next.prevrawlensize < rawlensize) {
            /* The "prevlen" field of "next" needs more bytes to hold
             * the raw length of "cur". */
            // The prevlen field of next is too small to encode the length
            // of cur, so the header of next has to be expanded.

            // Remember the offset of p, since the realloc may move zl.
            offset = p-zl;
            // Number of extra bytes needed.
            extra = rawlensize-next.prevrawlensize;
            // Grow the ziplist by extra bytes; growing keeps the
            // existing contents intact.
            zl = ziplistResize(zl,curlen+extra);
            p = zl+offset;

            /* Current pointer and offset for next element. */
            // Address of the next entry in the resized ziplist.
            np = p+rawlen;
            // Offset of the next entry.
            noffset = np-zl;

            /* Update tail offset when next element is not the tail element. */
            if ((zl+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))) != np) {
                // np is not the tail entry, so the tail entry moves back
                // by extra bytes and the tail offset must follow.
                ZIPLIST_TAIL_OFFSET(zl) =
                    intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+extra);
            }
            // No update needed (next is the tail entry):
            //
            //  |    | next |     ==>  |    | new next      |
            //       ^                      ^
            //       |                      |
            //      tail                   tail
            //
            // Update needed (next is not the tail entry):
            //
            // | next |    |      ==>  | new next     |    |
            //        ^                       ^
            //        |                       |
            //    old  tail                old tail
            // After the update:
            // | new next    |    |
            //               ^
            //               |
            //            new tail
            //
            /* Move the tail to the back. */

            // np + rawlensize: where next's own data starts after the expansion
            // np + next.prevrawlensize: where next's own data starts before it

            // Shift everything after next's prevlen field backwards to make
            // room for the larger prevlen field.
            //
            // Example:
            //  | header | value |  ==> | header |    | value |  ==> | header       | value |
            //                                   |<-->|
            //                          room made for the new header
            memmove(np+rawlensize,
                np+next.prevrawlensize,
                curlen-noffset-next.prevrawlensize-1);
            // Encode the length of the preceding entry (cur) into np.
            zipPrevEncodeLength(np,rawlen);

            /* Advance the cursor */
            // Move on to the next entry.
            p += rawlen;
            // Update the byte count of the ziplist.
            curlen += extra;
        } else {
            // If next.prevrawlensize > rawlensize, next has a 5-byte
            // prevlen field while rawlen needs only 1 byte.
            // The field is never shrunk, so rawlen is simply written
            // into the existing 5-byte field.
            if (next.prevrawlensize > rawlensize) {
                /* This would result in shrinking, which we want to avoid.
                 * So, set "rawlen" in the available bytes. */
                zipPrevEncodeLengthForceLarge(p+rawlen,rawlen);
            } else {
                // The prevlen field of next is exactly the size needed
                // to encode the length of cur.
                zipPrevEncodeLength(p+rawlen,rawlen);
            }

            /* Stop here, as the raw length of "next" has not changed. */
            // The entries that follow need no expansion.
            break;
        }
    }
    return zl;
}
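The cascade is easier to follow on sizes alone. The sketch below is a simplified model, not the real function: each entry is reduced to its total size and the size of its prevlen field, and the loop grows prevlen fields from 1 to 5 bytes until it reaches an entry whose field is already large enough. The `demo_*` names, the fixed array bound and the uniform-entry helper are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_ENTRIES 16

/* Bytes needed for a prevlen field that stores 'len'. */
size_t demo_prevlen_size(size_t len) {
    return len < 254 ? 1 : 5;
}

/* Simplified model of the cascade. total[i] is the full size of entry i and
 * prevsize[i] the size of its prevlen field. 'new_len' is the size of the
 * entry that now precedes entry 0. Returns how many entries had to grow. */
size_t demo_cascade(size_t *total, size_t *prevsize, size_t n, size_t new_len) {
    size_t prev = new_len, grown = 0, i;
    for (i = 0; i < n; i++) {
        size_t need = demo_prevlen_size(prev);
        if (need <= prevsize[i]) break;   /* fits; fields are never shrunk */
        total[i] += need - prevsize[i];   /* entry grows by the difference */
        prevsize[i] = need;
        grown++;
        prev = total[i];                  /* its new size affects the next */
    }
    return grown;
}

/* Convenience wrapper: n entries of 'entry_len' bytes each, all with 1-byte
 * prevlen fields, preceded by a new entry of 'new_len' bytes. */
size_t demo_cascade_uniform(size_t n, size_t entry_len, size_t new_len) {
    size_t total[DEMO_MAX_ENTRIES], prevsize[DEMO_MAX_ENTRIES], i;
    if (n > DEMO_MAX_ENTRIES) n = DEMO_MAX_ENTRIES;
    for (i = 0; i < n; i++) {
        total[i] = entry_len;
        prevsize[i] = 1;
    }
    return demo_cascade(total, prevsize, n, new_len);
}
```

With four 250-byte entries and a 300-byte entry inserted in front, every entry grows to 254 bytes and pushes its successor over the limit, so `demo_cascade_uniform(4, 250, 300)` reports 4. With 249-byte entries only the first one grows, because 253 still fits in a 1-byte field.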

/* Delete "num" entries, starting at "p". Returns pointer to the ziplist. */
static unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) {
    unsigned int i, totlen, deleted = 0;
    size_t offset;
    int nextdiff = 0;
    zlentry first, tail;
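    // first records all information about the entry at p.
    first = zipEntry(p);
    // Count the entries that will actually be deleted.
    for (i = 0; p[0] != ZIP_END && i < num; i++) {
        // zipRawEntryLength returns the number of bytes the entry at p occupies.
        p += zipRawEntryLength(p);
        deleted++;
    }
    // totlen is the total number of bytes occupied by the deleted entries.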
    totlen = p-first.p;
    if (totlen > 0) {
        if (p[0] != ZIP_END) {

            //Reaching here means at least one entry follows the deleted range


            /* Storing `prevrawlen` in this entry may increase or decrease the
             * number of bytes required compared to the current `prevrawlen`.
             * There always is room to store this, because it was previously
             * stored by an entry that is now being deleted. */

            //The first surviving entry's prevlen field may be the wrong size
            //for the new predecessor length, so compute the size difference
            nextdiff = zipPrevLenByteDiff(p,first.prevrawlen);
            //Move p back by nextdiff bytes to make room for the resized header
            p -= nextdiff;
            //Encode the length of first's predecessor into p
            zipPrevEncodeLength(p,first.prevrawlen);

            /* Update offset for tail */
            //Update the offset to the tail entry
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))-totlen);

            /* When the tail contains more than one entry, we need to take
             * "nextdiff" in account as well. Otherwise, a change in the
             * size of prevlen doesn't have an effect on the *tail* offset. */

            //If more than one entry follows the deleted range,
            //nextdiff must also be added to the tail offset
            //so that it still points at the tail entry
            tail = zipEntry(p);
            if (p[tail.headersize+tail.len] != ZIP_END) {
                ZIPLIST_TAIL_OFFSET(zl) =
                   intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
            }

            /* Move tail to the front of the ziplist */
            //Move the surviving tail data forward over the deleted entries
            memmove(first.p,p,
                intrev32ifbe(ZIPLIST_BYTES(zl))-(p-zl)-1);
        } else {
            //Reaching here means no entry follows the deleted range
            /* The entire tail was deleted. No need to move memory. */
            //The predecessor of first becomes the new tail entry
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe((first.p-zl)-first.prevrawlen);
        }

        //Shrink the ziplist and update its entry count
        /* Resize and update length */
        offset = first.p-zl;
        zl = ziplistResize(zl, intrev32ifbe(ZIPLIST_BYTES(zl))-totlen+nextdiff);
        ZIPLIST_INCR_LENGTH(zl,-deleted);
        p = zl+offset;

        /* When nextdiff != 0, the raw length of the next entry has changed, so
         * we need to cascade the update throughout the ziplist */
        //If the size of the entry at p changed, run a cascade update
        //so that every following entry still satisfies the ziplist encoding
        if (nextdiff != 0)
            zl = __ziplistCascadeUpdate(zl,p);
    }
    return zl;
}

Wrappers built on the delete function:

/* Delete a single entry from the ziplist, pointed to by *p.
 * Also update *p in place, to be able to iterate over the
 * ziplist, while deleting entries.
 * */
unsigned char *ziplistDelete(unsigned char *zl, unsigned char **p) {
    //__ziplistDelete reallocates zl, which may change its address,
    //so record the offset of *p first; it is used after the deletion
    //to point *p back at the correct position
    size_t offset = *p-zl;
    //Delete exactly one entry
    zl = __ziplistDelete(zl,*p,1);

    /* Store pointer to current element in p, because ziplistDelete will
     * do a realloc which might result in a different "zl"-pointer.
     * When the delete direction is back to front, we might delete the last
     * entry and end up with "p" pointing to ZIP_END, so check this. */
    *p = zl+offset;
    return zl;
}

/* Delete a range of entries from the ziplist.
 *
 * Deletes "num" consecutive entries from zl, starting at the entry
 * selected by "index".
 * */
unsigned char *ziplistDeleteRange(unsigned char *zl, unsigned int index, unsigned int num) {
    //Locate the entry by index
    unsigned char *p = ziplistIndex(zl,index);
    //p == NULL means no entry exists at that index, so zl is returned unchanged;
    //otherwise __ziplistDelete performs the deletion
    return (p == NULL) ? zl : __ziplistDelete(zl,p,num);
}
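
The delete wrappers above rely on one recurring pattern: remember a pointer as an offset before a reallocation, then rebuild the pointer from the possibly moved base address. Here is a minimal standalone sketch of that pattern using plain `realloc`. The helper name `sketch_resize` is hypothetical and is not part of Redis.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Resize a heap buffer while keeping a cursor into it valid.
 * The cursor is stored as an offset, because realloc may move the block
 * and leave the old pointer value dangling. */
static unsigned char *sketch_resize(unsigned char *buf, size_t newlen,
                                    unsigned char **cursor) {
    size_t offset = (size_t)(*cursor - buf);   /* position before realloc */
    unsigned char *moved = realloc(buf, newlen);
    assert(moved != NULL);                     /* sketch: no error recovery */
    *cursor = moved + offset;                  /* same position, new base */
    return moved;
}
```

The caller must use the returned base pointer and the updated cursor afterwards; the old values are no longer safe to dereference.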

(3) Inserting into a ziplist:

<span style="font-size:18px;">/* Insert item at "p".
 *
 * Inserts the string s of length slen into zl at the position given by p.
 *
 * Returns the ziplist after the insertion.
 * */
static unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
    //curlen is the total number of bytes the ziplist currently occupies
    size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen;
    unsigned int prevlensize, prevlen = 0;
    size_t offset;
    int nextdiff = 0;
    unsigned char encoding = 0;
    long long value = 123456789; /* initialized to avoid warning. Using a value
                                    that is easy to see if for some reason
                                    we use it uninitialized. */
    zlentry tail;

    /* Find out prevlen for the entry that is inserted. */
    if (p[0] != ZIP_END) {
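        //p does not point at the end marker, so the list is non-empty
        //and p points at an existing entry.
        //The ZIP_DECODE_PREVLEN macro decodes the entry at p:
        //prevlensize receives the size in bytes of the prevlen field,
        //prevlen receives the length of the preceding entry
        ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
    } else {
        //p points at the end marker, so check the tail pointer:
        //   (1) if ptail also points at ZIP_END, the list is empty
        //   (2) otherwise ptail points at the last entry of the list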
        unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);
        if (ptail[0] != ZIP_END) {
            //The list is non-empty:
            //prevlen is the total size in bytes of the entry ptail points to
            prevlen = zipRawEntryLength(ptail);
        }
    }

    /* See if the entry can be encoded */
    //zipTryEncoding checks whether the string s can be encoded as an integer.
    //On success the integer is stored in value and the encoding type in encoding.
    if (zipTryEncoding(s,slen,&value,&encoding)) {
        //Encoding succeeded: reqlen is the number of bytes the integer needs
        /* 'encoding' is set to the appropriate integer encoding */
        reqlen = zipIntSize(encoding);
    } else {
        //Encoding failed: reqlen is the length of the string s
        /* 'encoding' is untouched, however zipEncodeLength will use the
         * string length to figure out how to encode it. */
        reqlen = slen;
    }
    /* We need space for both the length of the previous entry and
     * the length of the payload. */
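    //With a NULL target, zipPrevEncodeLength only returns the bytes needed to encode prevlen
    reqlen += zipPrevEncodeLength(NULL,prevlen);
    //Likewise, add the bytes needed for the encoding/length header of the new entry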
    reqlen += zipEncodeLength(NULL,encoding,slen);

    /* When the insert position is not equal to the tail, we need to
     * make sure that the next entry can hold this entry's length in
     * its prevlen field. */
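    //Unless the new entry is appended at the end of the list,
    //check whether the header of the entry at p can encode the new entry's length.
    //nextdiff is the size difference between the new and old prevlen field;
    //a positive value means the header of the entry at p must grow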

    nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0;

    /* Store offset because a realloc may change the address of zl. */
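    //Reallocation may change the address of zl,
    //so record the offset from zl to p first and restore p from it afterwards
    offset = p-zl;
    //curlen: current size of the ziplist
    //reqlen: size of the whole new entry
    //nextdiff: size change of the next entry's header (-4, 0, or 4 bytes)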
    zl = ziplistResize(zl,curlen+reqlen+nextdiff);
    p = zl+offset;

    /* Apply memory move when necessary and update tail offset. */
    if (p[0] != ZIP_END) {
        //p points at an existing entry rather than the end marker
        /* Subtract one because of the ZIP_END bytes */
        //Entries follow the new one and must be shifted:
        //move them back to make room for the new entry
        memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);

        /* Encode this entry's raw length in the next entry. */
        //Encode the new entry's length into the entry that follows it:
        //p+reqlen locates the following entry,
        //reqlen is the size of the new entry
        zipPrevEncodeLength(p+reqlen,reqlen);

        /* Update offset for tail */
        //Update the tail offset to account for the size of the new entry
        ZIPLIST_TAIL_OFFSET(zl) =
            intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen);

        /* When the tail contains more than one entry, we need to take
         * "nextdiff" in account as well. Otherwise, a change in the
         * size of prevlen doesn't have an effect on the *tail* offset. */
        //If more than one entry follows the new entry,
        //nextdiff must also be added to the tail offset
        //so that it still points at the tail entry
        tail = zipEntry(p+reqlen);
        if (p[reqlen+tail.headersize+tail.len] != ZIP_END) {
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
        }
    } else {
        //The new entry becomes the tail entry
        /* This element will be the new tail. */
        ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl);
    }

    /* When nextdiff != 0, the raw length of the next entry has changed, so
     * we need to cascade the update throughout the ziplist */
    //When nextdiff != 0, the header size of the entry after the new one changed,
    //so the following entries need a cascade update
    if (nextdiff != 0) {
        offset = p-zl;
        zl = __ziplistCascadeUpdate(zl,p+reqlen);
        p = zl+offset;
    }

    /* Write the entry */
    //zipPrevEncodeLength encodes the previous entry's length, writes it at p,
    //and returns the number of bytes used
    p += zipPrevEncodeLength(p,prevlen);
    //zipEncodeLength encodes the encoding type and slen, writes them at p, and returns the number of bytes used
    p += zipEncodeLength(p,encoding,slen);
    //String payload
    if (ZIP_IS_STR(encoding)) {
        //Copy the raw bytes
        memcpy(p,s,slen);
    } else {
        //Otherwise write the integer value at p using the given encoding
        zipSaveInteger(p,value,encoding);
    }
    //Increment the entry counter of the list
    ZIPLIST_INCR_LENGTH(zl,1);
    return zl;
}</span>
   A wrapper built on the insert function:

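/*
 * Push the string s of length slen onto zl.
 *
 * The where argument selects the direction:
 * - ZIPLIST_HEAD pushes the new value at the head
 * - any other value pushes it at the end of the list
 *
 * Returns the ziplist after the new value has been added.
 * */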
unsigned char *ziplistPush(unsigned char *zl, unsigned char *s, unsigned int slen, int where) {
    unsigned char *p;
    //where decides whether the value goes to the head or the end
    p = (where == ZIPLIST_HEAD) ? ZIPLIST_ENTRY_HEAD(zl) : ZIPLIST_ENTRY_END(zl);
    //Delegate to the insert function
    return __ziplistInsert(zl,p,s,slen);
}
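
The insert function computes reqlen as the sum of three parts: the prevlen field, the encoding/length header, and the payload. The sketch below reproduces that arithmetic for string payloads only. It is an illustration under stated assumptions rather than the Redis code: the function names are hypothetical, and the thresholds (1-byte header up to 63 bytes, 2-byte header up to 16383 bytes, 5 bytes beyond that, and a 1-byte or 5-byte prevlen field split at 254) are assumed from the ziplist string encoding.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes used by the encoding/length header of a string entry (assumed rule). */
static size_t sketch_str_header_size(size_t slen) {
    if (slen <= 0x3f) return 1;     /* 6-bit length */
    if (slen <= 0x3fff) return 2;   /* 14-bit length */
    return 5;                       /* marker byte + 4-byte length */
}

/* Total bytes a new string entry needs, given the raw length of the
 * entry that will precede it (0 when it becomes the first entry). */
static size_t sketch_entry_size(size_t prevlen, size_t slen) {
    size_t prevlen_field = prevlen < 254 ? 1 : 5;
    return prevlen_field + sketch_str_header_size(slen) + slen;
}
```

For example, a 5-byte string pushed onto an empty list needs 1 + 1 + 5 = 7 bytes under these assumptions.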
(3) Ziplist lookups (entry at an index, previous entry, next entry, and so on):

/* Returns an offset to use for iterating with ziplistNext. When the given
 * index is negative, the list is traversed back to front. When the list
 * doesn't contain an element at the provided index, NULL is returned.
 *
 * Non-negative indexes start at 0 (the head); negative indexes start
 * at -1 (the tail).
 * */
unsigned char *ziplistIndex(unsigned char *zl, int index) {
    unsigned char *p;
    unsigned int prevlensize, prevlen = 0;
    if (index < 0) {
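        //Negative index: convert it to a step count from the tail
        index = (-index)-1;
        //Locate the tail entry
        p = ZIPLIST_ENTRY_TAIL(zl);
        if (p[0] != ZIP_END) {
            //The list is non-empty.
            //ZIP_DECODE_PREVLEN decodes the entry at p:
            //prevlensize: size in bytes of the prevlen field
            //prevlen: length of the preceding entry
            ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
            //Walk from the tail towards the head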
            while (prevlen > 0 && index--) {
                p -= prevlen;
                ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
            }
        }
    } else {
        //Non-negative index (index >= 0):
        //locate the head entry
        p = ZIPLIST_ENTRY_HEAD(zl);
        //and walk forward from the head
        while (p[0] != ZIP_END && index--) {
            p += zipRawEntryLength(p);
        }
    }
    //NULL if the walk hit the end marker or ran out of entries
    return (p[0] == ZIP_END || index > 0) ? NULL : p;
}
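
The index convention used by ziplistIndex (0 is the head, -1 is the tail) can be sketched without any ziplist at all. The helper below is hypothetical and only models which position an index resolves to in a list of `count` entries; it does not touch ziplist memory.

```c
#include <assert.h>

/* Resolve an index to a 0-based position counted from the head.
 * Returns -1 when the index is out of range, mirroring the NULL result. */
static long sketch_resolve_index(long count, long index) {
    if (index < 0) {
        long steps = (-index) - 1;   /* -1 -> 0 steps back from the tail */
        return steps < count ? count - 1 - steps : -1;
    }
    return index < count ? index : -1;
}
```

With three entries, index -1 resolves to position 2 and index -3 to position 0, while -4 and 3 are both out of range.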

/* Return pointer to next entry in ziplist.
 *
 * zl is the pointer to the ziplist
 * p is the pointer to the current element
 *
 * The element after 'p' is returned, otherwise NULL if we are at the end
 * (p is the end marker, or p is already the tail entry).
 * */
unsigned char *ziplistNext(unsigned char *zl, unsigned char *p) {
    ((void) zl);

    /* "p" could be equal to ZIP_END, caused by ziplistDelete,
     * and we should return NULL. Otherwise, we should return NULL
     * when the *next* element is ZIP_END (there is no next entry). */
    //p already points at the end marker
    if (p[0] == ZIP_END) {
        return NULL;
    }
    //Advance to the entry after p
    p += zipRawEntryLength(p);
    if (p[0] == ZIP_END) {
        //The original p was the tail entry, so there is no next entry
        return NULL;
    }
    
    return p;
}

/* Return pointer to previous entry in ziplist.
 *
 * Returns NULL when the list is empty or when p already points at the
 * head entry.
 * */
unsigned char *ziplistPrev(unsigned char *zl, unsigned char *p) {
    unsigned int prevlensize, prevlen = 0;

    /* Iterating backwards from ZIP_END should return the tail. When "p" is
     * equal to the first element of the list, we're already at the head,
     * and should return NULL. */
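    //p points at the end marker
    if (p[0] == ZIP_END) {
        //Locate the tail entry
        p = ZIPLIST_ENTRY_TAIL(zl);
        //If the tail pointer is also the end marker the list is empty: return NULL;
        //otherwise the tail entry is the previous entry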
        return (p[0] == ZIP_END) ? NULL : p;
    } else if (p == ZIPLIST_ENTRY_HEAD(zl)) {
        //p is the head entry, so it has no previous entry
        return NULL;
    } else {
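        //p is neither the head entry nor the end marker.
        //The ZIP_DECODE_PREVLEN macro yields the prevlen field size and the previous entry's length
        ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
        //A non-head entry must have a predecessor
        assert(prevlen > 0);
        //Step back to the previous entry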
        return p-prevlen;
    }
}
(4) Reading an entry's value:

<span style="font-size:18px;">/* Get entry pointed to by 'p' and store in either '*sstr' or 'sval' depending
 * on the encoding of the entry. '*sstr' is always set to NULL to be able
 * to find out whether the string pointer or the integer value was set.
 * Return 0 if 'p' points to the end of the ziplist, 1 otherwise.
 *
 * - For a string entry, the string pointer is stored in *sstr and its
 *   length in *slen.
 * - For an integer entry, the value is stored in *sval.
 *
 * Also returns 0 when p is NULL.
 * */

unsigned int ziplistGet(unsigned char *p, unsigned char **sstr, unsigned int *slen, long long *sval) {
    zlentry entry;
    //p is NULL or points at the end marker: nothing to read, return 0
    if (p == NULL || p[0] == ZIP_END) return 0;
    if (sstr) *sstr = NULL;
    //Decode the entry at p into entry
    entry = zipEntry(p);
    if (ZIP_IS_STR(entry.encoding)) {
        //String entry: store the length in *slen and the pointer in *sstr
        if (sstr) {
            //String length
            *slen = entry.len;
            //Pointer to the string bytes inside the ziplist (not a copy)
            *sstr = p+entry.headersize;
        }
    } else {
        //Integer entry
        if (sval) {
            //zipLoadInteger reads the integer stored at the given address using entry.encoding
            *sval = zipLoadInteger(p+entry.headersize,entry.encoding);
        }
    }
    return 1;
}
</span>




 (5) Comparing entry values:

/* Compare entry pointed to by 'p' with 'sstr' of length 'slen'. */
/* Return 1 if equal, 0 otherwise.
 * */
unsigned int ziplistCompare(unsigned char *p, unsigned char *sstr, unsigned int slen) {
    zlentry entry;
    unsigned char sencoding;
    long long zval, sval;
    //p points at the end marker: there is no entry to compare
    if (p[0] == ZIP_END) return 0;
    //Decode the entry at p into entry
    entry = zipEntry(p);
    if (ZIP_IS_STR(entry.encoding)) {
        //String entry: compare the raw bytes
        /* Raw compare */
        if (entry.len == slen) {
            //Lengths match, so compare the contents with memcmp
            return memcmp(p+entry.headersize,sstr,slen) == 0;
        } else {
            //Different lengths can never be equal
            return 0;
        }
    } else {
        //Integer entry
        /* Try to compare encoded values. Don't compare encoding because
         * different implementations may encode integers differently. */

        //zipTryEncoding checks whether sstr can be encoded as an integer;
        //on success the integer is stored in sval and the encoding type in sencoding
        if (zipTryEncoding(sstr,slen,&sval,&sencoding)) {
          //Encoding succeeded: load the integer stored in the entry at p
          zval = zipLoadInteger(p+entry.headersize,entry.encoding);
          return zval == sval;
        }
    }
    return 0;
}

/* Find pointer to the entry equal to the specified entry. Skip 'skip' entries
 * between every comparison. Returns NULL when the field could not be found.
 * */
unsigned char *ziplistFind(unsigned char *p, unsigned char *vstr, unsigned int vlen, unsigned int skip) {
    int skipcnt = 0;
    unsigned char vencoding = 0;
    long long vll = 0;
    //Iterate until the end marker is reached
    while (p[0] != ZIP_END) {
        unsigned int prevlensize, encoding, lensize, len;
        unsigned char *q;
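        //Size in bytes of the prevlen field
        ZIP_DECODE_PREVLENSIZE(p, prevlensize);
        //Decode the rest of the entry header:
        //encoding: encoding type of the entry
        //lensize: bytes used to encode the entry's length
        //len: length of the entry's content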
        ZIP_DECODE_LENGTH(p + prevlensize, encoding, lensize, len);
        q = p + prevlensize + lensize;

        if (skipcnt == 0) {
            /* Compare current entry with specified entry */
            if (ZIP_IS_STR(encoding)) {
                //String entry:
                //compare lengths, then bytes
                if (len == vlen && memcmp(q, vstr, vlen) == 0) {
                    return p;
                }
            } else {
                /* Find out if the searched field can be encoded. Note that
                 * we do it only the first time, once done vencoding is set
                 * to non-zero and vll is set to the integer value. */
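                //vstr may itself be encodable as an integer,
                //so on the first integer comparison the program tries to encode it.
                //This attempt happens only once
                if (vencoding == 0) {
                    //zipTryEncoding tries to encode vstr as an integer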
                    if (!zipTryEncoding(vstr, vlen, &vll, &vencoding)) {
                        /* If the entry can't be encoded we set it to
                         * UCHAR_MAX so that we don't retry again the next
                         * time. */
                        //Encoding failed: mark vstr as not an integer
                        vencoding = UCHAR_MAX;
                    }
                    /* Must be non-zero by now */
                    assert(vencoding);
                }

                /* Compare current entry with specified entry, do it only
                 * if vencoding != UCHAR_MAX because if there is no encoding
                 * possible for the field it can't be a valid integer. */
                if (vencoding != UCHAR_MAX) {
                    //Reaching here means vstr could be encoded as an integer.
                    //q points at the entry's content; load it as an integer using its encoding
                    long long ll = zipLoadInteger(q, encoding);
                    if (ll == vll) {
                        return p;
                    }
                }
            }

            /* Reset skip count */
            skipcnt = skip;
        } else {
            /* Skip entry */
            skipcnt--;
        }

        /* Move to next entry */
        //Advance to the next entry
        p = q + len;
    }

    return NULL;
}
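
The skip counter in ziplistFind is easiest to see on a flat array. The sketch below is a hypothetical model, not the Redis function: it applies the same "compare one, then skip `skip` entries" loop to an array of C strings. One assumption worth stating is the motivating use case: when entries alternate between field and value, a skip of 1 compares only the fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the index of the first compared item equal to needle, or -1.
 * After each comparison, the next `skip` items are passed over unexamined. */
static int sketch_find(const char *const *items, size_t n,
                       const char *needle, unsigned int skip) {
    unsigned int skipcnt = 0;
    for (size_t i = 0; i < n; i++) {
        if (skipcnt == 0) {
            if (strcmp(items[i], needle) == 0) return (int)i;
            skipcnt = skip;          /* reset the skip count */
        } else {
            skipcnt--;               /* skip this item */
        }
    }
    return -1;
}
```

In a field/value layout such as {"name", "age", "age", "30"}, searching for "age" with skip 1 returns index 2 (the field), not index 1 (a value that happens to match).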

/* Return length of ziplist.
 *
 * This is the number of entries in the ziplist.
 * */
unsigned int ziplistLen(unsigned char *zl) {
    unsigned int len = 0;
    //The stored count is below UINT16_MAX, so it is exact: return it directly
    if (intrev16ifbe(ZIPLIST_LENGTH(zl)) < UINT16_MAX) {
        len = intrev16ifbe(ZIPLIST_LENGTH(zl));
    } else {
        //The stored count is saturated at UINT16_MAX, so the whole list must be traversed to count the entries
        unsigned char *p = zl+ZIPLIST_HEADER_SIZE;
        while (*p != ZIP_END) {
            p += zipRawEntryLength(p);
            len++;
        }

        /* Re-store length if small enough */
        if (len < UINT16_MAX) ZIPLIST_LENGTH(zl) = intrev16ifbe(len);
    }
    return len;
}
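
ziplistLen treats the 16-bit length field as a saturating counter: below UINT16_MAX it is exact, and at UINT16_MAX it only signals "count by traversal". A minimal model of that behaviour follows. The struct and function names are hypothetical, and the `actual` field merely stands in for the result of walking the real list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t stored;   /* what the header field would hold */
    size_t actual;     /* stand-in for a full traversal of the entries */
} sketch_list;

/* Add one entry: the stored counter stops increasing once saturated. */
static void sketch_incr(sketch_list *l) {
    l->actual++;
    if (l->stored < UINT16_MAX) l->stored++;
}

/* Exact length: trust the field only while it is below UINT16_MAX. */
static size_t sketch_len(const sketch_list *l) {
    return l->stored < UINT16_MAX ? l->stored : l->actual;
}
```

The design trades a 2-byte header field for an O(n) length query on very large lists, which fits a structure meant for small collections.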

/* Return ziplist blob size in bytes.
 *
 * This is the total memory the ziplist occupies.
 * */
size_t ziplistBlobLen(unsigned char *zl) {
    return intrev32ifbe(ZIPLIST_BYTES(zl));
}


That covers Redis's entire ziplist implementation. I have annotated the source as clearly as I could; corrections are welcome.

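Anyone who has read the source will know that insertion and deletion in a ziplist can be fairly slow. Each entry stores the length of its previous entry (prevlen), and the field that holds it takes a variable number of bytes (prevlensize). Inserting or deleting an entry can therefore change the header size of the next entry, which changes that entry's own length, and so on down the list: a cascade update. In the worst case insertion and deletion cost O(n^2), so a ziplist is a poor fit for large amounts of data and works best when the data set is small.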
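
The cascade described here can be simulated on a plain array of entry sizes. This is a rough sketch under stated assumptions, not the Redis routine: `sketch_cascade` is a hypothetical name, every existing entry is assumed to start with a 1-byte prevlen field, and the 254-byte threshold and 4-byte growth are assumed from the ziplist encoding.

```c
#include <assert.h>
#include <stddef.h>

/* Simulate inserting an entry of new_head_len bytes in front of n entries
 * whose raw sizes are in rawlen[]. Returns how many entries had to grow. */
static size_t sketch_cascade(size_t *rawlen, size_t n, size_t new_head_len) {
    size_t prev = new_head_len;
    size_t grown = 0;
    for (size_t i = 0; i < n; i++) {
        if (prev < 254) break;   /* predecessor fits in 1 byte: cascade stops */
        rawlen[i] += 4;          /* prevlen field grows from 1 to 5 bytes */
        grown++;
        prev = rawlen[i];        /* the grown entry may push the next one over */
    }
    return grown;
}
```

If every entry is exactly 253 bytes, a single large insert at the head makes all n entries grow. In the real structure each growth also moves the rest of the buffer, which is where the O(n^2) worst case comes from.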
