[redis source walkthrough] rdb persistence - use cases


An rdb file is a compressed binary file and one of the ways redis persists data. This chapter covers where rdb is used.



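Configuration

redis has two persistence mechanisms, aof and rdb. rdb is enabled by default (aof stays off unless appendonly is set to yes), and this chapter focuses on rdb.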

# redis.conf
appendonly no

standardConfig configs[] = {
    ...
    createBoolConfig("appendonly", NULL, MODIFIABLE_CONFIG, server.aof_enabled, 0, NULL, updateAppendonly),
    ...
};

void initServer(void) {
    ...
    server.aof_state = server.aof_enabled ? AOF_ON : AOF_OFF;
    ...
}

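Asynchronous persistence

redis runs its main logic in a single process and a single thread. For an expensive operation such as persistence, the main process usually forks a child process and lets the child do the work asynchronously.

// The main process forks a child process to save to disk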
int rdbSaveBackground(char *filename, rdbSaveInfo *rsi) {
    ...
    if ((childpid = redisFork()) == 0) {
        ...
        /* Child */
        retval = rdbSave(filename,rsi);
        ...
    }
    ...
}
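
To make the fork pattern concrete, here is a minimal sketch, not redis code. The names write_snapshot, save_in_background and wait_for_save are hypothetical, and the "snapshot" is just a string. It only shows the shape of the idea: the child does the slow disk write while the parent returns immediately.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: write `payload` to a temp file, then rename it over
 * `path` so a reader never sees a half-written snapshot. */
static int write_snapshot(const char *path, const char *payload) {
    char tmp[256];
    snprintf(tmp, sizeof(tmp), "%s.tmp-%d", path, (int)getpid());
    FILE *fp = fopen(tmp, "w");
    if (fp == NULL) return -1;
    size_t len = strlen(payload);
    int ok = fwrite(payload, 1, len, fp) == len;
    if (fclose(fp) != 0) ok = 0;
    if (!ok || rename(tmp, path) == -1) {
        unlink(tmp);
        return -1;
    }
    return 0;
}

/* Fork a child that saves the snapshot. Returns the child's pid in the
 * parent, or -1 if fork() failed. The parent does not block. */
pid_t save_in_background(const char *path, const char *payload) {
    pid_t childpid = fork();
    if (childpid == 0) {
        /* Child: do the disk work, then exit without running the
         * parent's atexit handlers. */
        _exit(write_snapshot(path, payload) == 0 ? 0 : 1);
    }
    return childpid; /* Parent: pid > 0, or -1 on error. */
}

/* Reap the child. Returns its exit code, or -1 if it did not exit normally. */
int wait_for_save(pid_t childpid) {
    int status;
    if (waitpid(childpid, &status, 0) == -1) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

This sketch blocks in waitpid for simplicity. A server that must stay responsive would poll for the child's exit from its periodic timer instead, which is the role checkChildrenDone() plays in the serverCron excerpt later in this chapter.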

Use cases

Snapshot use cases

Loading data at server startup

When redis starts, it loads data from the rdb file on disk into memory.

int main(int argc, char **argv) {
    ...
    if (!server.sentinel_mode) {
        loadDataFromDisk();
    }
}

/* flags on the purpose of rdb save or load */
#define RDBFLAGS_NONE 0
#define RDBFLAGS_AOF_PREAMBLE (1<<0)
#define RDBFLAGS_REPLICATION (1<<1)

/* Function called at startup to load RDB or AOF file in memory. */
void loadDataFromDisk(void) {
    long long start = ustime();
    if (server.aof_state == AOF_ON) {
        if (loadAppendOnlyFile(server.aof_filename) == C_OK)
            ...
    } else {
        rdbSaveInfo rsi = RDB_SAVE_INFO_INIT;
        if (rdbLoad(server.rdb_filename,&rsi,RDBFLAGS_NONE) == C_OK) {
            ...
        }
    }
    ...
}

Commands

  • SAVE saves to disk synchronously.
void saveCommand(client *c) {
    ...
    if (rdbSave(server.rdb_filename,rsiptr) == C_OK) {
        addReply(c,shared.ok);
    } else {
        addReply(c,shared.err);
    }
}
  • BGSAVE: the main process forks a child process that saves to disk asynchronously.
void bgsaveCommand(client *c) {
    ...
    if (server.rdb_child_pid != -1) {
        addReplyError(c,"Background save already in progress");
    } else if (hasActiveChildProcess()) {
        if (schedule) {
            server.rdb_bgsave_scheduled = 1;
            addReplyStatus(c,"Background saving scheduled");
        } else {
            ...
        }
    } else if (rdbSaveBackground(server.rdb_filename,rsiptr) == C_OK) {
        addReplyStatus(c,"Background saving started");
    }
    ...
}
  • FLUSHALL empties the databases and then, if any save points are configured, saves to disk.
void flushallCommand(client *c) {
    ...
    flushAllDataAndResetRDB(flags);
    ...
}

/* Flushes the whole server data set. */
void flushAllDataAndResetRDB(int flags) {
    server.dirty += emptyDb(-1,flags,NULL);
    if (server.rdb_child_pid != -1) killRDBChild();
    if (server.saveparamslen > 0) {
        /* Normally rdbSave() will reset dirty, but we don't want this here
         * as otherwise FLUSHALL will not be replicated nor put into the AOF. */
        int saved_dirty = server.dirty;
        rdbSaveInfo rsi, *rsiptr;
        rsiptr = rdbPopulateSaveInfo(&rsi);
        rdbSave(server.rdb_filename,rsiptr);
        server.dirty = saved_dirty;
    }
    server.dirty++;
    ...
}
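  • SHUTDOWN shuts the server down.
    While the server is running, in-memory data is usually persisted by the periodic save policy, so memory and the persisted file are generally out of sync. When the server exits or restarts normally, the in-memory data therefore has to be persisted first.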
void shutdownCommand(client *c) {
    ...
    if (prepareForShutdown(flags) == C_OK) exit(0);
    ...
}

int prepareForShutdown(int flags) {
    ...
    /* Create a new RDB file before exiting. */
    if ((server.saveparamslen > 0 && !nosave) || save) {
        ...
        rdbSaveInfo rsi, *rsiptr;
        rsiptr = rdbPopulateSaveInfo(&rsi);
        if (rdbSave(server.rdb_filename,rsiptr) != C_OK) {
            ...
        }
    }
    ...
}
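
The condition in prepareForShutdown can be read as a small truth table. The sketch below isolates it. The function name should_save_on_shutdown is hypothetical, and the numeric flag values are assumptions chosen for illustration (only SHUTDOWN_NOFLAGS appears in this chapter's excerpts).

```c
/* Flag values are assumptions for this sketch. */
#define SHUTDOWN_NOFLAGS 0
#define SHUTDOWN_SAVE 1   /* Force a save even with no save points. */
#define SHUTDOWN_NOSAVE 2 /* Skip the save even with save points. */

/* Returns 1 if an rdb file should be written before exiting, else 0.
 * Mirrors: (server.saveparamslen > 0 && !nosave) || save */
int should_save_on_shutdown(int saveparamslen, int flags) {
    int save = flags & SHUTDOWN_SAVE;
    int nosave = flags & SHUTDOWN_NOSAVE;
    return (saveparamslen > 0 && !nosave) || save;
}
```

With no flags, the decision follows the configuration: a server with save points saves on exit, and one without does not. An explicit save or nosave request overrides that default.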

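Periodic persistence

rdb persistence is triggered only when two conditions are both met:

  1. The number of data modifications.
  2. The time elapsed since the last save.
  • Default configuration
    The default configuration shows that rdb persistence is not real-time. The save interval ranges from 60 seconds (1 minute) to 900 seconds (15 minutes), so relying on rdb alone carries a fairly high risk of losing data.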
# redis.conf
################################ SNAPSHOTTING  ################################
#
# Save the DB on disk:
#
#   save <seconds> <changes>
#
#   Will save the DB if both the given number of seconds and the given
#   number of write operations against the DB occurred.
#
#   In the example below the behaviour will be to save:
#   after 900 sec (15 min) if at least 1 key changed
#   after 300 sec (5 min) if at least 10 keys changed
#   after 60 sec if at least 10000 keys changed
#
#   Note: you can disable saving completely by commenting out all "save" lines.
#
#   It is also possible to remove all the previously configured save
#   points by adding a save directive with a single empty string argument
#   like in the following example:
#
#   save ""

save 900 1
save 300 10
save 60 10000
// rdb periodic save parameters
struct saveparam {
    time_t seconds; // time interval in seconds
    int changes;    // number of changes
};
  • A timer periodically checks whether in-memory data should be persisted to rdb.
#define CONFIG_BGSAVE_RETRY_DELAY 5 /* Wait a few secs before trying again. */

struct redisServer {
    ...
    long long dirty;                /* Changes to DB from the last save */
    time_t lastsave;                /* Unix time of last successful save */
    time_t lastbgsave_try;          /* Unix time of last attempted bgsave */
    ...
};

int hasActiveChildProcess() {
    return server.rdb_child_pid != -1 ||
           server.aof_child_pid != -1 ||
           server.module_child_pid != -1;
}

int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
    ...
    if (hasActiveChildProcess() || ldbPendingChildren()) {
        // If a child process is active in the background, check whether it has terminated.
        checkChildrenDone();
    } else {
        for (j = 0; j < server.saveparamslen; j++) {
            struct saveparam *sp = server.saveparams+j;

            // One of the configured save points must be satisfied.
            // After a failed save, wait CONFIG_BGSAVE_RETRY_DELAY seconds before trying again.
            if (server.dirty >= sp->changes &&
                server.unixtime-server.lastsave > sp->seconds &&
                (server.unixtime-server.lastbgsave_try >
                 CONFIG_BGSAVE_RETRY_DELAY ||
                 server.lastbgsave_status == C_OK))
            {
                serverLog(LL_NOTICE,"%d changes in %d seconds. Saving...",
                    sp->changes, (int)sp->seconds);
                rdbSaveInfo rsi, *rsiptr;
                rsiptr = rdbPopulateSaveInfo(&rsi);
                rdbSaveBackground(server.rdb_filename,rsiptr);
                break;
            }
        }
        ...
    }
    ...
    // A BGSAVE was requested while another child process was working, so it was scheduled to run later.
    if (!hasActiveChildProcess() &&
        server.rdb_bgsave_scheduled &&
        (server.unixtime-server.lastbgsave_try > CONFIG_BGSAVE_RETRY_DELAY ||
         server.lastbgsave_status == C_OK))
    {
        rdbSaveInfo rsi, *rsiptr;
        rsiptr = rdbPopulateSaveInfo(&rsi);
        if (rdbSaveBackground(server.rdb_filename,rsiptr) == C_OK)
            server.rdb_bgsave_scheduled = 0;
    }
    ...
}
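
The save-point test in serverCron can be tried in isolation. The sketch below reuses the saveparam layout shown above. The function name matching_save_point is hypothetical, and the retry-delay condition is deliberately left out, so this covers only the "enough changes and enough time" part.

```c
#include <stddef.h>
#include <time.h>

struct saveparam {
    time_t seconds; /* time interval in seconds */
    int changes;    /* number of changes */
};

/* Returns the index of the first save point that is satisfied, or -1 if
 * none is. A save point matches when at least `changes` modifications
 * happened AND more than `seconds` seconds passed since the last save. */
int matching_save_point(const struct saveparam *params, size_t n,
                        long long dirty, time_t now, time_t lastsave) {
    for (size_t j = 0; j < n; j++) {
        if (dirty >= params[j].changes && now - lastsave > params[j].seconds)
            return (int)j;
    }
    return -1;
}
```

With the default save points (900 1, 300 10, 60 10000), a single change has to wait more than 900 seconds, while 9999 changes within 61 seconds still match nothing. That is why a crash can lose several minutes of writes.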

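Rewriting the aof file

When the aof file is rewritten with aof-use-rdb-preamble enabled, redis writes the current data set at the head of the new aof file in rdb format so that it reaches disk quickly, and appends later commands in aof format. The rdb section is saved with the RDBFLAGS_AOF_PREAMBLE flag, and on load redis recognizes it by the "REDIS" signature at the start of the file.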

# redis.conf
#
# When rewriting the AOF file, Redis is able to use an RDB preamble in the
# AOF file for faster rewrites and recoveries. When this option is turned
# on the rewritten AOF file is composed of two different stanzas:
#
#   [RDB file][AOF tail]
#
# When loading Redis recognizes that the AOF file starts with the "REDIS"
# string and loads the prefixed RDB file, and continues loading the AOF
# tail.
aof-use-rdb-preamble yes
// Rewrite the aof file
int rewriteAppendOnlyFile(char *filename) {
    ...
    startSaving(RDBFLAGS_AOF_PREAMBLE);
    if (server.aof_use_rdb_preamble) {
        int error;
        if (rdbSaveRio(&aof,&error,RDBFLAGS_AOF_PREAMBLE,NULL) == C_ERR) {
            errno = error;
            goto werr;
        }
    }
    ...
}

// Load the aof file
int loadAppendOnlyFile(char *filename) {
    ...
    char sig[5]; /* "REDIS" */
    if (fread(sig,1,5,fp) != 5 || memcmp(sig,"REDIS",5) != 0) {
        /* No RDB preamble, seek back at 0 offset. */
        if (fseek(fp,0,SEEK_SET) == -1) goto readerr;
    } else {
        ...
        // Load the rdb preamble at the head of the aof file.
        if (rdbLoadRio(&rdb,RDBFLAGS_AOF_PREAMBLE,NULL) != C_OK) {
            ...
        }
        ...
    }
    ...
}
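
The signature check is easy to try on its own. The sketch below is a stand-alone version of the fread/memcmp test above. The function name has_rdb_preamble is hypothetical, and rewinding the stream in both cases is a choice made for this sketch, so the caller can hand the file to whichever parser applies.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the stream starts with the 5-byte "REDIS" signature,
 * 0 if it does not (including files shorter than 5 bytes), and -1 if
 * seeking fails. The stream is rewound to offset 0 before returning. */
int has_rdb_preamble(FILE *fp) {
    char sig[5];
    int found = fread(sig, 1, sizeof(sig), fp) == sizeof(sig) &&
                memcmp(sig, "REDIS", sizeof(sig)) == 0;
    if (fseek(fp, 0, SEEK_SET) == -1) return -1;
    return found;
}
```

A plain aof file starts with a command in the redis protocol (for example "*2"), so the first five bytes are enough to tell the two layouts apart.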

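Terminating the process with a signal

The same save-on-exit also happens when the process is stopped by a signal. The handler for SIGTERM and SIGINT only sets server.shutdown_asap. The next serverCron run sees the flag and calls prepareForShutdown, which persists the in-memory data before the process exits.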

void initServer(void) {
    ...
    setupSignalHandlers();
    ...
}

#define SIGINT  2       /* interrupt */
#define SIGTERM 15      /* software termination signal from kill */

void setupSignalHandlers(void) {
    struct sigaction act;

    /* When the SA_SIGINFO flag is set in sa_flags then sa_sigaction is used.
     * Otherwise, sa_handler is used. */
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    act.sa_handler = sigShutdownHandler;
    sigaction(SIGTERM, &act, NULL);
    sigaction(SIGINT, &act, NULL);
    ...
}

static void sigShutdownHandler(int sig) {
    ...
    server.shutdown_asap = 1;
}

int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
    ...
    /* We received a SIGTERM, shutting down here in a safe way, as it is
     * not ok doing so inside the signal handler. */
    if (server.shutdown_asap) {
        if (prepareForShutdown(SHUTDOWN_NOFLAGS) == C_OK) exit(0);
        serverLog(LL_WARNING,"SIGTERM received but errors trying to shut down the server, check the logs for more information");
        server.shutdown_asap = 0;
    }
    ...
}
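
The "set a flag in the handler, act on it in the timer" pattern can be sketched without any redis code. All names below (sig_shutdown_handler, install_shutdown_handlers, shutdown_requested) are hypothetical. Unlike serverCron, which clears shutdown_asap only when the shutdown fails, this sketch clears the flag as soon as it is read.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Set by the signal handler, read by the periodic task. sig_atomic_t is
 * the type that is safe to write from inside a signal handler. */
static volatile sig_atomic_t shutdown_asap = 0;

static void sig_shutdown_handler(int sig) {
    (void)sig;
    shutdown_asap = 1; /* Only record the request; do no real work here. */
}

/* Install the handler for SIGTERM and SIGINT. Returns 0 on success. */
int install_shutdown_handlers(void) {
    struct sigaction act;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    act.sa_handler = sig_shutdown_handler;
    if (sigaction(SIGTERM, &act, NULL) == -1) return -1;
    if (sigaction(SIGINT, &act, NULL) == -1) return -1;
    return 0;
}

/* Called from the periodic task: returns 1 once per shutdown request. */
int shutdown_requested(void) {
    if (!shutdown_asap) return 0;
    shutdown_asap = 0;
    return 1;
}
```

Saving a file involves calls that are not safe inside a signal handler, which is why the handler is kept this small and the actual save is deferred to the normal event loop.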

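Master-replica replication

Master-replica replication can transfer the full data set as an rdb file. The rdb file can be written to disk first (disk-backed), or the transfer can be diskless: the data is never stored on disk and is sent straight to the replicas over a socket.

When a replica has just started, or has been disconnected from the master for a long time because of network problems, it may find on reconnecting that its data has diverged too far from the master's. The master then has to save its in-memory data as a compressed binary rdb file and send it to each reconnecting replica.

In a one-master, many-replica setup, a network problem can in the worst case force the master to send rdb data to many replicas at once. With a large data set this can congest the network, so keep the number of replicas attached directly to the master small. If the use case really needs many replicas, use chained replication (chained slaves, that is, slaves of slaves) so that the master is not overloaded.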

/* State of slaves from the POV of the master. Used in client->replstate.
 * In SEND_BULK and ONLINE state the slave receives new updates
 * in its output queue. In the WAIT_BGSAVE states instead the server is waiting
 * to start the next background saving in order to send updates to it. */
#define SLAVE_STATE_WAIT_BGSAVE_START 6 /* We need to produce a new RDB file. */
#define SLAVE_STATE_WAIT_BGSAVE_END 7 /* Waiting RDB file creation to finish. */
#define SLAVE_STATE_SEND_BULK 8 /* Sending RDB file to slave. */
#define SLAVE_STATE_ONLINE 9 /* RDB file transmitted, sending just updates. */

void syncCommand(client *c) {
    ...
    /* Setup the slave as one waiting for BGSAVE to start. The following code
     * paths will change the state if we handle the slave differently. */
    c->replstate = SLAVE_STATE_WAIT_BGSAVE_START;
    ...
}
void replicationCron(void) {
    ...
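    /* With diskless replication, the master waits for a while
     * (repl_diskless_sync_delay) before starting the transfer, so that other
     * replicas that connect in the meantime and also need a full sync can
     * share the same transfer. */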
    if (!hasActiveChildProcess()) {
        time_t idle, max_idle = 0;
        int slaves_waiting = 0;
        int mincapa = -1;
        listNode *ln;
        listIter li;

        // Walk the replicas and check whether any of them is waiting for a full sync.
        listRewind(server.slaves,&li);
        while((ln = listNext(&li))) {
            client *slave = ln->value;
            if (slave->replstate == SLAVE_STATE_WAIT_BGSAVE_START) {
                idle = server.unixtime - slave->lastinteraction;
                if (idle > max_idle) max_idle = idle;
                slaves_waiting++;
                mincapa = (mincapa == -1) ? slave->slave_capa :
                                            (mincapa & slave->slave_capa);
            }
        }

        if (slaves_waiting &&
            (!server.repl_diskless_sync ||
             max_idle > server.repl_diskless_sync_delay)) {
            startBgsaveForReplication(mincapa);
        }
    }
    ...
}

int startBgsaveForReplication(int mincapa) {
    ...
    if (rsiptr) {
        if (socket_target)
            retval = rdbSaveToSlavesSockets(rsiptr);
        else
            retval = rdbSaveBackground(server.rdb_filename,rsiptr);
    }
    ...
}
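
The decision made in replicationCron can be exercised on plain data. The sketch below keeps the same variables (max_idle, slaves_waiting, mincapa) but uses a hypothetical struct replica and function should_start_bgsave in place of redis's client list. The enum values are illustrative and are not the SLAVE_STATE_* constants.

```c
#include <stddef.h>
#include <time.h>

/* Illustrative states; the values differ from SLAVE_STATE_*. */
enum repl_state { WAIT_BGSAVE_START, WAIT_BGSAVE_END, SEND_BULK, ONLINE };

struct replica {
    enum repl_state state;
    time_t lastinteraction; /* Unix time of the last interaction */
    int capa;               /* capability bit flags of this replica */
};

/* Returns 1 if a background save for replication should start now.
 * *mincapa_out receives the capabilities shared by all waiting replicas
 * (bitwise AND), or -1 when no replica is waiting. */
int should_start_bgsave(const struct replica *replicas, size_t n, time_t now,
                        int diskless_sync, time_t diskless_sync_delay,
                        int *mincapa_out) {
    time_t max_idle = 0;
    int waiting = 0;
    int mincapa = -1;

    for (size_t i = 0; i < n; i++) {
        if (replicas[i].state != WAIT_BGSAVE_START) continue;
        time_t idle = now - replicas[i].lastinteraction;
        if (idle > max_idle) max_idle = idle;
        waiting++;
        mincapa = (mincapa == -1) ? replicas[i].capa
                                  : (mincapa & replicas[i].capa);
    }
    if (mincapa_out != NULL) *mincapa_out = mincapa;

    /* Disk-backed sync starts at once; diskless sync waits until the
     * longest-waiting replica has been idle longer than the delay. */
    return waiting && (!diskless_sync || max_idle > diskless_sync_delay);
}
```

The delay only applies to diskless sync because a socket transfer cannot be joined once it has started, whereas an rdb file on disk can be reused for a replica that arrives later.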

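Summary

rdb is one of redis's persistence mechanisms, and it stores the data as compressed binary.

  • Advantages: saving is fast and the file is small, which makes it convenient for getting data onto disk quickly or for sending it over the network.

  • Disadvantages:

    1. redis treats the rdb file purely as a backup. Its function is simple, and the file cannot be queried for data.
    2. Backups are usually driven by a timer rather than done in real time, so an abnormal exit can lose a fair amount of data. For a system used as a primary database, that is clearly unacceptable.

There is a lot to cover about rdb, too much for one chapter, so it is split into two. This chapter covers the use cases. For the file format, see the next chapter, "rdb persistence - file structure".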

