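First, let's agree on one thing: when we say Redis is single-threaded, we mean that fetching a request (socket read), parsing it, executing it, and returning the result (socket write) are all handled sequentially by one main thread. That main thread is the "single thread" people usually refer to. Other housekeeping, such as cleaning up dirty data, releasing unused connections, and the memory freeing behind LRU eviction, can be handled by other threads, so Redis before version 6 was in fact already multithreaded.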

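Why do all of these operations run on the same main thread? The official explanation:

The bottleneck is usually not the CPU but memory and network I/O;
Multithreading introduces thread-safety problems;
Multithreading can cost performance through context switching, locking and unlocking, and even deadlocks;
A single thread reduces the complexity of Redis's internal implementation;
Lazy rehashing of hashes and non-thread-safe commands such as LPUSH can run without locks;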

What Is I/O Multithreading

If a single thread is as good as described above, why did Redis 6.0 introduce multithreading?

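Redis abstracts an AE event model that merges I/O events and time events, and it relies on the callback style of I/O multiplexing (epoll on Linux) so that waiting for I/O never blocks, which gives it high-performance network handling. Combined with Redis's in-memory data processing, this is the core reason it is "single-threaded, yet fast".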

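However, the actual reading and writing of I/O data still blocks the thread, and this is one of Redis's main performance bottlenecks today, especially when data throughput is very high. The details are as follows:

In the lower half of the figure above, when a socket has data, Redis uses a system call to copy it from kernel space to user space so that it can be parsed. This copy is blocking, which is known as "synchronous I/O": the more data there is, the longer the copy takes, and, worse, all of it runs on a single thread. (The same applies when writing the response.)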

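This is one of Redis's current bottlenecks, and the "multithreading" introduced in Redis 6.0 is an optimization aimed at it. The core idea is to take the I/O read and write work off the main thread and hand it to a group of separate threads, so that reads and writes on multiple sockets can proceed in parallel.

This differs somewhat from Memcached, which is multithreaded all the way from I/O handling to data access. Redis's I/O threads only handle network reads and writes and protocol parsing; command execution is still single-threaded. The design avoids making Redis complex because of multithreading, which would require controlling concurrency around keys, Lua, transactions, LPUSH/LPOP, and so on.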

Enabling I/O Multithreading

I/O multithreading is disabled by default in Redis. We can enable it in the configuration file:

vim redis.conf

# Let the I/O threads handle reads and protocol parsing too (by default they only handle writes)
io-threads-do-reads yes

# Number of I/O threads; a value of 1 means only the main thread is used.
io-threads 4

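Official recommendation: enable I/O threads only on machines with at least 4 cores, and only if you have actually hit a performance bottleneck. Configure fewer threads than the machine has cores: 2 or 3 threads on a 4-core machine, 6 threads on an 8-core machine. More threads are not always better, and going beyond 8 threads brings little benefit.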
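The guidance above can be written down as a small rule of thumb. The sketch below is only an illustration: `suggest_io_threads` is a hypothetical helper, not part of Redis, and the values it returns for core counts other than 4 and 8 are my own interpolation of the quoted advice.

```c
#include <assert.h>

/* Hypothetical helper (not part of Redis): suggest an io-threads value
 * from the number of CPU cores, following the guidance quoted above.
 * Only the 4-core and 8-core cases come from that guidance; everything
 * else is an assumed interpolation. */
static int suggest_io_threads(int cores) {
    assert(cores > 0);
    if (cores < 4) return 1;            /* below 4 cores: main thread only */
    int n = (cores >= 8) ? cores - 2    /* 8 cores -> 6 threads */
                         : cores - 1;   /* 4 cores -> 3 threads */
    return (n > 8) ? 8 : n;             /* more than 8 threads rarely helps */
}
```

Treat the result as a starting point for benchmarking, not as a setting to apply blindly.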

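Performance Comparison

With limited resources, I compared 3 threads against a single thread on the modest machine described below:

Configuration: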

[root@BD-T-uatredis9 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:            15G        1.0G         13G         64M        1.2G         14G
Swap:          4.0G          0B        4.0G
[root@BD-T-uatredis9 ~]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             4
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E7-4809 v4 @ 2.10GHz
Stepping:              1
CPU MHz:               2094.952
BogoMIPS:              4189.90
Hypervisor vendor:     VMware
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              20480K
NUMA node0 CPU(s):     0-3

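Test commands:

redis-benchmark is used for the load test. On a 4-core, 4-thread machine, it compares the SET and GET QPS of 3 threads against a single thread, with 1,000,000 requests, payload sizes of 128, 512, and 1024 bytes, and 200 clients. (The --threads option sets how many threads redis-benchmark itself uses.)

# Three threads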
./redis-benchmark -h localhost -p 6380 --user default -a wyk123456 -t set,get -n 1000000 -r 1000000 --threads 3 -d 128 -c 200 -q

./redis-benchmark -h localhost -p 6380 --user default -a wyk123456 -t set,get -n 1000000 -r 1000000 --threads 3 -d 512 -c 200 -q

./redis-benchmark -h localhost -p 6380 --user default -a wyk123456 -t set,get -n 1000000 -r 1000000 --threads 3 -d 1024 -c 200 -q

# Single thread
./redis-benchmark -h localhost -p 6381 --user default -a wyk123456 -t set,get -n 1000000 -r 1000000  -d 128 -c 200 -q

./redis-benchmark -h localhost -p 6381 --user default -a wyk123456 -t set,get -n 1000000 -r 1000000  -d 512 -c 200 -q

./redis-benchmark -h localhost -p 6381 --user default -a wyk123456 -t set,get -n 1000000 -r 1000000  -d 1024 -c 200 -q

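Results:

Perhaps because my machine is so weak, the QPS gain of 3 threads over a single thread was 120%~140%, while another user's test showed a 100% QPS gain with 4 threads.

The other user's test results:

Redis Server: Alibaba Cloud, Ubuntu 18.04, 8 CPUs at 2.5 GHz, 8 GB RAM, instance type ecs.ic5.2xlarge
Redis Benchmark Client: Alibaba Cloud, Ubuntu 18.04, 8 CPUs at 2.5 GHz, 8 GB RAM, instance type ecs.ic5.2xlarge

Note that these numbers are for verification and reference only and should not be used as production metrics:

This test used an early build of the unstable branch; the stable release may well perform better.

This test did not apply rigorous latency control or cover scenarios with different concurrency levels.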

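Source Code Walkthrough

As mentioned above, I/O multithreading only makes the network reads and writes multithreaded. The flow is as follows:

Flow:

The main thread collects sockets into a pending list
It assigns the sockets to the I/O threads (without waiting for the list to fill up)
The main thread blocks until the I/O threads have finished reading the sockets
The main thread executes the commands on a single thread (if a command has not been fully received, it waits for the next I/O round)
The main thread blocks until the I/O threads have finished writing the data back to the sockets (anything not written in one pass is written in the next)
The bindings are released and the pending queue is cleared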

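At any moment the I/O threads are either all reading sockets or all writing them; reads and writes are never mixed in the same pass;

I/O threads only read and write sockets and parse commands; they do not execute commands, which the main thread executes serially;

The number of I/O threads is configurable and defaults to 1;

The whole process above is lock-free: while the I/O threads are working, the main thread waits for all of them to finish, so no data race can occur.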
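The fan-out and fan-in described in the steps above can be sketched in a few dozen lines. This is a simplified model, not the Redis implementation: the `sk_` names, the fixed thread count, and the "double each integer" stand-in for socket I/O are all assumptions made for illustration, and the workers are created and joined per batch, whereas Redis keeps its I/O threads alive.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define SK_THREADS 3     /* hypothetical number of I/O workers */
#define SK_SLICE_MAX 64  /* hypothetical capacity of each worker's list */

static int sk_slice[SK_THREADS][SK_SLICE_MAX]; /* per-worker private work list */
static int sk_len[SK_THREADS];                 /* number of items in each list */
static atomic_int sk_pending[SK_THREADS];      /* non-zero: worker may start */
static atomic_int sk_stop;                     /* set to 1 to end the workers */

/* Worker: spin until work is published, process the private list, signal done. */
static void *sk_worker(void *arg) {
    long id = (long)arg;
    for (;;) {
        while (atomic_load(&sk_pending[id]) == 0)
            if (atomic_load(&sk_stop)) return NULL;
        for (int i = 0; i < sk_len[id]; i++)
            sk_slice[id][i] *= 2;          /* stand-in for a socket read or write */
        atomic_store(&sk_pending[id], 0);  /* tell the main thread we are done */
    }
}

/* Main-thread side: fan out round-robin, spin until all workers finish,
 * then consume the results serially. Returns the sum of the processed items. */
static long sk_run_batch(const int *items, int n) {
    assert(n >= 0 && n <= SK_THREADS * SK_SLICE_MAX);
    pthread_t tid[SK_THREADS];
    atomic_store(&sk_stop, 0);
    for (long t = 0; t < SK_THREADS; t++) {
        sk_len[t] = 0;
        atomic_store(&sk_pending[t], 0);
        pthread_create(&tid[t], NULL, sk_worker, (void *)t);
    }
    for (int i = 0; i < n; i++) {          /* round-robin distribution */
        int t = i % SK_THREADS;
        sk_slice[t][sk_len[t]++] = items[i];
    }
    for (int t = 0; t < SK_THREADS; t++)   /* publish the start condition */
        atomic_store(&sk_pending[t], sk_len[t]);
    for (;;) {                             /* busy-wait for all workers */
        int pending = 0;
        for (int t = 0; t < SK_THREADS; t++) pending += atomic_load(&sk_pending[t]);
        if (pending == 0) break;
    }
    long sum = 0;                          /* serial phase on the main thread */
    for (int t = 0; t < SK_THREADS; t++)
        for (int i = 0; i < sk_len[t]; i++) sum += sk_slice[t][i];
    atomic_store(&sk_stop, 1);
    for (int t = 0; t < SK_THREADS; t++) pthread_join(tid[t], NULL);
    return sum;
}
```

No mutex protects the work lists. Each worker touches its list only between the moment its pending counter becomes non-zero and the moment it resets the counter to zero, and the main thread touches the lists only outside that window, which is the same reasoning the text gives for the absence of data races.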

Source:

https://github.com/redis-io/redis/blob/6.0/src/networking.c

redis-server first calls initThreadedIO() to initialize the threads, which includes setting the number of threads from the server.io_threads_num configuration. Each I/O thread then runs the IOThreadMain() function:

/* networking.c: line 2666 */
void *IOThreadMain(void *myid) {
    /* The ID is the thread number (from 0 to server.iothreads_num-1), and is used by the thread to just manipulate a single sub-array of clients. */
    // Thread ID; as in an ordinary thread pool, each thread works on its own data through this ID
    long id = (unsigned long)myid;
    while(1) {
        /* Wait for start */
        // The wait here is unusual: instead of a plain sleep, whose duration could hurt performance if chosen badly, the thread spins; the downside is that frequent looping can keep the CPU busy
        for (int j = 0; j < 1000000; j++) {
            if (io_threads_pending[id] != 0) break;
        }
        /* Give the main thread a chance to stop this thread. */
        if (io_threads_pending[id] == 0) {
            pthread_mutex_lock(&io_threads_mutex[id]);
            pthread_mutex_unlock(&io_threads_mutex[id]);
            continue;
        }
        serverAssert(io_threads_pending[id] != 0);
        // Debug mode
        if (tio_debug) printf("[%ld] %d to handle\n", id, (int)listLength(io_threads_list[id]));
        /* Process: note that the main thread will never touch our list
         * before we drop the pending count to 0. */
        // Walk the list of clients that was assigned to this thread ID
        listIter li;
        listNode *ln;
        listRewind(io_threads_list[id],&li);
        while((ln = listNext(&li))) {
            client *c = listNodeValue(ln);
            // Check whether this pass is a read or a write
            if (io_threads_op == IO_THREADS_OP_WRITE) {
                writeToClient(c,0);
            } else if (io_threads_op == IO_THREADS_OP_READ) {
                // Note that readQueryFromClient is called a second time here; that is fine, because the CLIENT_PENDING_READ flag distinguishes the two calls
                readQueryFromClient(c->conn);
            } else {
                serverPanic("io_threads_op value is unknown");
            }
        }
        listEmpty(io_threads_list[id]);
        io_threads_pending[id] = 0;
        if (tio_debug) printf("[%ld] Done\n", id);
    }
}

handleClientsWithPendingReadsUsingThreads() distributes the pending tasks

/* networking.c: line 2871 */
/* When threaded I/O is also enabled for the reading + parsing side, the readable handler will just put normal clients into a queue of clients to process (instead of serving them synchronously). This function runs the queue using the I/O threads, and process them in order to accumulate the reads in the buffers, and also parse the first command available rendering it in the client structures. */
int handleClientsWithPendingReadsUsingThreads(void) {
    // Is threaded reading enabled?
    if (!io_threads_active || !server.io_threads_do_reads) return 0;
    int processed = listLength(server.clients_pending_read);
    if (processed == 0) return 0;
    if (tio_debug) printf("%d TOTAL READ pending clients\n", processed);
    /* Distribute the clients across N different lists. */
    // Distribute the pending clients round-robin (RR), that is, in turn by their position in the pending list
    listIter li;
    listNode *ln;
    listRewind(server.clients_pending_read,&li);
    int item_id = 0;
    while((ln = listNext(&li))) {
        client *c = listNodeValue(ln);
        int target_id = item_id % server.io_threads_num;
        listAddNodeTail(io_threads_list[target_id],c);
        item_id++;
    }
    
    /* Give the start condition to the waiting threads, by setting the start condition atomic var. */
    // Set each thread's pending task count
    io_threads_op = IO_THREADS_OP_READ;
    for (int j = 0; j < server.io_threads_num; j++) {
        int count = listLength(io_threads_list[j]);
        io_threads_pending[j] = count;
    }
    /* Wait for all threads to end their work. */
    // Busy-wait until every thread has finished its tasks
    while(1) {
        unsigned long pending = 0;
        for (int j = 0; j < server.io_threads_num; j++)
            pending += io_threads_pending[j];
        if (pending == 0) break;
    }
    if (tio_debug) printf("I/O READ All threads finshed\n");
    /* Run the list of clients again to process the new buffers. */
    // Walk the list again on the main thread and execute the commands that were parsed
    listRewind(server.clients_pending_read,&li);
    while((ln = listNext(&li))) {
        client *c = listNodeValue(ln);
        c->flags &= ~CLIENT_PENDING_READ;
        if (c->flags & CLIENT_PENDING_COMMAND) {
            c->flags &= ~ CLIENT_PENDING_COMMAND;
            processCommandAndResetClient(c);
        }
        processInputBufferAndReplicate(c);
    }
    listEmpty(server.clients_pending_read);
    return processed;
}
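The round-robin step in the function above (`item_id % server.io_threads_num`) spreads clients evenly, with the lower-numbered lists receiving the remainder. A minimal sketch of just that arithmetic, using a hypothetical helper `rr_counts` that is not part of Redis:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count how many of n_clients land on each of the
 * n_threads lists when client i is appended to list i % n_threads. */
static void rr_counts(size_t n_clients, size_t n_threads, size_t *counts) {
    assert(n_threads > 0 && counts != NULL);
    for (size_t t = 0; t < n_threads; t++) counts[t] = 0;
    for (size_t item_id = 0; item_id < n_clients; item_id++)
        counts[item_id % n_threads]++;
}
```

With 10 pending clients and 4 threads, for example, the lists receive 3, 3, 2, and 2 clients. The assignment depends only on position in the pending list, not on how much data each client has to read, so the per-thread load can still be uneven in bytes.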

The readQueryFromClient() function

/* networking.c: line 1791 */
void readQueryFromClient(connection *conn) {
    client *c = connGetPrivateData(conn);
    int nread, readlen;
    size_t qblen;
    /* Check if we want to read from the client later when exiting from the event loop. This is the case if threaded I/O is enabled. */
    // If the threaded model is enabled, defer the read and return
    if (postponeClientRead(c)) return;
    // If the threaded model is not enabled, fall through to the read logic below
    // .... the pre-existing logic follows
}

postponeClientRead() puts the task on the processing queue, where it is then handled by the logic of IOThreadMain() and handleClientsWithPendingReadsUsingThreads() shown above:

/* networking.c: line 2852 */
int postponeClientRead(client *c) {
    // If the threaded model is active and the global configuration enables threaded reads
    if (io_threads_active &&
        server.io_threads_do_reads &&
        // Note: master/replica replication traffic could otherwise be treated as an ordinary read task, so those clients are excluded by their flags
        !(c->flags & (CLIENT_MASTER|CLIENT_SLAVE|CLIENT_PENDING_READ)))
    {
        c->flags |= CLIENT_PENDING_READ;
        // Put the client on the pending-read queue
        listAddNodeHead(server.clients_pending_read,c);
        return 1;
    } else {
        return 0;
    }
}
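The flag test in postponeClientRead() is also what makes the second call to readQueryFromClient() safe: once CLIENT_PENDING_READ is set, the same check returns 0 and the read proceeds. The sketch below isolates that logic. The `SK_` constants use made-up bit positions (the real values are defined in Redis's server.h), and `sk_postpone_read` is a hypothetical stand-in that works on a bare flags word instead of a client structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit positions only; the real values are defined in Redis's server.h. */
#define SK_CLIENT_SLAVE        (1ULL << 0)
#define SK_CLIENT_MASTER       (1ULL << 1)
#define SK_CLIENT_PENDING_READ (1ULL << 2)

/* Returns 1 and marks the client when the read should be deferred to the
 * I/O threads; returns 0 when the caller should read synchronously. */
static int sk_postpone_read(int threads_active, int do_reads, uint64_t *flags) {
    assert(flags != NULL);
    uint64_t excluded = SK_CLIENT_MASTER | SK_CLIENT_SLAVE | SK_CLIENT_PENDING_READ;
    if (threads_active && do_reads && !(*flags & excluded)) {
        *flags |= SK_CLIENT_PENDING_READ;  /* remember that the read was queued */
        return 1;
    }
    return 0;
}
```

Calling it twice on the same flags word returns 1 and then 0, which mirrors the event-loop call that queues the client and the later I/O-thread call that performs the read.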

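Comparison with Memcached

A few years ago Memcached was the common caching solution at major internet companies, so the difference between Redis and Memcached became a standard interview question on caching. In recent years Memcached has been used less, and most deployments are Redis. With Redis 6.0 adding multithreading, though, similar questions may come up again, so let's briefly compare the two, looking only at their threading models.

First, let's look at Memcached's threading model:

As the figure above shows, the Memcached server works in a master-worker model and communicates with clients over sockets. The main thread and the worker threads communicate through pipes. The main thread uses libevent to watch for read events on the listening socket and accepts connections; once an event fires, it wraps the connection information in a data structure, picks a suitable worker thread according to its algorithm, and hands off the connection together with that information. The worker thread then uses the connection descriptor to talk to the client over the socket and carries out the subsequent data reads and writes.

Redis 6.0 vs. Memcached threading models:
Similarity: both use a master thread plus worker threads.
Difference: Memcached also runs the main logic in the worker threads, which makes the model simpler and gives true thread isolation, matching the usual understanding of thread isolation. Redis hands the processing logic back to the master thread, which adds some complexity to the model but also solves problems such as concurrent thread safety.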

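Closing Thoughts

Everyone compares Redis with Memcached, but Redis is not Memcached: its multithreading is only Memcached-like, not the fully isolated multithreaded model that Memcached uses. Because of Lua scripts, transactions, LPUSH, and other complexities, Redis has many issues to consider. Either way, the I/O multithreading in the latest release, Redis 6, is a pleasant surprise, and the big internet companies will probably roll it out soon!

References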

https://ruby-china.org/topics/38957

http://www.web-lovers.com/redis-source-6-rc-mult-thread.html

https://zhuanlan.zhihu.com/p/76788470

http://calixwu.com/2014/11/memcached-yuanmafenxi-xianchengmoxing.html


