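Linux Network Programming: Writing Your Own High-Performance HTTP Server Framework (Part 1)

Before writing a high-performance HTTP server, we first build a high-performance network programming framework that supports TCP. Adding HTTP support on top of it is then relatively easy.

GitHub: https://github.com/froghui/yolanda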

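Requirements

The high-performance TCP network framework has to meet the following three requirements:

First, it adopts the reactor model and can flexibly use either poll or epoll as the event dispatching implementation.

Second, it must support multiple threads, so that it can run in single-threaded single-reactor mode as well as in multi-threaded main/sub-reactor mode, where I/O events on sockets are spread across several threads.

Third, read and write operations are encapsulated in a buffer object.

These three requirements map neatly onto three parts of the overall design, which we cover in turn: the reactor pattern design, the I/O model and multi-threading model design, and the buffer with its data read/write encapsulation.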

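Reactor pattern design

The goal is to design a reactor framework based on event dispatching and callbacks. The main objects in this framework are:

event_loop

You can think of event_loop as an infinite event loop bound to a single thread; the event_loop abstraction shows up in many languages. What does that mean? Put simply, it is an event dispatcher that loops forever: whenever an event occurs, it invokes the callback function registered in advance to handle that event.

Concretely, event_loop uses poll or epoll to block a thread while it waits for I/O events to occur.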

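channel

Every object registered on an event_loop is abstracted as a channel, for example the listening event or the read/write events of a socket. Many language APIs have a channel object too, and broadly speaking they express the same idea as our design.

acceptor

The acceptor object represents the server-side listener. It ends up registered on the event_loop as a channel object, so that connection-established events can be detected and dispatched.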

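event_dispatcher

event_dispatcher is an abstraction of the event dispatching mechanism: we can implement a poll-based poll_dispatcher as well as an epoll-based epoll_dispatcher. Here we define a single event_dispatcher struct to abstract these behaviors.

channel_map

channel_map stores the mapping from descriptors to channels. When an event occurs, the socket descriptor it occurred on is enough to quickly locate the event handlers held by the corresponding channel object.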

I/O model and multi-threading model design

This part addresses which thread runs an event_loop, and which thread performs event dispatching and callbacks.

thread_pool

struct thread_pool {
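    // event loop of the main thread that created the thread_pool
    struct event_loop *mainLoop;
    // whether the pool has been started
    int started;
    // number of threads
    int thread_number;
    // pointer to the array of event_loop_thread objects created by the pool
    struct event_loop_thread *eventLoopThreads;
    // position in the array, used to decide which event_loop_thread serves next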
    int position;

};

struct thread_pool *thread_pool_new(struct event_loop *mainLoop, int threadNumber);
void thread_pool_start(struct thread_pool *);
struct event_loop *thread_pool_get_loop(struct thread_pool *);

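thread_pool maintains a list of sub-reactor threads for the main reactor thread to use. Each time a new connection is established, the main reactor obtains a thread from thread_pool and uses it to register the read/write events of the new connection's socket, which separates the I/O threads from the main reactor thread.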
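The selection step described above can be sketched as follows. This is a minimal illustration, assuming a round-robin policy driven by the position field; the article does not show the body of thread_pool_get_loop, and the structs below are reduced stand-ins that keep only the fields the sketch needs.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the article's structures (only the fields used here). */
struct event_loop { const char *thread_name; };
struct event_loop_thread { struct event_loop *eventLoop; };

struct thread_pool {
    struct event_loop *mainLoop;
    int started;
    int thread_number;
    struct event_loop_thread *eventLoopThreads;
    int position;
};

/* Pick the loop that should own the next connection (assumed round-robin policy). */
struct event_loop *thread_pool_get_loop(struct thread_pool *pool) {
    assert(pool != NULL && pool->started);

    /* With no worker threads, the main reactor handles the I/O itself. */
    struct event_loop *selected = pool->mainLoop;
    if (pool->thread_number > 0) {
        selected = pool->eventLoopThreads[pool->position].eventLoop;
        /* Advance and wrap around so connections are spread evenly. */
        pool->position = (pool->position + 1) % pool->thread_number;
    }
    return selected;
}
```

With two sub-reactor threads, successive calls would return the first loop, the second loop, and then the first again. Falling back to mainLoop when thread_number is 0 is what lets the same code serve the single-reactor mode.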

event_loop_thread

struct event_loop_thread {
    struct event_loop *eventLoop;
    pthread_t thread_tid;        /* thread ID */
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    char * thread_name;
    long thread_count;    /* # connections handled */
};

// initialize an event_loop_thread whose memory has already been allocated
int event_loop_thread_init(struct event_loop_thread *, int);
// called by the main thread: start a sub-thread and have it run its event_loop
struct event_loop *event_loop_thread_start(struct event_loop_thread *);

event_loop_thread is the thread implementation of a reactor; read/write event detection for connected sockets is done in this thread.
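The mutex and cond members suggest a start-up handshake between the main thread and the new sub-thread. The sketch below shows one way that handshake could work; it is an assumption, not the framework's actual code. The event_loop stand-in, the thread body event_loop_thread_run, and the meaning of the int parameter (taken here to be a thread index) are all hypothetical, and allocation failure is not handled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct event_loop { int quit; };   /* stand-in for the real event_loop */

struct event_loop_thread {
    struct event_loop *eventLoop;
    pthread_t thread_tid;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
};

/* Thread body (hypothetical name): create the loop and publish it. */
static void *event_loop_thread_run(void *arg) {
    struct event_loop_thread *t = arg;

    pthread_mutex_lock(&t->mutex);
    t->eventLoop = calloc(1, sizeof(struct event_loop));
    pthread_cond_signal(&t->cond);      /* tell the starter the loop exists */
    pthread_mutex_unlock(&t->mutex);

    /* A real implementation would call event_loop_run(t->eventLoop) here. */
    return NULL;
}

int event_loop_thread_init(struct event_loop_thread *t, int index) {
    (void) index;                       /* assumed to be a thread index; unused here */
    t->eventLoop = NULL;
    pthread_mutex_init(&t->mutex, NULL);
    pthread_cond_init(&t->cond, NULL);
    return 0;
}

/* Called by the main thread; returns only after the sub-thread has its loop. */
struct event_loop *event_loop_thread_start(struct event_loop_thread *t) {
    if (pthread_create(&t->thread_tid, NULL, event_loop_thread_run, t) != 0)
        return NULL;

    pthread_mutex_lock(&t->mutex);
    while (t->eventLoop == NULL)        /* the loop guards against spurious wakeups */
        pthread_cond_wait(&t->cond, &t->mutex);
    struct event_loop *loop = t->eventLoop;
    pthread_mutex_unlock(&t->mutex);
    return loop;
}
```

Waiting on the condition variable means the main thread never hands out a loop pointer that the sub-thread has not finished creating.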

Buffer and data read/write design

buffer

#define INIT_BUFFER_SIZE 65536
// data buffer
struct buffer {
    char *data;          // underlying storage
    int readIndex;       // read position
    int writeIndex;      // write position
    int total_size;      // total capacity
};

struct buffer *buffer_new();
void buffer_free(struct buffer *buffer);
int buffer_writeable_size(struct buffer *buffer);
int buffer_readable_size(struct buffer *buffer);
int buffer_front_spare_size(struct buffer *buffer);

// append size bytes to the buffer
int buffer_append(struct buffer *buffer, void *data, int size);
// append a single character to the buffer
int buffer_append_char(struct buffer *buffer, char data);
// append a string to the buffer
int buffer_append_string(struct buffer*buffer, char * data);
// read data from the socket and write it into the buffer
int buffer_socket_read(struct buffer *buffer, int fd);
// read one character from the buffer
char buffer_read_char(struct buffer *buffer);
// search the buffer data for CRLF
char * buffer_find_CRLF(struct buffer * buffer);

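The buffer object hides the read and write operations performed on a socket. Without it, the read/write event handlers of a connected socket would have to deal with the raw byte stream directly, which is clearly unfriendly. So we provide a basic buffer object to hold both the data received from a connected socket and the data the application is about to send.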
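To make the role of readIndex and writeIndex concrete, here is a minimal sketch of a few of the declared functions. The bodies are assumptions that follow from the field names: readable data lies between readIndex and writeIndex, and free space lies after writeIndex. The helper make_room and the return convention (0 on success, -1 on failure) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INIT_BUFFER_SIZE 65536

struct buffer { char *data; int readIndex; int writeIndex; int total_size; };

struct buffer *buffer_new(void) {
    struct buffer *b = malloc(sizeof(*b));
    if (b == NULL) return NULL;
    b->data = malloc(INIT_BUFFER_SIZE);
    if (b->data == NULL) { free(b); return NULL; }
    b->readIndex = b->writeIndex = 0;
    b->total_size = INIT_BUFFER_SIZE;
    return b;
}

void buffer_free(struct buffer *b) { if (b) { free(b->data); free(b); } }

int buffer_readable_size(struct buffer *b)    { return b->writeIndex - b->readIndex; }
int buffer_writeable_size(struct buffer *b)   { return b->total_size - b->writeIndex; }
int buffer_front_spare_size(struct buffer *b) { return b->readIndex; }

/* Hypothetical helper: make `size` bytes writable by compacting, else growing. */
static int make_room(struct buffer *b, int size) {
    if (buffer_writeable_size(b) >= size) return 0;
    int readable = buffer_readable_size(b);
    if (buffer_front_spare_size(b) + buffer_writeable_size(b) >= size) {
        /* Slide unread data to the front to reclaim already-consumed space. */
        memmove(b->data, b->data + b->readIndex, readable);
        b->readIndex = 0;
        b->writeIndex = readable;
        return 0;
    }
    char *grown = realloc(b->data, (size_t) b->total_size + size);
    if (grown == NULL) return -1;
    b->data = grown;
    b->total_size += size;
    return 0;
}

int buffer_append(struct buffer *b, void *data, int size) {
    if (data == NULL || size <= 0) return 0;
    if (make_room(b, size) != 0) return -1;
    memcpy(b->data + b->writeIndex, data, size);
    b->writeIndex += size;
    return 0;
}

/* Caller must check buffer_readable_size() > 0 first. */
char buffer_read_char(struct buffer *b) { return b->data[b->readIndex++]; }

/* Return a pointer to the first "\r\n" in the readable region, or NULL. */
char *buffer_find_CRLF(struct buffer *b) {
    for (int i = b->readIndex; i + 1 < b->writeIndex; i++)
        if (b->data[i] == '\r' && b->data[i + 1] == '\n')
            return b->data + i;
    return NULL;
}
```

Compacting before growing keeps memory use flat for the common case where the application consumes data about as fast as it arrives.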

tcp_connection

struct tcp_connection {
    struct event_loop *eventLoop;
    struct channel *channel;
    char *name;
    struct buffer *input_buffer;   // receive buffer
    struct buffer *output_buffer;  // send buffer

    connection_completed_call_back connectionCompletedCallBack;
    message_call_back messageCallBack;
    write_completed_call_back writeCompletedCallBack;
    connection_closed_call_back connectionClosedCallBack;

    void * data; //for callback use: http_server
    void * request; // for callback use
    void * response; // for callback use
};

struct tcp_connection *
tcp_connection_new(int fd, struct event_loop *eventLoop, connection_completed_call_back 
    connectionCompletedCallBack, connection_closed_call_back connectionClosedCallBack,
    message_call_back messageCallBack, write_completed_call_back writeCompletedCallBack);

// entry point called by the application layer
int tcp_connection_send_data(struct tcp_connection *tcpConnection, void *data, int size);
// entry point called by the application layer
int tcp_connection_send_buffer(struct tcp_connection *tcpConnection, struct buffer * buffer);
void tcp_connection_shutdown(struct tcp_connection * tcpConnection);

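The tcp_connection object describes an established TCP connection. Its members include the receive buffer, the send buffer, the channel object and so on, all of which are natural attributes of a TCP connection. tcp_connection is the data structure that most applications use to interact with our framework. We do not want to expose the low-level channel object to applications, because the abstract channel is not only used for tcp_connection: the listening socket mentioned earlier is a channel, and so is the wake-up socketpair mentioned later. That is why we designed the tcp_connection object, in the hope of giving users a clear programming entry point.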

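Reactor pattern in detail

Detailed view of how event_loop runs:

Once event_loop_run starts, the thread enters the loop. It first calls dispatch to wait for and dispatch events. As soon as an event occurs, channel_event_activate is called, and inside it the event callbacks eventReadCallback and eventWriteCallback are invoked. Finally event_loop_handle_pending_channel runs to modify the set of events currently being monitored, after which the thread goes back into the dispatch loop.

event_loop analysis

It is no exaggeration to say that event_loop is the core of the whole reactor design. Let us look at its data structure first. The most important member is the event_dispatcher object. You can simply think of event_dispatcher as poll or epoll: it suspends our thread until an event occurs. There is a small trick here, event_dispatcher_data. It is declared as void *, so it can hold a pointer to whatever object we need, which lets each implementation, poll or epoll, store its own kind of data. event_loop also keeps several members related to multi-threading: owner_thread_id records the ID of the thread that owns the event loop, and mutex and cond are used for thread synchronization. socketPair is how the parent thread notifies a sub-thread that there are new events to handle. pending_head and pending_tail hold the new events waiting to be processed inside the sub-thread.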

struct event_loop {
    int quit;
    const struct event_dispatcher *eventDispatcher;

    /** data belonging to the event_dispatcher in use */
    void *event_dispatcher_data;
    struct channel_map *channelMap;

    int is_handle_pending;
    struct channel_element *pending_head;
    struct channel_element *pending_tail;

    pthread_t owner_thread_id;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int socketPair[2];
    char *thread_name;
};
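The socketPair notification can be illustrated with a small sketch. It assumes a one-byte wake-up protocol: another thread writes a byte to socketPair[0], and the loop's thread, which watches socketPair[1] for readability, drains it. The function names and the reduced event_loop struct are hypothetical; wakeup_pending exists only to demonstrate readiness.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Only the field this sketch needs from the article's event_loop. */
struct event_loop { int socketPair[2]; };

int event_loop_init_wakeup(struct event_loop *loop) {
    return socketpair(AF_UNIX, SOCK_STREAM, 0, loop->socketPair);
}

/* Called from another thread: write one byte so the loop's poll/epoll returns. */
int event_loop_wakeup(struct event_loop *loop) {
    char one = 'a';
    ssize_t n = write(loop->socketPair[0], &one, sizeof(one));
    return n == (ssize_t) sizeof(one) ? 0 : -1;
}

/* Read callback for the socketPair[1] channel: drain the wake-up byte. */
int handle_wakeup(void *data) {
    struct event_loop *loop = data;
    char one;
    ssize_t n = read(loop->socketPair[1], &one, sizeof(one));
    return n == (ssize_t) sizeof(one) ? 0 : -1;
}

/* Demo helper: report whether the wake-up end is readable within timeout_ms. */
int wakeup_pending(struct event_loop *loop, int timeout_ms) {
    struct pollfd pfd = { .fd = loop->socketPair[1], .events = POLLIN };
    return poll(&pfd, 1, timeout_ms) == 1 && (pfd.revents & POLLIN);
}
```

Because socketPair[1] is just another descriptor, it can be registered as an ordinary channel, so waking a blocked loop needs no special case in the dispatcher.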

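Next, let us look at event_loop_run, the most important method of event_loop. As mentioned earlier, event_loop is an infinite while loop that keeps dispatching events.

/**
 * 1. Validate the arguments
 * 2. Call the dispatcher to dispatch events; the event handlers are called back during dispatch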
 */
int event_loop_run(struct event_loop *eventLoop) {
    assert(eventLoop != NULL);

    const struct event_dispatcher *dispatcher = eventLoop->eventDispatcher;
    if (eventLoop->owner_thread_id != pthread_self()) {
        exit(1);
    }

    yolanda_msgx("event loop run, %s", eventLoop->thread_name);
    struct timeval timeval;
    timeval.tv_sec = 1;
    timeval.tv_usec = 0;

    while (!eventLoop->quit) {
        //block here to wait I/O event, and get active channels
        dispatcher->dispatch(eventLoop, &timeval);
        //handle the pending channel
        event_loop_handle_pending_channel(eventLoop);
    }

    yolanda_msgx("event loop end, %s", eventLoop->thread_name);
    return 0;
}

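The code reflects this directly: as long as the event_loop has not been told to quit, it keeps looping, and the loop body calls the dispatch method of the dispatcher object to wait for events.

event_dispatcher analysis

To support different event dispatching mechanisms, poll, epoll and the like are abstracted into an event_dispatcher struct. There are two concrete implementations: poll_dispatcher and epoll_dispatcher.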

/** Abstract event_dispatcher struct; implementations wrap I/O multiplexing calls such as select, poll and epoll. */
struct event_dispatcher {
    /** name of the implementation */
    const char *name;

    /** initialization function */
    void *(*init)(struct event_loop * eventLoop);

    /** tell the dispatcher to add a channel event */
    int (*add)(struct event_loop * eventLoop, struct channel * channel);

    /** tell the dispatcher to delete a channel event */
    int (*del)(struct event_loop * eventLoop, struct channel * channel);

    /** tell the dispatcher to update the events of a channel */
    int (*update)(struct event_loop * eventLoop, struct channel * channel);

    /** dispatch events, then call the event_loop's event_activate method (channel_event_activate) to run the callbacks */
    int (*dispatch)(struct event_loop * eventLoop, struct timeval *);

    /** release the dispatcher's data */
    void (*clear)(struct event_loop * eventLoop);
};
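As an illustration of how a concrete implementation could fill in this table, here is a reduced poll-based sketch. It is not the framework's poll_dispatcher: the interface is cut down to init, add and dispatch, add takes a bare descriptor instead of a channel, MAX_FDS is an arbitrary limit, and last_ready_fd is a hypothetical field that stands in for the call to channel_event_activate.

```c
#include <assert.h>
#include <poll.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

#define MAX_FDS 16   /* arbitrary limit for the sketch */

struct event_loop;

/* Reduced dispatcher interface: a subset of the article's function table. */
struct event_dispatcher {
    const char *name;
    void *(*init)(struct event_loop *);
    int (*add)(struct event_loop *, int fd);
    int (*dispatch)(struct event_loop *, struct timeval *);
};

struct event_loop {
    const struct event_dispatcher *eventDispatcher;
    void *event_dispatcher_data;   /* opaque to the loop, owned by the dispatcher */
    int last_ready_fd;             /* hypothetical: records the last active fd */
};

/* Private state of the poll implementation, stored behind the void pointer. */
struct poll_data { struct pollfd fds[MAX_FDS]; int count; };

static void *poll_init(struct event_loop *loop) {
    (void) loop;
    return calloc(1, sizeof(struct poll_data));
}

static int poll_add(struct event_loop *loop, int fd) {
    struct poll_data *d = loop->event_dispatcher_data;
    if (d->count == MAX_FDS) return -1;
    d->fds[d->count].fd = fd;
    d->fds[d->count].events = POLLIN;   /* this sketch watches readability only */
    d->count++;
    return 0;
}

static int poll_dispatch(struct event_loop *loop, struct timeval *tv) {
    struct poll_data *d = loop->event_dispatcher_data;
    int timeout_ms = (int) (tv->tv_sec * 1000 + tv->tv_usec / 1000);
    int ready = poll(d->fds, d->count, timeout_ms);
    for (int i = 0; ready > 0 && i < d->count; i++) {
        if (d->fds[i].revents & POLLIN)
            loop->last_ready_fd = d->fds[i].fd;  /* real code: channel_event_activate */
    }
    return ready;
}

const struct event_dispatcher poll_dispatcher = {
    "poll", poll_init, poll_add, poll_dispatch
};
```

The loop only ever calls through the function pointers and treats event_dispatcher_data as opaque, so an epoll-based table with its own private struct could be swapped in without touching the loop. A pipe is a convenient way to exercise the sketch: register the read end, write a byte to the write end, and dispatch.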

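channel object analysis

The channel object is the main struct used to interact with event_dispatcher, and it is the abstraction over event dispatching. One channel corresponds to one descriptor, which can carry a READ (readable) event, a WRITE (writable) event, or both. A channel is bound to two event handlers, of type event_read_callback and event_write_callback.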

typedef int (*event_read_callback)(void *data);
typedef int (*event_write_callback)(void *data);

struct channel {
    int fd;
    int events;   // event types of interest

    event_read_callback eventReadCallback;
    event_write_callback eventWriteCallback;
    void *data; // callback data: may be an event_loop, a tcp_server, or a tcp_connection
};
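A short sketch shows how a channel ties a descriptor to its callbacks and context pointer. The constructor channel_new, the callback count_reads, and the numeric values of EVENT_READ and EVENT_WRITE are assumptions made for illustration; the article does not list them.

```c
#include <assert.h>
#include <stdlib.h>

#define EVENT_READ  0x02   /* assumed flag values, not taken from the article */
#define EVENT_WRITE 0x04

typedef int (*event_read_callback)(void *data);
typedef int (*event_write_callback)(void *data);

struct channel {
    int fd;
    int events;
    event_read_callback eventReadCallback;
    event_write_callback eventWriteCallback;
    void *data;
};

/* Hypothetical constructor: bundle fd, interest set, callbacks and context. */
struct channel *channel_new(int fd, int events, event_read_callback on_read,
                            event_write_callback on_write, void *data) {
    struct channel *chan = malloc(sizeof(*chan));
    if (chan == NULL) return NULL;
    chan->fd = fd;
    chan->events = events;
    chan->eventReadCallback = on_read;
    chan->eventWriteCallback = on_write;
    chan->data = data;
    return chan;
}

/* Example read callback: `data` is whatever the owner registered, here a counter. */
int count_reads(void *data) {
    int *counter = data;
    (*counter)++;
    return 0;
}
```

Because the callbacks receive only the opaque data pointer, the same channel type can serve an acceptor, a connection, or the loop's own wake-up descriptor.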

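channel_map object analysis

After event_dispatcher obtains the list of active events, it needs to find the channel for each file descriptor so that it can invoke the channel's event_read_callback and event_write_callback handlers. The channel_map object is designed for this.

/**
 * channel mapping table; the key is the socket descriptor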
 */
struct channel_map {
    void **entries;

    /* The number of entries available in entries */
    int nentries;
};

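channel_map is essentially an array: the index is the descriptor and the element is the address of the channel object. For example, the channel for descriptor 3 can be obtained directly like this:

struct channel * channel = map->entries[3];

So when event_dispatcher needs to invoke the read or write function of a channel, it calls channel_event_activate. Its implementation follows: once the channel object has been found, the read or write function is called back according to the event type. Note that EVENT_READ and EVENT_WRITE are used here to abstract over all the read and write event types of poll and epoll.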

int channel_event_activate(struct event_loop *eventLoop, int fd, int revents) {
    struct channel_map *map = eventLoop->channelMap;
    yolanda_msgx("activate channel fd == %d, revents=%d, %s", fd, revents, eventLoop->thread_name);

    if (fd < 0)
        return 0;
    if (fd >= map->nentries)
        return (-1);

    struct channel *channel = map->entries[fd];
    assert(fd == channel->fd);
    if (revents & (EVENT_READ)) {
        if (channel->eventReadCallback) 
            channel->eventReadCallback(channel->data);
    }
    if (revents & (EVENT_WRITE)) {
        if (channel->eventWriteCallback) 
            channel->eventWriteCallback(channel->data);
    }
    return 0;
}
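The abstraction over poll's event bits could look like the sketch below. The flag values and both function names are assumptions, and treating POLLHUP and POLLERR as readable is a design choice of this sketch (it lets the read callback observe EOF or the error), not something the article states.

```c
#include <assert.h>
#include <poll.h>

#define EVENT_READ  0x02   /* assumed flag values, not taken from the article */
#define EVENT_WRITE 0x04

/* Translate poll(2) result bits into the framework's flags. */
int poll_revents_to_events(short revents) {
    int events = 0;
    /* Hang-up and error are reported as readable so the read path sees them. */
    if (revents & (POLLIN | POLLHUP | POLLERR))
        events |= EVENT_READ;
    if (revents & POLLOUT)
        events |= EVENT_WRITE;
    return events;
}

/* Translate the framework's interest flags into poll(2) request bits. */
short events_to_poll_events(int events) {
    short poll_events = 0;
    if (events & EVENT_READ)
        poll_events |= POLLIN;
    if (events & EVENT_WRITE)
        poll_events |= POLLOUT;
    return poll_events;
}
```

An epoll-based dispatcher would do the same translation with EPOLLIN and EPOLLOUT, so channel_event_activate never needs to know which mechanism produced the event.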

Adding, removing and updating channel events

How do we add a new channel event? The following functions add, remove and update channel events.

int event_loop_add_channel_event(struct event_loop *eventLoop, int fd, struct channel *channel1);
int event_loop_remove_channel_event(struct event_loop *eventLoop, int fd, struct channel *channel1);
int event_loop_update_channel_event(struct event_loop *eventLoop, int fd, struct channel *channel1);

These three functions are the entry points; the actual work is done by the following three functions:

int event_loop_handle_pending_add(struct event_loop *eventLoop, int fd, struct channel *channel);
int event_loop_handle_pending_remove(struct event_loop *eventLoop, int fd, struct channel *channel);
int event_loop_handle_pending_update(struct event_loop *eventLoop, int fd, struct channel *channel);
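The split between entry points and handle_pending functions implies a hand-off through the pending_head/pending_tail list. The sketch below shows one plausible shape for that queue. The channel_element layout, the type encoding (1 = add, 2 = remove, 3 = update), the names event_loop_pending_init and event_loop_enqueue, and the handled counter are all assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct channel { int fd; };   /* stand-in for the real channel */

/* One queued request; type: 1 = add, 2 = remove, 3 = update (assumed encoding). */
struct channel_element {
    int type;
    struct channel *channel;
    struct channel_element *next;
};

struct event_loop {
    pthread_mutex_t mutex;
    struct channel_element *pending_head;
    struct channel_element *pending_tail;
    int handled;               /* hypothetical counter, for demonstration only */
};

void event_loop_pending_init(struct event_loop *loop) {
    pthread_mutex_init(&loop->mutex, NULL);
    loop->pending_head = loop->pending_tail = NULL;
    loop->handled = 0;
}

/* Entry point: any thread may call this; it only enqueues under the lock. */
int event_loop_enqueue(struct event_loop *loop, struct channel *chan, int type) {
    struct channel_element *e = malloc(sizeof(*e));
    if (e == NULL) return -1;
    e->type = type;
    e->channel = chan;
    e->next = NULL;

    pthread_mutex_lock(&loop->mutex);
    if (loop->pending_head == NULL)
        loop->pending_head = e;
    else
        loop->pending_tail->next = e;
    loop->pending_tail = e;
    pthread_mutex_unlock(&loop->mutex);
    /* A real loop would now wake its owner thread if the caller is another thread. */
    return 0;
}

/* Runs in the loop's own thread after each dispatch: detach the list, then apply it. */
int event_loop_handle_pending_channel(struct event_loop *loop) {
    pthread_mutex_lock(&loop->mutex);
    struct channel_element *e = loop->pending_head;
    loop->pending_head = loop->pending_tail = NULL;
    pthread_mutex_unlock(&loop->mutex);

    while (e != NULL) {
        struct channel_element *next = e->next;
        /* Real code would call event_loop_handle_pending_add/remove/update here. */
        loop->handled++;
        free(e);
        e = next;
    }
    return 0;
}
```

Deferring the changes this way means the channel_map and the dispatcher's data are only ever touched by the thread that owns the loop, so they need no locking of their own.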

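Let us look at one of them. event_loop_handle_pending_add adds a new key-value pair to the channel_map of the current event_loop: the key is the file descriptor and the value is the address of the channel object. It then calls the add method of the event_dispatcher object to register the channel event. Note that this function always runs in the event loop's own I/O thread.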

// in the i/o thread
int event_loop_handle_pending_add(struct event_loop *eventLoop, int fd, struct channel *channel) {
    yolanda_msgx("add channel fd == %d, %s", fd, eventLoop->thread_name);
    struct channel_map *map = eventLoop->channelMap;
    if (fd < 0)
        return 0;
    if (fd >= map->nentries) {
        if (map_make_space(map, fd, sizeof(struct channel *)) == -1)
            return (-1);
    }

    // first registration for this fd: add it
    if ((map)->entries[fd] == NULL) {
        map->entries[fd] = channel;
        //add channel
        const struct event_dispatcher *eventDispatcher = eventLoop->eventDispatcher;
        eventDispatcher->add(eventLoop, channel);
        return 1;
    }
    return 0;
}
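The function above relies on map_make_space, whose body is not shown. A minimal sketch of what it might do follows, assuming the table grows by doubling (starting from an arbitrary 32 slots), that new cells are zeroed so unused descriptors read as NULL, and that msize equals sizeof(void *) as in the call above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct channel_map {
    void **entries;
    int nentries;
};

/* Grow `entries` so that index `slot` is valid; new cells are zeroed (NULL). */
int map_make_space(struct channel_map *map, int slot, int msize) {
    if (slot < map->nentries)
        return 0;                                       /* already large enough */

    int nentries = map->nentries ? map->nentries : 32;  /* 32 is an arbitrary start */
    while (nentries <= slot)
        nentries <<= 1;                                 /* double until it fits */

    void **grown = realloc(map->entries, (size_t) nentries * msize);
    if (grown == NULL)
        return -1;

    /* Clear only the newly added tail; existing entries keep their channels. */
    memset(&grown[map->nentries], 0, (size_t) (nentries - map->nentries) * msize);
    map->entries = grown;
    map->nentries = nentries;
    return 0;
}
```

Zeroing the new cells matters because event_loop_handle_pending_add uses a NULL entry to detect that a descriptor is being registered for the first time.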

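Summary

In this part we covered the main design ideas and basic data structures of a high-performance network programming framework, along with the concrete implementation of the reactor design.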
