nginx Source Code Analysis: The Hash Structure ngx_hash_t

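Contents

0. Preface

1. Hash structures

1.1 The ngx_hash_t structure

1.2 The ngx_hash_init_t structure

1.3 The ngx_hash_key_t structure

1.4 Logical structure of the hash

2. Hash operations

2.1 The NGX_HASH_ELT_SIZE macro

2.2 Hash functions

2.3 Hash initialization

2.4 Hash lookup

3. An example

3.1 Code

3.2 How to compile

3.3 Results

3.3.1 bucket_size = 64 bytes

3.3.2 bucket_size = 256 bytes

4. Summary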

 

0. Preface

 

This article continues the series on nginx data structures with the hash structure.

 

Hash implementation files: ./src/core/ngx_hash.h and ./src/core/ngx_hash.c. Here . denotes the nginx-1.0.4 source directory, which is /usr/src/nginx-1.0.4 in this article.

 

1. Hash structures

 

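nginx's hash structure is somewhat more complex than its list, array, and queue structures. The figure below shows the hash-related data structures, which are described one by one.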

 

1.1 The ngx_hash_t structure

 

nginx's hash table type is ngx_hash_t, and its element type is ngx_hash_elt_t. They are defined as follows.

typedef struct {               //hash element
    void             *value;   //the value of the <key,value> pair
    u_short           len;     //length of name
    u_char            name[1]; //the data being hashed (a string in nginx), i.e. the key of the <key,value> pair
} ngx_hash_elt_t;
 
typedef struct {               //hash table
    ngx_hash_elt_t  **buckets; //hash buckets (size of them)
    ngx_uint_t        size;    //number of hash buckets
 
} ngx_hash_t;

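Here sizeof(ngx_hash_t) = 8 and sizeof(ngx_hash_elt_t) = 8 on a 32-bit platform. The name field of ngx_hash_elt_t holds the key of a ngx_hash_key_t structure, as can be seen in ngx_hash_init(), which is analyzed later. This structure is used frequently when module configuration is parsed.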
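The sizes quoted above depend on the pointer width. As a minimal sketch, the stand-in type demo_elt_t below (a hypothetical name that mirrors ngx_hash_elt_t with standard C types only, so that it compiles without the nginx headers) reports where name begins and how large the declared structure is on whatever platform it is built for.

```c
#include <stddef.h>

/* Hypothetical stand-in mirroring ngx_hash_elt_t with standard C types. */
typedef struct {
    void           *value;    /* value of the <key,value> pair */
    unsigned short  len;      /* length of name */
    unsigned char   name[1];  /* first byte of the key; the rest follows in memory */
} demo_elt_t;

/* Offset of the key bytes inside an element: one pointer plus the 2-byte len. */
static size_t demo_name_offset(void)
{
    return offsetof(demo_elt_t, name);
}

/* Size of the structure as declared, including trailing padding. */
static size_t demo_elt_sizeof(void)
{
    return sizeof(demo_elt_t);
}
```

On a 32-bit build demo_name_offset() is 6 and demo_elt_sizeof() is 8, which agrees with the figure quoted above; on a typical 64-bit build they are 10 and 16.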

 

1.2 The ngx_hash_init_t structure

 

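nginx's hash initialization structure is ngx_hash_init_t. It bundles the data needed for initialization so that it can be passed as one argument to ngx_hash_init() or ngx_hash_wildcard_init(). These two functions are used mainly in the http modules, for example in ngx_http_server_names() (optimizing http server names), ngx_http_merge_types() (merging http types), and ngx_http_fastcgi_merge_loc_conf() (merging the FastCGI location configuration). These will be covered in later articles.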

 

The ngx_hash_init_t structure is shown below; sizeof(ngx_hash_init_t) = 28 on a 32-bit platform.

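typedef struct {                    //hash initialization structure
    ngx_hash_t       *hash;         //the hash table to be initialized
    ngx_hash_key_pt   key;          //pointer to the hash function
 
    ngx_uint_t        max_size;     //maximum number of buckets
    ngx_uint_t        bucket_size;  //space available to each bucket, in bytes
 
    char             *name;         //name of this hash table (used only in error log messages)
    ngx_pool_t       *pool;         //memory pool the hash table is allocated from
    ngx_pool_t       *temp_pool;    //memory pool for temporary data
} ngx_hash_init_t;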


1.3 The ngx_hash_key_t structure

 

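This structure holds the data to be hashed, that is, a key-value pair <key,value>. In practice, several key-value pairs are stored in an array of ngx_hash_key_t structures, which is passed as an argument to ngx_hash_init() or ngx_hash_wildcard_init(). It is defined as follows.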

typedef struct {                    //hash key structure
    ngx_str_t         key;          //the key, an nginx string
    ngx_uint_t        key_hash;     //hash value computed from key (by a hash function such as ngx_hash_key_lc())
    void             *value;        //the value paired with key, forming <key,value>
} ngx_hash_key_t;
 
typedef struct {                    //string structure
    size_t      len;                //string length
    u_char     *data;               //string content
} ngx_str_t;


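Here sizeof(ngx_hash_key_t) = 16 on a 32-bit platform. In typical use, the value pointer may point into static storage (for example a global array or a string constant) or into the heap (for example a dynamically allocated region that holds the value). See the example later in this article.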

 

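The ngx_table_elt_t and ngx_hash_keys_arrays_t structures contribute little to the hash structure itself. They are designed mainly for module configuration, referer validation, and similar purposes, and are used for example by the http core module, the map module, the referer module, and the SSI filter module. They are not discussed here and will be introduced in later articles.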

1.4 Logical structure of the hash

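ngx_hash_init_t refers to ngx_pool_t, so the logical diagram of the related structures below builds on the article "nginx-1.0.4 Source Code Analysis: The Memory Pool Structure ngx_pool_t and Memory Management". Note: the diagram is drawn in UML style.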


2. Hash operations

 

2.1 The NGX_HASH_ELT_SIZE macro

 

The NGX_HASH_ELT_SIZE macro computes the size of the ngx_hash_elt_t element needed for a given key. It is defined as follows.

//name is a pointer to a ngx_hash_key_t; the size is rounded up to a multiple of sizeof(void *) (4 bytes here)
#define NGX_HASH_ELT_SIZE(name)                                               \
    (sizeof(void *) + ngx_align((name)->key.len + 2, sizeof(void *)))


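On a 32-bit platform sizeof(void *) = 4. The leading sizeof(void *) is the value pointer of ngx_hash_elt_t, (name)->key.len is the length of the content stored in its name array, and the "+ 2" accounts for its len field (of type u_short).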

 

Therefore NGX_HASH_ELT_SIZE(name) = 4 + ngx_align((name)->key.len + 2, 4), where the second term is (name)->key.len + 2 rounded up to a multiple of 4 bytes.
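The arithmetic can be tried out with a small stand-alone sketch. The names demo_align and demo_elt_size are hypothetical, and the pointer size is passed in as a parameter (an assumption of this sketch, not something nginx does) so that the 32-bit figures can be reproduced on any host.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to a multiple of a; a must be a power of two (same idea as ngx_align). */
static size_t demo_align(size_t n, size_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (n + (a - 1)) & ~(a - 1);
}

/*
 * Bytes taken by one hash element for a key of key_len bytes:
 * one pointer for value, then the 2-byte len plus the key bytes,
 * padded to a multiple of the pointer size.
 * ptr_size is a parameter only so that 32-bit results can be checked anywhere.
 */
static size_t demo_elt_size(size_t key_len, size_t ptr_size)
{
    return ptr_size + demo_align(key_len + 2, ptr_size);
}
```

For example, with ptr_size = 4 the 13-byte key "www.baidu.com" needs 4 + 16 = 20 bytes, the 15-byte key "www.sina.com.cn" needs 4 + 20 = 24 bytes, and the 10-byte key "www.qq.com" needs 4 + 12 = 16 bytes.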

 

2.2 Hash functions

 

nginx-1.0.4 provides the following hash functions.

 

#define ngx_hash(key, c)   ((ngx_uint_t) key * 31 + c)  //hash macro
ngx_uint_t ngx_hash_key(u_char *data, size_t len);
ngx_uint_t ngx_hash_key_lc(u_char *data, size_t len);   //lc means lower case: the string is converted to lower case before hashing
ngx_uint_t ngx_hash_strlow(u_char *dst, u_char *src, size_t n);

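The hash functions are all simple. All three call the ngx_hash macro, which yields an unsigned integer (ngx_uint_t). The first function is described here; it is defined as follows.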

ngx_uint_t
ngx_hash_key(u_char *data, size_t len)
{
    ngx_uint_t  i, key;
 
    key = 0;
 
    for (i = 0; i < len; i++) {
        key = ngx_hash(key, data[i]);
    }
 
    return key;
}


The computation performed by ngx_hash_key can therefore be written as follows.

Key[0] = data[0]
Key[1] = data[0]*31 + data[1]
Key[2] = (data[0]*31 + data[1])*31 + data[2]
...
Key[len-1] = (...((data[0]*31 + data[1])*31 + data[2])*31 + ... + data[len-2])*31 + data[len-1]


Key[len-1] is the hash value of the data passed in. Because ngx_uint_t is unsigned, the arithmetic wraps around on overflow.
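The recurrence is easy to reproduce outside nginx. The sketch below is not the nginx code itself: the demo_ names are hypothetical, and it assumes a 32-bit ngx_uint_t (as on the platform used in this article) by working in uint32_t. The lower-case variant folds only ASCII letters.

```c
#include <stddef.h>
#include <stdint.h>

/* One step of the recurrence, mirroring the ngx_hash macro in 32-bit arithmetic. */
static uint32_t demo_hash_step(uint32_t key, unsigned char c)
{
    return key * 31u + c;
}

/* Hash of len bytes, mirroring ngx_hash_key(). */
static uint32_t demo_hash_key(const unsigned char *data, size_t len)
{
    uint32_t key = 0;
    size_t   i;

    for (i = 0; i < len; i++) {
        key = demo_hash_step(key, data[i]);
    }
    return key;
}

/* Same, but ASCII upper-case letters are folded to lower case first. */
static uint32_t demo_hash_key_lc(const unsigned char *data, size_t len)
{
    uint32_t key = 0;
    size_t   i;

    for (i = 0; i < len; i++) {
        unsigned char c = data[i];
        if (c >= 'A' && c <= 'Z') {
            c = (unsigned char) (c | 0x20);
        }
        key = demo_hash_step(key, c);
    }
    return key;
}
```

With these definitions, the hash of "abc" is (97*31 + 98)*31 + 99 = 96354, and the 13 bytes of "www.baidu.com" hash to 270263191, the key_hash value that appears in the sample output in section 3.3.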

 

2.3 Hash initialization

 

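Hash initialization is performed by ngx_hash_init(). Its names parameter is an array of ngx_hash_key_t structures, that is, an array of key-value pairs <key,value>, and nelts is the number of elements in that array. The ngx_hash_key_t array must therefore be prepared before the function is called. nginx's ngx_array_t structure can be used to build it, as the example later in this article shows.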

 

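The result of initialization is that the key-value pairs <key,value> stored in names are distributed by hashing into one or more hash buckets (buckets in the code). The hash function used is typically ngx_hash_key_lc. Each bucket holds a pointer to a ngx_hash_elt_t (a hash element pointer), which points to a mostly contiguous data region. That region stores the hashed key-value pairs <key',value'>, namely the <name,value> fields of ngx_hash_elt_t. Each such region can hold one or more pairs <key',value'>.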

 

A few questions need clarifying here.

 

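Question 1: Why "mostly contiguous"?

Answer: When NGX_HASH_ELT_SIZE computes the total length of a hash element, it rounds up to a multiple of sizeof(void *), which introduces padding. So when the key of a <key,value> pair in names is copied into the name[1] array of ngx_hash_elt_t, the space allocated for that element is not necessarily used up completely, and the region is only mostly contiguous. See also the structure figure later in this section and the example later in this article.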

 

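Question 2: Where are these mostly contiguous regions allocated from?

Answer: From the memory pool that the pool field of the function's first argument, a ngx_hash_init_t, points to.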

 

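Question 3: How does <key',value'> differ from <key,value>?

Answer: key holds only a pointer to the string data, whereas key' is a (lower-cased) copy of that data stored in name[1]. Both value and value' are pointers. As noted in section 1.3, the value pointer may point into static storage (for example a global array or a string constant) or into the heap (for example a dynamically allocated region that holds the value). See the example later in this article.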

 

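Question 4: How is the bucket for a given pair <key,value> chosen?

Answer: By the statement key = names[n].key_hash % size; in the code. The result key is the index (from 0 to size - 1) of the bucket that the pair goes into.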

 

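The function's code follows. The less obvious points are explained in the // comments added by the author; see also the hash structure figure in this section.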

 

//nelts is the number of elements actually stored in the names array
ngx_int_t
ngx_hash_init(ngx_hash_init_t *hinit, ngx_hash_key_t *names, ngx_uint_t nelts)
{
    u_char          *elts;
    size_t           len;
    u_short         *test;
    ngx_uint_t       i, n, key, size, start, bucket_size;
    ngx_hash_elt_t  *elt, **buckets;
 
    for (n = 0; n < nelts; n++) {  //check every element of names: is a bucket large enough to hold it?
        if (hinit->bucket_size < NGX_HASH_ELT_SIZE(&names[n]) + sizeof(void *))
        {   //if any single element does not fit into a bucket, give up
            ngx_log_error(NGX_LOG_EMERG, hinit->pool->log, 0,
                          "could not build the %s, you should "
                          "increase %s_bucket_size: %i",
                          hinit->name, hinit->name, hinit->bucket_size);
            return NGX_ERROR;
        }
    }
 
    //allocate 2*max_size bytes for the per-bucket length counters (not from the nginx memory pool, because test is only temporary)
    test = ngx_alloc(hinit->max_size * sizeof(u_short), hinit->pool->log);
    if (test == NULL) {
        return NGX_ERROR;
    }
 
    bucket_size = hinit->bucket_size - sizeof(void *); //keep room for the NULL terminator; sizeof(void *) is 4 here
 
    start = nelts / (bucket_size / (2 * sizeof(void *))); //lower bound for size: the smallest possible element takes 2 * sizeof(void *) bytes
    start = start ? start : 1;
 
    if (hinit->max_size > 10000 && hinit->max_size / nelts < 100) {
        start = hinit->max_size - 1000;
    }
 
    for (size = start; size < hinit->max_size; size++) {
 
        ngx_memzero(test, size * sizeof(u_short));
 
        //mark 1: check whether, with size buckets, every bucket can hold the elements hashed into it
        for (n = 0; n < nelts; n++) {
            if (names[n].key.data == NULL) {
                continue;
            }
 
            //compute key and add the element's size to the running total of bucket key, kept in test[key]
            key = names[n].key_hash % size; //if size == 1, key is always 0
            test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&names[n]));
 
            if (test[key] > (u_short) bucket_size) {//a bucket overflowed: retry with the next (larger) size
                goto next;
            }
        }
 
        goto found;
 
    next:
 
        continue;
    }
 
    //no suitable size was found: log the error and return
    ngx_log_error(NGX_LOG_EMERG, hinit->pool->log, 0,
                  "could not build the %s, you should increase "
                  "either %s_max_size: %i or %s_bucket_size: %i",
                  hinit->name, hinit->name, hinit->max_size,
                  hinit->name, hinit->bucket_size);
 
    ngx_free(test);
 
    return NGX_ERROR;
 
found:  //a suitable size was found
 
    for (i = 0; i < size; i++) {  //initialize the first size entries of test to sizeof(void *) (4 here)
        test[i] = sizeof(void *);
    }
 
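    /** mark 2: almost the same as the code at mark 1, but here the total length of the hash data
        is computed again (the check at mark 1 has already passed). Each test[i] now starts at 4,
        so the totals include room for one extra void pointer per bucket.
     */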
    for (n = 0; n < nelts; n++) {
        if (names[n].key.data == NULL) {
            continue;
        }
 
        //compute key and add the element's size to the running total of bucket key, kept in test[key]
        key = names[n].key_hash % size; //if size == 1, key is always 0
        test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&names[n]));
    }
 
     //compute the total length of the hash data
    len = 0;
 
    for (i = 0; i < size; i++) {
        if (test[i] == sizeof(void *)) {//test[i] still has its initial value 4, so bucket i is empty: skip it
            continue;
        }
 
        //round test[i] up to a multiple of ngx_cacheline_size (32 on the 32-bit platform used here)
        test[i] = (u_short) (ngx_align(test[i], ngx_cacheline_size));
 
        len += test[i];
    }
 
    if (hinit->hash == NULL) {//allocate the hash header and the buckets array (size pointers of type ngx_hash_elt_t *) from the pool
        hinit->hash = ngx_pcalloc(hinit->pool, sizeof(ngx_hash_wildcard_t)
            + size * sizeof(ngx_hash_elt_t *));
        if (hinit->hash == NULL) {
            ngx_free(test);
            return NGX_ERROR;
        }
 
        //compute the start address of buckets (right after the ngx_hash_wildcard_t structure)
        buckets = (ngx_hash_elt_t **)
            ((u_char *) hinit->hash + sizeof(ngx_hash_wildcard_t));
 
    } else {  //allocate only the buckets array (size pointers of type ngx_hash_elt_t *) from the pool
        buckets = ngx_pcalloc(hinit->pool, size * sizeof(ngx_hash_elt_t *));
        if (buckets == NULL) {
            ngx_free(test);
            return NGX_ERROR;
        }
    }
 
    //then allocate elts with len+ngx_cacheline_size bytes; the extra 32 bytes leave room for the 32-byte alignment below
    elts = ngx_palloc(hinit->pool, len + ngx_cacheline_size);
    if (elts == NULL) {
        ngx_free(test);
        return NGX_ERROR;
    }
 
     //align the elts address to ngx_cacheline_size (32 here)
    elts = ngx_align_ptr(elts, ngx_cacheline_size);
 
    for (i = 0; i < size; i++) {  //point each non-empty bucket at its region inside elts
        if (test[i] == sizeof(void *)) {
            continue;
        }
 
        buckets[i] = (ngx_hash_elt_t *) elts;
        elts += test[i];
 
    }
 
    for (i = 0; i < size; i++) {  //reset the test array to 0
        test[i] = 0;
    }
 
    for (n = 0; n < nelts; n++) { //store every key-value pair passed in into the hash table
        if (names[n].key.data == NULL) {
            continue;
        }
 
        //compute key, the index of the bucket for this element, and the element's position inside that bucket
        key = names[n].key_hash % size;
        elt = (ngx_hash_elt_t *) ((u_char *) buckets[key] + test[key]);
 
        //fill in the ngx_hash_elt_t structure
        elt->value = names[n].value;
        elt->len = (u_short) names[n].key.len;
 
        ngx_strlow(elt->name, names[n].key.data, names[n].key.len);
 
        //advance this bucket's write offset past the element just stored
        test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&names[n]));
    }
 
    for (i = 0; i < size; i++) {
        if (buckets[i] == NULL) {
            continue;
        }
 
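        //test[i] is now the total length of the elements in bucket i, so elt points just past the last one; a NULL value marks the end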
        elt = (ngx_hash_elt_t *) ((u_char *) buckets[i] + test[i]);
 
        elt->value = NULL;
    }
 
    ngx_free(test);  //free the temporary array
 
    hinit->hash->buckets = buckets;
    hinit->hash->size = size;
 
    return NGX_OK;
}


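The "hash data length" above means the length of a ngx_hash_elt_t structure after it has been filled in. The nelts elements are stored in the names array. After this function has initialized the hash, they are stored in the ngx_hash_elt_t data regions that the size hash buckets point to, and these regions together hold nelts hash elements. In other words, each entry of buckets holds the start address of a ngx_hash_elt_t data region, and the region starting at that address holds the hash elements assigned to that bucket. Each hash element ends with a string starting at name[0], which is the key of some element of names (the key of <key,value>), followed by a few bytes of alignment padding where needed.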
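The search for the number of buckets (the loop at mark 1) can be sketched in isolation. The code below is a simplified illustration, not the nginx implementation: all demo_ names are hypothetical, it leaves out the max_size > 10000 shortcut, and it takes the pointer size as a parameter so that the 32-bit run can be reproduced on any host. The key_hash values and key lengths are those printed by the sample run in section 3.3, written as unsigned 32-bit numbers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_N 7

/* key_hash values and key lengths from the sample run in section 3.3. */
static const uint32_t demo_hashes[DEMO_N] = {
    270263191u, 1528635686u, 3592077571u, 203430122u,
    3654580458u, 1313636595u, 1884209457u
};
static const size_t demo_lens[DEMO_N] = { 13, 15, 14, 10, 11, 12, 14 };

/* Same formula as NGX_HASH_ELT_SIZE, with an explicit pointer size. */
static size_t demo_elt_size(size_t key_len, size_t ptr_size)
{
    return ptr_size + ((key_len + 2 + ptr_size - 1) & ~(ptr_size - 1));
}

/* Smallest bucket count in [start, max_size) for which no bucket overflows; 0 if none. */
static size_t demo_find_size(const uint32_t *hashes, const size_t *lens, size_t n,
                             size_t bucket_size, size_t max_size, size_t ptr_size)
{
    size_t  usable, start, size, i, found = 0;
    size_t *fill;

    if (n == 0 || max_size == 0 || bucket_size < 3 * ptr_size) {
        return 0;
    }

    usable = bucket_size - ptr_size;           /* keep room for the NULL terminator */
    start = n / (usable / (2 * ptr_size));     /* lower bound on the bucket count */
    if (start == 0) {
        start = 1;
    }

    fill = malloc(max_size * sizeof(size_t));  /* bytes used so far in each bucket */
    if (fill == NULL) {
        return 0;
    }

    for (size = start; size < max_size && found == 0; size++) {
        int ok = 1;

        for (i = 0; i < size; i++) {
            fill[i] = 0;
        }
        for (i = 0; i < n; i++) {
            size_t b = hashes[i] % size;       /* bucket index for element i */
            fill[b] += demo_elt_size(lens[i], ptr_size);
            if (fill[b] > usable) {            /* overflow: try the next size */
                ok = 0;
                break;
            }
        }
        if (ok) {
            found = size;
        }
    }

    free(fill);
    return found;
}
```

Called with the data above, ptr_size = 4 and max_size = 1024, the function returns 3 for bucket_size = 64 and 1 for bucket_size = 256, the same bucket counts as in the two sample runs. With bucket_size = 64, size = 2 is rejected because four 20-byte elements (80 bytes) land in bucket 1, while only 60 bytes are usable.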

 

A typical physical layout of an initialized hash is shown below. See the example later in this article for details.


2.4 Hash lookup

 

Hash lookup is performed by ngx_hash_find(), whose code follows. The // comments were added by the author.

//look up the value for the given key, name and len in the hash table that hash points to
void *
ngx_hash_find(ngx_hash_t *hash, ngx_uint_t key, u_char *name, size_t len)
{
    ngx_uint_t       i;
    ngx_hash_elt_t  *elt;
 
    elt = hash->buckets[key % hash->size];//key selects the bucket, which stores the address of its elts region
 
    if (elt == NULL) {
        return NULL;
    }
 
    while (elt->value) {
        if (len != (size_t) elt->len) {  //compare the lengths first
            goto next;
        }
 
        for (i = 0; i < len; i++) {
            if (name[i] != elt->name[i]) {  //then compare the content of name byte by byte
                goto next;
            }
        }
 
        return elt->value;  //match: return the value field of this ngx_hash_elt_t
 
    next:
        //the offset is taken from the address of elt->name[0], so adding elt->len is enough; the result is then aligned to 4 bytes
        elt = (ngx_hash_elt_t *) ngx_align_ptr(&elt->name[0] + elt->len,
                                               sizeof(void *));
        continue;
    }
 
    return NULL;
}


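Lookup is simple. key directly determines the bucket, which stores the start address of its ngx_hash_elt_t data region. Each element there is checked first by length and then by the content of name, and on a match the value field of that ngx_hash_elt_t is the result.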
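To make the walk through a bucket concrete, here is a self-contained sketch that lays out two of the article's pairs in one bucket by hand and then searches it in the same way. Everything prefixed demo_ is hypothetical (demo_elt_t stands in for ngx_hash_elt_t), memory comes from calloc rather than a pool, and alignment uses the host's own sizeof(void *).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ngx_hash_elt_t, using standard C types only. */
typedef struct {
    void           *value;
    unsigned short  len;
    unsigned char   name[1];
} demo_elt_t;

#define DEMO_NAME_OFF offsetof(demo_elt_t, name)

/* Round an address or offset up to a multiple of sizeof(void *). */
static uintptr_t demo_align_ptr(uintptr_t p)
{
    return (p + (sizeof(void *) - 1)) & ~(uintptr_t) (sizeof(void *) - 1);
}

/* Write one element at byte offset off; returns the offset of the next element. */
static size_t demo_put(unsigned char *bucket, size_t off, const char *name, void *value)
{
    demo_elt_t *e = (demo_elt_t *) (bucket + off);
    size_t      len = strlen(name);

    e->value = value;
    e->len = (unsigned short) len;
    memcpy(bucket + off + DEMO_NAME_OFF, name, len);   /* key bytes, no trailing NUL */
    return (size_t) demo_align_ptr(off + DEMO_NAME_OFF + len);
}

/* Walk one bucket until the NULL value terminator, comparing length first, then bytes. */
static void *demo_find(unsigned char *bucket, const char *name, size_t len)
{
    demo_elt_t *e = (demo_elt_t *) bucket;

    while (e->value != NULL) {
        unsigned char *key = (unsigned char *) e + DEMO_NAME_OFF;

        if (len == e->len && memcmp(name, key, len) == 0) {
            return e->value;
        }
        /* next element: end of this key, rounded up to pointer alignment */
        e = (demo_elt_t *) demo_align_ptr((uintptr_t) (key + e->len));
    }
    return NULL;
}

/* Builds (once) a bucket holding two pairs followed by the terminator. */
static unsigned char *demo_bucket(void)
{
    static unsigned char *bucket;
    size_t                off;

    if (bucket == NULL) {
        bucket = calloc(1, 128);
        if (bucket == NULL) {
            return NULL;
        }
        off = demo_put(bucket, 0, "www.sina.com.cn", "58.63.236.35");
        off = demo_put(bucket, off, "www.qq.com", "60.28.14.190");
        ((demo_elt_t *) (bucket + off))->value = NULL;  /* end-of-bucket marker */
    }
    return bucket;
}
```

Searching demo_bucket() for "www.qq.com" with length 10 returns the string "60.28.14.190", and searching for "www.csdn.net" returns NULL. Note that ngx_hash_init() stores keys lower-cased (via ngx_strlow) while ngx_hash_find() compares bytes exactly, so the name passed to a lookup is expected to be lower case already.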

 

3. An example

 

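This section gives a simple example that creates a memory pool, allocates the hash structure, hash buckets, and hash elements from it, and adds key-value pairs <key,value> to the hash.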

 

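The example takes a set of pairs <url,ip>, each made of a URL and its IP address, as <key,value>. It hashes these pairs during initialization and then looks up the IP address for a given URL, printing a message if the URL is not found. It is meant to show readers how nginx's hash is used.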

3.1 Code

/**
 * ngx_hash_t test
 * in this example, it will first save URLs into the memory pool, and IPs saved in static memory.
 * then, give some examples to find IP according to a URL.
 */
 
#include <stdio.h>
#include "ngx_config.h"
#include "ngx_conf_file.h"
#include "nginx.h"
#include "ngx_core.h"
#include "ngx_string.h"
#include "ngx_palloc.h"
#include "ngx_array.h"
#include "ngx_hash.h"
 
#define Max_Num 7
#define Max_Size 1024
#define Bucket_Size 64  //256, 64
 
#define NGX_HASH_ELT_SIZE(name)               \
    (sizeof(void *) + ngx_align((name)->key.len + 2, sizeof(void *)))
 
/* for hash test */
static ngx_str_t urls[Max_Num] = {
    ngx_string("www.baidu.com"),  //220.181.111.147
    ngx_string("www.sina.com.cn"),  //58.63.236.35
    ngx_string("www.google.com"),  //74.125.71.105
    ngx_string("www.qq.com"),  //60.28.14.190
    ngx_string("www.163.com"),  //123.103.14.237
    ngx_string("www.sohu.com"),  //219.234.82.50
    ngx_string("www.abo321.org")  //117.40.196.26
};
 
static char* values[Max_Num] = {
    "220.181.111.147",
    "58.63.236.35",
    "74.125.71.105",
    "60.28.14.190",
    "123.103.14.237",
    "219.234.82.50",
    "117.40.196.26"
};
 
#define Max_Url_Len 15
#define Max_Ip_Len 15
 
#define Max_Num2 2
 
/* for finding test */
static ngx_str_t urls2[Max_Num2] = {
    ngx_string("www.china.com"),  //60.217.58.79
    ngx_string("www.csdn.net")  //117.79.157.242
};
 
ngx_hash_t* init_hash(ngx_pool_t *pool, ngx_array_t *array);
void dump_pool(ngx_pool_t* pool);
void dump_hash_array(ngx_array_t* a);
void dump_hash(ngx_hash_t *hash, ngx_array_t *array);
ngx_array_t* add_urls_to_array(ngx_pool_t *pool);
void find_test(ngx_hash_t *hash, ngx_str_t addr[], int num);
 
/* for passing compiling */
volatile ngx_cycle_t  *ngx_cycle;
void ngx_log_error_core(ngx_uint_t level, ngx_log_t *log, ngx_err_t err, const char *fmt, ...)
{
}
 
int main(/* int argc, char **argv */)
{
    ngx_pool_t *pool = NULL;
    ngx_array_t *array = NULL;
    ngx_hash_t *hash;
 
    printf("--------------------------------\n");
    printf("create a new pool:\n");
    printf("--------------------------------\n");
    pool = ngx_create_pool(1024, NULL);
 
    dump_pool(pool);
 
    printf("--------------------------------\n");
    printf("create and add urls to it:\n");
    printf("--------------------------------\n");
    array = add_urls_to_array(pool);  //in fact, here should validate array
    dump_hash_array(array);
 
    printf("--------------------------------\n");
    printf("the pool:\n");
    printf("--------------------------------\n");
    dump_pool(pool);
 
    hash = init_hash(pool, array);
    if (hash == NULL)
    {
        printf("Failed to initialize hash!\n");
        return -1;
    }
 
    printf("--------------------------------\n");
    printf("the hash:\n");
    printf("--------------------------------\n");
    dump_hash(hash, array);
    printf("\n");
 
    printf("--------------------------------\n");
    printf("the pool:\n");
    printf("--------------------------------\n");
    dump_pool(pool);
 
    //find test
    printf("--------------------------------\n");
    printf("find test:\n");
    printf("--------------------------------\n");
    find_test(hash, urls, Max_Num);
    printf("\n");
 
    find_test(hash, urls2, Max_Num2);
 
    //release
    ngx_array_destroy(array);
    ngx_destroy_pool(pool);
 
    return 0;
}
 
ngx_hash_t* init_hash(ngx_pool_t *pool, ngx_array_t *array)
{
    ngx_int_t result;
    ngx_hash_init_t hinit;
 
    ngx_cacheline_size = 32;  //here this variable for nginx must be defined
    hinit.hash = NULL;  //if hinit.hash is NULL, it will alloc memory for it in ngx_hash_init
    hinit.key = &ngx_hash_key_lc;  //hash function
    hinit.max_size = Max_Size;
    hinit.bucket_size = Bucket_Size;
    hinit.name = "my_hash_sample";
    hinit.pool = pool;  //the hash table exists in the memory pool
    hinit.temp_pool = NULL;
 
    result = ngx_hash_init(&hinit, (ngx_hash_key_t*)array->elts, array->nelts);
    if (result != NGX_OK)
        return NULL;
 
    return hinit.hash;
}
 
void dump_pool(ngx_pool_t* pool)
{
    while (pool)
    {
        printf("pool = 0x%x\n", pool);
        printf("  .d\n");
        printf("    .last = 0x%x\n", pool->d.last);
        printf("    .end = 0x%x\n", pool->d.end);
        printf("    .next = 0x%x\n", pool->d.next);
        printf("    .failed = %d\n", pool->d.failed);
        printf("  .max = %d\n", pool->max);
        printf("  .current = 0x%x\n", pool->current);
        printf("  .chain = 0x%x\n", pool->chain);
        printf("  .large = 0x%x\n", pool->large);
        printf("  .cleanup = 0x%x\n", pool->cleanup);
        printf("  .log = 0x%x\n", pool->log);
        printf("available pool memory = %d\n\n", pool->d.end - pool->d.last);
        pool = pool->d.next;
    }
}
 
void dump_hash_array(ngx_array_t* a)
{
    char prefix[] = "          ";
 
    if (a == NULL)
        return;
 
    printf("array = 0x%x\n", a);
    printf("  .elts = 0x%x\n", a->elts);
    printf("  .nelts = %d\n", a->nelts);
    printf("  .size = %d\n", a->size);
    printf("  .nalloc = %d\n", a->nalloc);
    printf("  .pool = 0x%x\n", a->pool);
 
    printf("  elements:\n");
    ngx_hash_key_t *ptr = (ngx_hash_key_t*)(a->elts);
    for (; ptr < (ngx_hash_key_t*)(a->elts + a->nalloc * a->size); ptr++)
    {
        printf("    0x%x: {key = (\"%s\"%.*s, %d), key_hash = %-10ld, value = \"%s\"%.*s}\n", 
            ptr, ptr->key.data, Max_Url_Len - ptr->key.len, prefix, ptr->key.len, 
            ptr->key_hash, ptr->value, Max_Ip_Len - strlen(ptr->value), prefix);
    }
    printf("\n");
}
 
/**
 * pass array pointer to read elts[i].key_hash, then for getting the position - key
 */
void dump_hash(ngx_hash_t *hash, ngx_array_t *array)
{
    int loop;
    char prefix[] = "          ";
    u_short test[Max_Num] = {0};
    ngx_uint_t key;
    ngx_hash_key_t* elts;
    int nelts;
 
    if (hash == NULL)
        return;
 
    printf("hash = 0x%x: **buckets = 0x%x, size = %d\n", hash, hash->buckets, hash->size);
 
    for (loop = 0; loop < hash->size; loop++)
    {
        ngx_hash_elt_t *elt = hash->buckets[loop];
        printf("  0x%x: buckets[%d] = 0x%x\n", &(hash->buckets[loop]), loop, elt);
    }
    printf("\n");
 
    elts = (ngx_hash_key_t*)array->elts;
    nelts = array->nelts;
    for (loop = 0; loop < nelts; loop++)
    {
        char url[Max_Url_Len + 1] = {0};
 
        key = elts[loop].key_hash % hash->size;
        ngx_hash_elt_t *elt = (ngx_hash_elt_t *) ((u_char *) hash->buckets[key] + test[key]);
 
        ngx_strlow(url, elt->name, elt->len);
        printf("  buckets %d: 0x%x: {value = \"%s\"%.*s, len = %d, name = \"%s\"%.*s}\n", 
            key, elt, (char*)elt->value, Max_Ip_Len - strlen((char*)elt->value), prefix, 
            elt->len, url, Max_Url_Len - elt->len, prefix); //replace elt->name with url
 
        test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&elts[loop]));
    }
}
 
ngx_array_t* add_urls_to_array(ngx_pool_t *pool)
{
    int loop;
    char prefix[] = "          ";
    ngx_array_t *a = ngx_array_create(pool, Max_Num, sizeof(ngx_hash_key_t));
 
    for (loop = 0; loop < Max_Num; loop++)
    {
        ngx_hash_key_t *hashkey = (ngx_hash_key_t*)ngx_array_push(a);
        hashkey->key = urls[loop];
        hashkey->key_hash = ngx_hash_key_lc(urls[loop].data, urls[loop].len);
        hashkey->value = (void*)values[loop];
        /** for debug
        printf("{key = (\"%s\"%.*s, %d), key_hash = %-10ld, value = \"%s\"%.*s}, added to array\n",
            hashkey->key.data, Max_Url_Len - hashkey->key.len, prefix, hashkey->key.len,
            hashkey->key_hash, hashkey->value, Max_Ip_Len - strlen(hashkey->value), prefix);
        */
    }
 
    return a;    
}
 
void find_test(ngx_hash_t *hash, ngx_str_t addr[], int num)
{
    ngx_uint_t key;
    int loop;
    char prefix[] = "          ";
 
    for (loop = 0; loop < num; loop++)
    {
        key = ngx_hash_key_lc(addr[loop].data, addr[loop].len);
        void *value = ngx_hash_find(hash, key, addr[loop].data, addr[loop].len);
        if (value)
        {
            printf("(url = \"%s\"%.*s, key = %-10ld) found, (ip = \"%s\")\n", 
                addr[loop].data, Max_Url_Len - addr[loop].len, prefix, key, (char*)value);
        }
        else
        {
            printf("(url = \"%s\"%.*s, key = %-10d) not found!\n", 
                addr[loop].data, Max_Url_Len - addr[loop].len, prefix, key);
        }
    }
}


3.2 How to compile

 

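See the article "nginx-1.0.4 Source Code Analysis: The Memory Pool Structure ngx_pool_t and Memory Management". The makefile written for this article is as follows.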

CXX = gcc
CXXFLAGS += -g -Wall -Wextra

NGX_ROOT = /usr/src/nginx-1.0.4

TARGETS = ngx_hash_t_test
TARGETS_C_FILE = $(TARGETS).c

CLEANUP = rm -f $(TARGETS) *.o

all: $(TARGETS)

clean:
    $(CLEANUP)

CORE_INCS = -I. \
    -I$(NGX_ROOT)/src/core \
    -I$(NGX_ROOT)/src/event \
    -I$(NGX_ROOT)/src/event/modules \
    -I$(NGX_ROOT)/src/os/unix \
    -I$(NGX_ROOT)/objs \

NGX_PALLOC = $(NGX_ROOT)/objs/src/core/ngx_palloc.o
NGX_STRING = $(NGX_ROOT)/objs/src/core/ngx_string.o
NGX_ALLOC = $(NGX_ROOT)/objs/src/os/unix/ngx_alloc.o
NGX_ARRAY = $(NGX_ROOT)/objs/src/core/ngx_array.o
NGX_HASH = $(NGX_ROOT)/objs/src/core/ngx_hash.o

$(TARGETS): $(TARGETS_C_FILE)
    $(CXX) $(CXXFLAGS) $(CORE_INCS) $(NGX_PALLOC) $(NGX_STRING) $(NGX_ALLOC) $(NGX_ARRAY) $(NGX_HASH) $^ -o $@
 

3.3 Results

3.3.1 bucket_size = 64 bytes

With bucket_size = 64 bytes, the output is as follows.

# ./ngx_hash_t_test
--------------------------------
create a new pool:
--------------------------------
pool = 0x8870020
  .d
    .last = 0x8870048
    .end = 0x8870420
    .next = 0x0
    .failed = 0
  .max = 984
  .current = 0x8870020
  .chain = 0x0
  .large = 0x0
  .cleanup = 0x0
  .log = 0x0
available pool memory = 984

--------------------------------
create and add urls to it:
--------------------------------
array = 0x8870048
  .elts = 0x887005c
  .nelts = 7
  .size = 16
  .nalloc = 7
  .pool = 0x8870020
  elements:
    0x887005c: {key = ("www.baidu.com"  , 13), key_hash = 270263191 , value = "220.181.111.147"}
    0x887006c: {key = ("www.sina.com.cn", 15), key_hash = 1528635686, value = "58.63.236.35"   }
    0x887007c: {key = ("www.google.com" , 14), key_hash = -702889725, value = "74.125.71.105"  }
    0x887008c: {key = ("www.qq.com"     , 10), key_hash = 203430122 , value = "60.28.14.190"   }
    0x887009c: {key = ("www.163.com"    , 11), key_hash = -640386838, value = "123.103.14.237" }
    0x88700ac: {key = ("www.sohu.com"   , 12), key_hash = 1313636595, value = "219.234.82.50"  }
    0x88700bc: {key = ("www.abo321.org" , 14), key_hash = 1884209457, value = "117.40.196.26"  }

--------------------------------
the pool:
--------------------------------
pool = 0x8870020
  .d
    .last = 0x88700cc
    .end = 0x8870420
    .next = 0x0
    .failed = 0
  .max = 984
  .current = 0x8870020
  .chain = 0x0
  .large = 0x0
  .cleanup = 0x0
  .log = 0x0
available pool memory = 852

--------------------------------
the hash:
--------------------------------
hash = 0x88700cc: **buckets = 0x88700d8, size = 3
  0x88700d8: buckets[0] = 0x8870100
  0x88700dc: buckets[1] = 0x8870140
  0x88700e0: buckets[2] = 0x8870180

  buckets 1: 0x8870140: {value = "220.181.111.147", len = 13, name = "www.baidu.com"  }
  buckets 2: 0x8870180: {value = "58.63.236.35"   , len = 15, name = "www.sina.com.cn"}
  buckets 1: 0x8870154: {value = "74.125.71.105"  , len = 14, name = "www.google.com" }
  buckets 2: 0x8870198: {value = "60.28.14.190"   , len = 10, name = "www.qq.com"     }
  buckets 0: 0x8870100: {value = "123.103.14.237" , len = 11, name = "www.163.com"    }
  buckets 0: 0x8870114: {value = "219.234.82.50"  , len = 12, name = "www.sohu.com"   }
  buckets 0: 0x8870128: {value = "117.40.196.26"  , len = 14, name = "www.abo321.org" }

--------------------------------
the pool:
--------------------------------
pool = 0x8870020
  .d
    .last = 0x88701c4
    .end = 0x8870420
    .next = 0x0
    .failed = 0
  .max = 984
  .current = 0x8870020
  .chain = 0x0
  .large = 0x0
  .cleanup = 0x0
  .log = 0x0
available pool memory = 604

--------------------------------
find test:
--------------------------------
(url = "www.baidu.com"  , key = 270263191 ) found, (ip = "220.181.111.147")
(url = "www.sina.com.cn", key = 1528635686) found, (ip = "58.63.236.35")
(url = "www.google.com" , key = -702889725) found, (ip = "74.125.71.105")
(url = "www.qq.com"     , key = 203430122 ) found, (ip = "60.28.14.190")
(url = "www.163.com"    , key = -640386838) found, (ip = "123.103.14.237")
(url = "www.sohu.com"   , key = 1313636595) found, (ip = "219.234.82.50")
(url = "www.abo321.org" , key = 1884209457) found, (ip = "117.40.196.26")

(url = "www.china.com"  , key = -1954599725) not found!
(url = "www.csdn.net"   , key = -1667448544) not found!
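
The above is the output for bucket_size = 64 bytes. It shows that the 7 given URLs are distributed over 3 buckets; see the output for details. The physical layout of the hash in this example is shown in the figure below.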


3.3.2 bucket_size = 256 bytes

With bucket_size = 256 bytes, the output is as follows.

# ./ngx_hash_t_test
--------------------------------
create a new pool:
--------------------------------
pool = 0x8b74020
  .d
    .last = 0x8b74048
    .end = 0x8b74420
    .next = 0x0
    .failed = 0
  .max = 984
  .current = 0x8b74020
  .chain = 0x0
  .large = 0x0
  .cleanup = 0x0
  .log = 0x0
available pool memory = 984

--------------------------------
create and add urls to it:
--------------------------------
array = 0x8b74048
  .elts = 0x8b7405c
  .nelts = 7
  .size = 16
  .nalloc = 7
  .pool = 0x8b74020
  elements:
    0x8b7405c: {key = ("www.baidu.com"  , 13), key_hash = 270263191 , value = "220.181.111.147"}
    0x8b7406c: {key = ("www.sina.com.cn", 15), key_hash = 1528635686, value = "58.63.236.35"   }
    0x8b7407c: {key = ("www.google.com" , 14), key_hash = -702889725, value = "74.125.71.105"  }
    0x8b7408c: {key = ("www.qq.com"     , 10), key_hash = 203430122 , value = "60.28.14.190"   }
    0x8b7409c: {key = ("www.163.com"    , 11), key_hash = -640386838, value = "123.103.14.237" }
    0x8b740ac: {key = ("www.sohu.com"   , 12), key_hash = 1313636595, value = "219.234.82.50"  }
    0x8b740bc: {key = ("www.abo321.org" , 14), key_hash = 1884209457, value = "117.40.196.26"  }

--------------------------------
the pool:
--------------------------------
pool = 0x8b74020
  .d
    .last = 0x8b740cc
    .end = 0x8b74420
    .next = 0x0
    .failed = 0
  .max = 984
  .current = 0x8b74020
  .chain = 0x0
  .large = 0x0
  .cleanup = 0x0
  .log = 0x0
available pool memory = 852

--------------------------------
the hash:
--------------------------------
hash = 0x8b740cc: **buckets = 0x8b740d8, size = 1
  0x8b740d8: buckets[0] = 0x8b740e0

  buckets 0: {value = "220.181.111.147", len = 13, name = "www.baidu.com"  }
  buckets 0: {value = "58.63.236.35"   , len = 15, name = "www.sina.com.cn"}
  buckets 0: {value = "74.125.71.105"  , len = 14, name = "www.google.com" }
  buckets 0: {value = "60.28.14.190"   , len = 10, name = "www.qq.com"     }
  buckets 0: {value = "123.103.14.237" , len = 11, name = "www.163.com"    }
  buckets 0: {value = "219.234.82.50"  , len = 12, name = "www.sohu.com"   }
  buckets 0: {value = "117.40.196.26"  , len = 14, name = "www.abo321.org" }

--------------------------------
the pool:
--------------------------------
pool = 0x8b74020
  .d
    .last = 0x8b7419c
    .end = 0x8b74420
    .next = 0x0
    .failed = 0
  .max = 984
  .current = 0x8b74020
  .chain = 0x0
  .large = 0x0
  .cleanup = 0x0
  .log = 0x0
available pool memory = 644

--------------------------------
find test:
--------------------------------
(url = "www.baidu.com"  , key = 270263191 ) found, (ip = "220.181.111.147")
(url = "www.sina.com.cn", key = 1528635686) found, (ip = "58.63.236.35")
(url = "www.google.com" , key = -702889725) found, (ip = "74.125.71.105")
(url = "www.qq.com"     , key = 203430122 ) found, (ip = "60.28.14.190")
(url = "www.163.com"    , key = -640386838) found, (ip = "123.103.14.237")
(url = "www.sohu.com"   , key = 1313636595) found, (ip = "219.234.82.50")
(url = "www.abo321.org" , key = 1884209457) found, (ip = "117.40.196.26")

(url = "www.china.com"  , key = -1954599725) not found!
(url = "www.csdn.net"   , key = -1667448544) not found!
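
The above is the output for bucket_size = 256 bytes. It shows that the 7 given URLs are all placed in a single bucket, that is, size = 1 in ngx_hash_init(). The hash elements for these 7 URLs total only 140 bytes, so one bucket, buckets[0], is enough.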
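The pool figures in both runs can be cross-checked with a little arithmetic. The helper below is a hypothetical sketch (the demo_ names are not part of nginx); it computes the space reserved for one non-empty bucket from the total size of its elements, the pointer size, and the cache line size, following the steps at mark 2 of ngx_hash_init().

```c
#include <stddef.h>

/* Round n up to a multiple of a; a must be a power of two. */
static size_t demo_align(size_t n, size_t a)
{
    return (n + (a - 1)) & ~(a - 1);
}

/*
 * Bytes reserved for one non-empty bucket: its elements plus one
 * pointer-sized NULL terminator, rounded up to the cache line size.
 */
static size_t demo_bucket_bytes(size_t elts_bytes, size_t ptr_size, size_t cacheline)
{
    return demo_align(ptr_size + elts_bytes, cacheline);
}
```

For bucket_size = 256, demo_bucket_bytes(140, 4, 32) is 160, so ngx_hash_init() requests 160 + 32 = 192 bytes for elts. The hash header and the single bucket pointer take another 12 + 4 = 16 bytes (the 12 is the distance between hash and buckets in the output), giving 208 bytes, which is exactly the drop in available pool memory from 852 to 644. For bucket_size = 64, the three buckets hold 60, 40 and 40 bytes of elements and each reserves 64 bytes, so the total is 192 + 32 + 12 + 3*4 = 248 bytes, matching the drop from 852 to 604.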

 

The table below lists some of the values computed inside ngx_hash_init(). The physical layout figure is omitted; see the figure above.

 

4. Summary

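This article analyzed the hash structure of nginx-1.0.4 in some detail, covering the hash structure, the hash element structure, and the hash initialization structure, together with the main hash operations, initialization and lookup. A simple example then showed how nginx's hash is used, with detailed output and a diagram of the hash's physical layout to illustrate its design and principles. The example also showed how to compile and test nginx code.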
