Reposted from: http://www.cnblogs.com/MerlinJ/p/4081986.html, kept here for reference.
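DPDK exposes memory management in two ways: rte_mempool, used mainly for sending and receiving NIC packets, and rte_malloc, which gives applications a general-purpose allocation interface. This article covers rte_mempool. An rte_mempool is created by rte_mempool_create(), which takes a suitably sized block of memory from rte_config.mem_config->free_memseg[] and records it in rte_config.mem_config->memzone[].
This article uses l2fwd as the example to show how an rte_mempool is created and used.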
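To make "taking a suitably sized block from free_memseg[]" concrete, here is a minimal sketch of a best-fit search, the rule this article describes for rte_memzone_reserve(). It is not DPDK code: struct free_seg, best_fit() and reserve_from() are hypothetical names, and the real memseg bookkeeping also tracks physical addresses, sockets and alignment.

```c
#include <stddef.h>

/* Hypothetical stand-in for one free memory segment. */
struct free_seg {
    size_t offset; /* start of the free region (an offset, for simplicity) */
    size_t len;    /* bytes still free in this segment */
};

/* Best fit: index of the smallest segment that can still hold `want`
 * bytes, or -1 if no segment is large enough. */
int best_fit(const struct free_seg *segs, int nsegs, size_t want)
{
    int best = -1;
    for (int i = 0; i < nsegs; i++) {
        if (segs[i].len < want)
            continue; /* too small */
        if (best < 0 || segs[i].len < segs[best].len)
            best = i; /* fits, and leaves less unused space */
    }
    return best;
}

/* Carve `want` bytes off the front of the best-fitting segment.
 * Stores the start of the reserved region in *out.
 * Returns 0 on success, -1 if nothing fits. */
int reserve_from(struct free_seg *segs, int nsegs, size_t want, size_t *out)
{
    int i = best_fit(segs, nsegs, want);
    if (i < 0)
        return -1;
    *out = segs[i].offset;
    segs[i].offset += want;
    segs[i].len -= want;
    return 0;
}
```

With segments of 4096, 1024 and 2048 bytes, a 1000-byte request is served from the 1024-byte segment, which keeps the larger segments intact for later, bigger reservations.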
I. Creating an rte_mempool
l2fwd_pktmbuf_pool =
    rte_mempool_create("mbuf_pool", NB_MBUF,
                       MBUF_SIZE, 32,
                       sizeof(struct rte_pktmbuf_pool_private),
                       rte_pktmbuf_pool_init, NULL,
                       rte_pktmbuf_init, NULL,
                       rte_socket_id(), 0);
"mbuf_pool": the name of the rte_mempool being created.
NB_MBUF: the number of rte_mbuf elements the rte_mempool holds.
MBUF_SIZE: the size of each rte_mbuf element.
#define RTE_PKTMBUF_HEADROOM 128
#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
#define NB_MBUF 8192
struct rte_pktmbuf_pool_private {
    uint16_t mbuf_data_room_size; /**< Size of data space in each mbuf. */
};
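To get a feel for what these macros add up to, the arithmetic can be sketched as below. sizeof(struct rte_mbuf) depends on the DPDK version, so it is passed in as a parameter rather than hard-coded; the PKT_* and NB_MBUF_COUNT names and both helper functions are hypothetical and only mirror the l2fwd values shown above.

```c
#include <stddef.h>

#define PKT_DATA_ROOM 2048 /* payload bytes, the 2048 in MBUF_SIZE */
#define PKT_HEADROOM  128  /* mirrors RTE_PKTMBUF_HEADROOM */
#define NB_MBUF_COUNT 8192 /* mirrors NB_MBUF */

/* Size of one pool element for a given sizeof(struct rte_mbuf). */
size_t mbuf_elt_size(size_t mbuf_hdr)
{
    return PKT_DATA_ROOM + mbuf_hdr + PKT_HEADROOM;
}

/* Bytes taken by all elements, ignoring per-object header/trailer
 * padding and the mempool header itself. */
size_t pool_payload_bytes(size_t mbuf_hdr)
{
    return (size_t)NB_MBUF_COUNT * mbuf_elt_size(mbuf_hdr);
}
```

Assuming, purely for illustration, a 64-byte struct rte_mbuf, each element is 2240 bytes and the 8192 elements occupy 18,350,080 bytes (about 17.5 MiB) before any padding.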
rte_mempool_create() builds the pool in two stages: it first creates an rte_ring, then creates the rte_mempool, and links the two together.
1. rte_ring_create() creates the lock-free rte_ring queue
r = rte_ring_create(rg_name, rte_align32pow2(n+1), socket_id, rg_flags);
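The n+1 in the call above is there because a ring of this kind keeps one slot unused to tell a full ring from an empty one, so holding n objects takes at least n+1 slots, rounded up to a power of two. A minimal sketch of that rounding follows; align32pow2() is a local re-implementation written for illustration, not the DPDK source, and ring_slots_for() is a hypothetical helper.

```c
#include <stdint.h>

/* Round x up to the next power of two (x is returned unchanged if it
 * already is one). Works by smearing the highest set bit of x-1 into
 * every lower bit and then adding 1. Not meaningful for x == 0. */
uint32_t align32pow2(uint32_t x)
{
    x--;
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return x + 1;
}

/* Slots needed to hold n objects when one slot always stays empty. */
uint32_t ring_slots_for(uint32_t n)
{
    return align32pow2(n + 1);
}
```

For l2fwd, NB_MBUF is 8192, so the ring is sized for 8193 entries and ends up with 16384 slots. A pool of 8191 objects would fit in 8192 slots.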
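The steps are as follows:
a. The number of ring slots must be a power of two, so the requested size is rounded up: count = rte_align32pow2(n + 1);
b. Compute the memory needed for a ring with count slots: ring_size = count * sizeof(void *) + sizeof(struct rte_ring);
struct rte_ring is defined as follows: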
struct rte_ring {
    TAILQ_ENTRY(rte_ring) next;      /**< Next in list. */

    char name[RTE_RING_NAMESIZE];    /**< Name of the ring. */
    int flags;                       /**< Flags supplied at creation. */

    /** Ring producer status. */
    struct prod {
        uint32_t watermark;      /**< Maximum items before EDQUOT. */
        uint32_t sp_enqueue;     /**< True, if single producer. */
        uint32_t size;           /**< Size of ring. */
        uint32_t mask;           /**< Mask (size-1) of ring. */
        volatile uint32_t head;  /**< Producer head. */
        volatile uint32_t tail;  /**< Producer tail. */
    } prod __rte_cache_aligned;

    /** Ring consumer status. */
    struct cons {
        uint32_t sc_dequeue;     /**< True, if single consumer. */
        uint32_t size;           /**< Size of the ring. */
        uint32_t mask;           /**< Mask (size-1) of ring. */
        volatile uint32_t head;  /**< Consumer head. */
        volatile uint32_t tail;  /**< Consumer tail. */
#ifdef RTE_RING_SPLIT_PROD_CONS
    } cons __rte_cache_aligned;
#else
    } cons;
#endif

#ifdef RTE_LIBRTE_RING_DEBUG
    struct rte_ring_debug_stats stats[RTE_MAX_LCORE];
#endif

    void * ring[0] __rte_cache_aligned; /**< Memory space of ring starts here.
                                         * not volatile so need to be careful
                                         * about compiler re-ordering */
};
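c. Call rte_memzone_reserve(), which searches rte_config.mem_config->free_memseg[] for a suitable free_memseg (the rule is best fit: the segment's remaining memory must be at least the requested size, with the smallest excess), allocates the requested amount from that free_memseg, and records the allocation in rte_config.mem_config->memzone[].
d. Initialize the newly allocated rte_ring.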
r->flags = flags;
r->prod.watermark = count;
r->prod.sp_enqueue = !!(flags & RING_F_SP_ENQ);
r->cons.sc_dequeue = !!(flags & RING_F_SC_DEQ);
r->prod.size = r->cons.size = count;
r->prod.mask = r->cons.mask = count-1;
r->prod.head = r->cons.head = 0;
r->prod.tail = r->cons.tail = 0;

TAILQ_INSERT_TAIL(ring_list, r, next); // linked into the rte_config.mem_config->tailq_head[RTE_TAILQ_RING] list
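The size and mask fields initialized above are what make the power-of-two requirement pay off: indexes can run freely and be wrapped with a single AND. The following is a deliberately simplified, single-threaded sketch of that idea. struct mini_ring and its functions are hypothetical; the real rte_ring keeps separate head/tail pairs for producers and consumers and uses atomic operations, none of which is modeled here.

```c
#include <stdint.h>
#include <stddef.h>

#define MINI_RING_SLOTS 8u /* must be a power of two */

struct mini_ring {
    uint32_t size;      /* number of slots */
    uint32_t mask;      /* size - 1, used to wrap indexes */
    uint32_t prod_tail; /* free-running write index */
    uint32_t cons_tail; /* free-running read index */
    void *slots[MINI_RING_SLOTS];
};

void mini_ring_init(struct mini_ring *r)
{
    r->size = MINI_RING_SLOTS;
    r->mask = MINI_RING_SLOTS - 1;
    r->prod_tail = r->cons_tail = 0;
}

/* Objects currently stored; unsigned wraparound keeps this correct. */
uint32_t mini_ring_count(const struct mini_ring *r)
{
    return r->prod_tail - r->cons_tail;
}

/* Returns 0 on success, -1 if full. One slot always stays unused,
 * so the usable capacity is size - 1. */
int mini_ring_enqueue(struct mini_ring *r, void *obj)
{
    uint32_t free_slots = r->mask + r->cons_tail - r->prod_tail;
    if (free_slots == 0)
        return -1;
    r->slots[r->prod_tail & r->mask] = obj;
    r->prod_tail++;
    return 0;
}

/* Returns 0 on success, -1 if empty. */
int mini_ring_dequeue(struct mini_ring *r, void **obj)
{
    if (r->prod_tail == r->cons_tail)
        return -1;
    *obj = r->slots[r->cons_tail & r->mask];
    r->cons_tail++;
    return 0;
}
```

With 8 slots the sketch accepts 7 objects and rejects the 8th, which is the same reason the mempool asks for n+1 slots to store n mbufs.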
2. Create and initialize the rte_mempool
a. Compute how much memory the rte_mempool needs. This covers sizeof(struct rte_mempool), private_data_size, and n * objsz.total_size.
mempool_size = MEMPOOL_HEADER_SIZE(mp, pg_num) + private_data_size;
if (vaddr == NULL)
    mempool_size += (size_t)objsz.total_size * n;
objsz.total_size = objsz.header_size + objsz.elt_size + objsz.trailer_size; where
objsz.header_size = sizeof(struct rte_mempool *);
objsz.elt_size = MBUF_SIZE;
objsz.trailer_size = ????
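b. Call rte_memzone_reserve(), which searches rte_config.mem_config->free_memseg[] for a suitable free_memseg, allocates mempool_size bytes from it, and records the new allocation in rte_config.mem_config->memzone[].
c. Initialize the newly created rte_mempool, and call rte_pktmbuf_pool_init() to initialize the rte_mempool's private data structure.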
/* init the mempool structure */
mp = mz->addr;
memset(mp, 0, sizeof(*mp));
snprintf(mp->name, sizeof(mp->name), "%s", name);
mp->phys_addr = mz->phys_addr;
mp->ring = r;
mp->size = n;
mp->flags = flags;
mp->elt_size = objsz.elt_size;
mp->header_size = objsz.header_size;
mp->trailer_size = objsz.trailer_size;
mp->cache_size = cache_size;
mp->cache_flushthresh = (uint32_t)
    (cache_size * CACHE_FLUSHTHRESH_MULTIPLIER);
mp->private_data_size = private_data_size;

/* calculate address of the first element for continuous mempool. */
obj = (char *)mp + MEMPOOL_HEADER_SIZE(mp, pg_num) +
    private_data_size;

/* populate address translation fields. */
mp->pg_num = pg_num;
mp->pg_shift = pg_shift;
mp->pg_mask = RTE_LEN2MASK(mp->pg_shift, typeof(mp->pg_mask));

/* mempool elements allocated together with mempool */
mp->elt_va_start = (uintptr_t)obj;
mp->elt_pa[0] = mp->phys_addr +
    (mp->elt_va_start - (uintptr_t)mp);

mp->elt_va_end = mp->elt_va_start;

RTE_EAL_TAILQ_INSERT_TAIL(RTE_TAILQ_MEMPOOL, rte_mempool_list, mp); // linked into the rte_config.mem_config->tailq_head[RTE_TAILQ_MEMPOOL] list
d. Call mempool_populate(), together with rte_pktmbuf_init(), to initialize every rte_mbuf element in the rte_mempool.
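The size calculation in step a and the "address of the first element" computation in step c follow one layout: pool header, then private data, then n objects, each made of a header, the element and a trailer. The sketch below captures only that arithmetic. All names are hypothetical, and the trailer size is taken as an input because this article leaves its derivation open.

```c
#include <stddef.h>

/* Mirrors the objsz fields used in the article. */
struct obj_size {
    size_t header_size;
    size_t elt_size;
    size_t trailer_size;
    size_t total_size; /* header + elt + trailer */
};

struct obj_size make_obj_size(size_t header, size_t elt, size_t trailer)
{
    struct obj_size sz;
    sz.header_size = header;
    sz.elt_size = elt;
    sz.trailer_size = trailer;
    sz.total_size = header + elt + trailer;
    return sz;
}

/* Total bytes for a pool whose objects are allocated together with it. */
size_t pool_bytes(size_t pool_header, size_t private_size,
                  const struct obj_size *sz, size_t n)
{
    return pool_header + private_size + sz->total_size * n;
}

/* Offset from the start of the pool to the usable part of element i,
 * i.e. just past that object's header. */
size_t elt_offset(size_t pool_header, size_t private_size,
                  const struct obj_size *sz, size_t i)
{
    return pool_header + private_size + i * sz->total_size + sz->header_size;
}
```

For example, with an assumed 8-byte object header, 2240-byte element and 56-byte trailer, each object takes 2304 bytes, and consecutive elements sit exactly 2304 bytes apart.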
3. Summary
The relationships among these data structures are shown in the figure below:
II. Using an rte_mempool
To be continued.
Corrections are welcome.