##Memory management: the MIGRATE_HIGHATOMIC migration type

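The kernel's memory-management code has gained a new migration type, MIGRATE_HIGHATOMIC. The name alone gives a rough idea of what it is for. 菜企鵝 found an article on LWN that covers MIGRATE_HIGHATOMIC.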

###The atomic high-order reserve

Within a zone, memory is grouped into “page blocks,” each of which can be marked with a “migration type” describing how the block should be allocated. In current kernels, one of those types is MIGRATE_RESERVE; it marks memory that simply will not be allocated at all unless the alternative is to fail an allocation request entirely. Since a physically contiguous range of blocks is so marked, the effect of this policy is to maintain a minimum number of high-order pages in the system. That, in turn, means that high-order requests (within reason) can be satisfied even when memory is generally fragmented.

Mel added the migration reserve during the 2.6.24 development cycle in 2007. The reserve improved the situation at the time but, in the end, it relied on an accidental property of the minimum watermark implemented in the page allocator many years before. The reserve does not actively keep high-order pages around; it simply steers requests away from a specific range of memory unless there is no alternative, in the hope that said range will remain contiguous. The reserve also predates the current memory-management code, which does a far better job of avoiding fragmentation and performing compaction when fragmentation does occur. Mel’s current patch set implements the conclusion that this reserve has done its time and removes it.

There is still value in reserving blocks of memory for high-order allocations, though; fragmentation is still a concern in current kernels. So another part of Mel’s patch set creates a new MIGRATE_HIGHATOMIC reserve that serves this purpose, but in a different way. Initially, this reserve contains no page blocks at all. If a high-order allocation cannot be satisfied without breaking up a previously whole page block, that block will be marked as being part of the high-order atomic reserve; thereafter, only higher-order allocations (and only high-priority ones at that) can be satisfied from that page block.

The kernel will limit the size of this reserve to about 1% of memory, so it cannot grow overly large. Page blocks remain in this reserve until memory pressure reaches a point where a single-page allocation is about to fail; at this point, the kernel will take a page block out of the reserve to be able to satisfy that request. The end result is a high-order page reserve that is more flexible, growing or shrinking in response to the demands of the current workload. Since the demand for high-order pages can vary significantly from one system (and one workload) to the next, it makes sense to tune the reserve to what is actually running; the result should be more flexible allocations and higher-reliability access to high-order pages.

Lest kernel developers think that they can be more relaxed about high-order allocations in the future, though, Mel notes that, as a result of the limited size of the reserve, “callers that speculatively abuse atomic allocations for long-lived high-order allocations to access the reserve will quickly fail.” He gives no indication of just who he thinks those callers are, though. There is one other potential pitfall with this reserve that bears keeping in mind: since the first page block doesn’t enter the reserve until a high-order allocation is made, the reserve may remain empty until the system has been running for a long time. By that point, memory may be so fragmented that the reserve can no longer be populated. Should such situations arise in real-world use, they could be addressed by proactively putting a minimum amount of memory into the reserve at boot time.

The high-order reserve also makes it possible to remove the separate watermarks for high-order pages. These watermarks try to ensure that each zone has a minimal number of pages available at each order; the allocator will fail allocations that drive the level below the relevant watermark for all but the highest-priority allocations. These watermarks are relatively expensive to implement and can cause normal-priority allocations to fail even though suitable pages are available. With the patch set applied, the code continues to enforce the single-page watermark, but, for higher-order allocations, it merely checks that a suitable page is available, counting on the high-order reserve to ensure that pages will be kept available for high-priority allocations.

Excerpted from the section "The atomic high-order reserve" of the LWN article at https://lwn.net/Articles/658081/
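The last paragraph of the excerpt describes how the watermark check changes: the single-page watermark is still enforced, while a high-order request only needs one suitable free block to exist. A small user-space model may make that concrete. This is only a sketch under simplifying assumptions, not kernel code: `toy_zone`, `toy_watermark_ok` and `TOY_MAX_ORDER` are hypothetical names, and details such as lowmem reserves, CMA and per-migratetype free lists are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ORDER 11	/* hypothetical: number of buddy orders modelled */

/* Hypothetical, simplified stand-in for a zone's free-page accounting. */
struct toy_zone {
	unsigned long free_pages;		/* total free base pages */
	unsigned long min_watermark;		/* single-page (order-0) watermark */
	unsigned long nr_free[TOY_MAX_ORDER];	/* free blocks usable by the caller, per order */
};

/*
 * Returns true if an allocation of 2^order pages may proceed:
 *  - the single-page watermark must still hold after the allocation;
 *  - for order > 0, at least one free block of that order or larger
 *    must exist (no separate per-order watermark is consulted).
 */
static bool toy_watermark_ok(const struct toy_zone *z, unsigned int order)
{
	unsigned long needed;

	assert(z != NULL);
	if (order >= TOY_MAX_ORDER)
		return false;

	needed = 1UL << order;
	if (z->free_pages < needed || z->free_pages - needed < z->min_watermark)
		return false;

	if (order == 0)
		return true;

	for (unsigned int o = order; o < TOY_MAX_ORDER; o++) {
		if (z->nr_free[o] > 0)
			return true;
	}
	return false;
}
```

In this model a zone with plenty of free pages but no free block of order 3 or larger rejects an order-3 request, while the same zone accepts it as soon as a single order-4 block becomes free.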

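菜企鵝's understanding:
After a system has been running for a while, memory becomes heavily fragmented, and allocations of high-order pages start to fail. MIGRATE_HIGHATOMIC was created to avoid, or at least mitigate, that situation. The MIGRATE_HIGHATOMIC reserve starts out empty. When a high-order, high-priority (atomic) allocation succeeds, the page block it came from may be marked MIGRATE_HIGHATOMIC. From then on, only high-order allocations that also carry high allocation priority are served from such a block. When a single-page allocation is about to fail, a block can be taken back out of the reserve. The benefit is flexibility: the reserve grows and shrinks with the needs of the current workload.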

Analysis of the linux-4.4 memory-management code:

Releasing MIGRATE_HIGHATOMIC page blocks

/*
 * this is the 'heart' of the zoned buddy allocator
 */
__alloc_pages_nodemask(){
	...
	/* first allocation attempt */
	page = get_page_from_freelist();
	if(unlikely(!page)){/* the first attempt failed */
		...
		page = __alloc_pages_slowpath();/* which calls __alloc_pages_direct_reclaim() */
	}
	...
}

/* the really slow allocator path where we enter direct reclaim */
__alloc_pages_direct_reclaim(){
	...
	/* still no page after reclaim: release MIGRATE_HIGHATOMIC page blocks, then retry */
	unreserve_highatomic_pageblock();
}
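The retry-after-unreserve shape of this path can be sketched in user space. The model below is an illustration only: `toy_zone`, `toy_alloc_page`, `toy_unreserve_highatomic`, `toy_alloc_after_reclaim` and the 512-page block size are hypothetical, and it assumes that every page in a reserved block is free, which the real kernel does not guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGEBLOCK_PAGES 512UL	/* hypothetical pages per page block */

/* Hypothetical, simplified zone: pages anyone may use vs. reserved pages. */
struct toy_zone {
	unsigned long free_normal;	/* free pages usable by any request */
	unsigned long reserved_pages;	/* free pages held in the highatomic reserve */
};

/* Take one page from the generally available pages, if there is one. */
static bool toy_alloc_page(struct toy_zone *z)
{
	if (z->free_normal == 0)
		return false;
	z->free_normal--;
	return true;
}

/* Move one reserved page block back to the general pool, if one exists. */
static bool toy_unreserve_highatomic(struct toy_zone *z)
{
	if (z->reserved_pages < TOY_PAGEBLOCK_PAGES)
		return false;
	z->reserved_pages -= TOY_PAGEBLOCK_PAGES;
	z->free_normal += TOY_PAGEBLOCK_PAGES;
	return true;
}

/*
 * Models the step after reclaim: try to allocate; on failure release
 * reserved memory once and retry; give up if the retry also fails.
 */
static bool toy_alloc_after_reclaim(struct toy_zone *z)
{
	bool drained = false;

	assert(z != NULL);
retry:
	if (toy_alloc_page(z))
		return true;
	if (!drained) {
		toy_unreserve_highatomic(z);
		drained = true;
		goto retry;
	}
	return false;
}
```

With no generally available pages and one reserved block, the first attempt fails, the block is handed back, and the retry succeeds. With nothing reserved, the function fails after a single retry.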

Marking page blocks as MIGRATE_HIGHATOMIC

get_page_from_freelist(){
	....
	page = buffered_rmqueue();
	if(page){
		...
		/* a high-order allocation made with ALLOC_HARDER (the system urgently
		 * needs memory): mark the page block holding this page as
		 * MIGRATE_HIGHATOMIC, i.e. keep the block in the reserve */
		if(unlikely(order && (alloc_flags & ALLOC_HARDER)))
			reserve_highatomic_pageblock(page, zone, order);

		return page;
	}
	...
}
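The LWN excerpt says the reserve is limited to about 1% of memory. A minimal user-space sketch of what a size-capped reserve step could look like follows. All names (`toy_zone`, `toy_pageblock`, `toy_reserve_highatomic`, `TOY_PAGEBLOCK_PAGES`) are hypothetical, and the exact cap used here (1% of the zone's pages plus one page block) is an assumption of the sketch that should be checked against the kernel tree in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGEBLOCK_PAGES 512UL	/* hypothetical pages per page block */

enum toy_migratetype { TOY_MOVABLE, TOY_HIGHATOMIC };

/* Hypothetical, simplified zone accounting. */
struct toy_zone {
	unsigned long managed_pages;		/* pages managed by the zone */
	unsigned long nr_reserved_highatomic;	/* pages currently in the reserve */
};

struct toy_pageblock {
	enum toy_migratetype mt;		/* current migration type of the block */
};

/*
 * Try to move a page block into the high-order atomic reserve.
 * Returns true if the block was newly reserved, false if the reserve
 * is already at its cap or the block was reserved before.
 */
static bool toy_reserve_highatomic(struct toy_zone *z, struct toy_pageblock *pb)
{
	unsigned long max_reserved;

	assert(z != NULL && pb != NULL);

	/* Assumed cap: roughly 1% of the zone, but at least one page block. */
	max_reserved = z->managed_pages / 100 + TOY_PAGEBLOCK_PAGES;
	if (z->nr_reserved_highatomic >= max_reserved)
		return false;

	if (pb->mt == TOY_HIGHATOMIC)
		return false;

	pb->mt = TOY_HIGHATOMIC;
	z->nr_reserved_highatomic += TOY_PAGEBLOCK_PAGES;
	return true;
}
```

For a zone of 51200 pages the cap in this model is 1024 pages, so two 512-page blocks can enter the reserve and a third is refused.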

buffered_rmqueue(){
	....
	if(order == 0){
		...
	}
	else{
		...
		/* only a high-order request that carries ALLOC_HARDER is served from the MIGRATE_HIGHATOMIC free lists */
		if(alloc_flags & ALLOC_HARDER){
			page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);
			...
		}
		...
	}
	...
}
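The order of preference in this branch can be modelled in a few lines of user-space C. This is a sketch, not the kernel's implementation: `toy_zone`, `toy_rmqueue`, `toy_source` and the flag value `TOY_ALLOC_HARDER` are hypothetical, and the per-order free lists are collapsed into two simple counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_ALLOC_HARDER 0x1u	/* hypothetical flag value */

enum toy_source { TOY_FROM_NONE, TOY_FROM_HIGHATOMIC, TOY_FROM_REQUESTED };

/* Hypothetical zone: free blocks in the reserve vs. in the requested type. */
struct toy_zone {
	unsigned long free_highatomic;	/* suitable free blocks in the reserve */
	unsigned long free_requested;	/* suitable free blocks of the requested type */
};

/*
 * Pick where a request is served from:
 *  - a high-order request with TOY_ALLOC_HARDER tries the reserve first;
 *  - every request may use the requested migration type;
 *  - requests without the flag never touch the reserve.
 */
static enum toy_source toy_rmqueue(struct toy_zone *z, unsigned int order,
				   unsigned int alloc_flags)
{
	assert(z != NULL);

	if (order > 0 && (alloc_flags & TOY_ALLOC_HARDER) &&
	    z->free_highatomic > 0) {
		z->free_highatomic--;
		return TOY_FROM_HIGHATOMIC;
	}
	if (z->free_requested > 0) {
		z->free_requested--;
		return TOY_FROM_REQUESTED;
	}
	return TOY_FROM_NONE;
}
```

In this model an ordinary high-order request fails once the requested type is exhausted, even while the reserve still holds a block, whereas the same request with `TOY_ALLOC_HARDER` set is served from the reserve.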

If you see any of this differently, corrections are very welcome. Contact email: [email protected]
