Linux 2.6.25 boot analysis (bootstrap + kernel initialization)


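I: Bootloader stage
On embedded systems, basic environment initialization is normally done by the bootloader: once it has set up the basic hardware environment, it loads the kernel image into a region of memory. On x86, the initialization after power-on is performed by the BIOS, which then jumps to the boot program of the active partition.
How the kernel image is loaded matters here, and to see why we have to start from how the image is composed:
A Linux system image consists of a boot layer plus the kernel code image. Look at what make bzImage does: the build tool generated from linux-2.6.25/arch/x86/boot/tools/build.c joins the file generated from linux-2.6.25/arch/x86/boot/head.S with the kernel image, compressed or not.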

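Because of this layout, the boot process has to jump from the head.S part to the kernel code part, so the kernel code must sit at a fixed address. A small kernel (zImage) ends up running from 0x1000, while a big kernel (bzImage) is loaded at 0x100000.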

[Figure: the loading flow described above]

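In addition, the file generated from head.S deserves a closer look. It contains a built-in boot program and a piece of initialization code. Now that bootloaders are used everywhere, Linux's built-in boot program serves no purpose, and the kernel developers no longer want to maintain it. So if the built-in code is used to boot, it merely prints an error message on the screen, as the code analysis below shows. The boot program occupies the first 512 bytes of the file generated from head.S, and the bootloader jumps to offset 512 from the load address to start execution. The linker script for head.S is as follows: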
linux-2.6.25/arch/x86/boot/setup.ld
/*
 * setup.ld
 *
 * Linker script for the i386 setup code
 */
OUTPUT_FORMAT("elf32-i386", "elf32-i386", "elf32-i386")
OUTPUT_ARCH(i386)
ENTRY(_start)

SECTIONS
{
        . = 0;
        .bstext         : { *(.bstext) }
        .bsdata         : { *(.bsdata) }

        . = 497;
        .header         : { *(.header) }
        .inittext       : { *(.inittext) }
        .initdata       : { *(.initdata) }
        .text           : { *(.text*) }

        . = ALIGN(16);
        .rodata         : { *(.rodata*) }

        .videocards     : {
                video_cards = .;
                *(.videocards)
                video_cards_end = .;
        }

        . = ALIGN(16);
        .data           : { *(.data*) }

        .signature      : {
                setup_sig = .;
                LONG(0x5a5aaa55)
        }

        . = ALIGN(16);
        .bss            :
        {
                __bss_start = .;
                *(.bss)
                __bss_end = .;
        }
        . = ALIGN(16);
        _end = .;

        /DISCARD/ : { *(.note*) }

        . = ASSERT(_end <= 0x8000, "Setup too big!");
        . = ASSERT(hdr == 0x1f1, "The setup header has the wrong offset!");
}
 
Note the ". = 497;" assignment immediately before ".header": it places the header at offset 497 (0x1f1).
Part of head.S is shown below:
linux-2.6.25/arch/x86/boot/head.S
……
     .section ".header", "a"
     .globl   hdr
# The code below initializes the members of hdr. Execution leaves _start through a hand-encoded jump opcode.
hdr:
setup_sects:  .byte SETUPSECTS
root_flags:   .word ROOT_RDONLY
syssize: .long SYSSIZE
ram_size: .word RAMDISK
vid_mode: .word SVGA_MODE
root_dev: .word ROOT_DEV
boot_flag:    .word 0xAA55
 
     # offset 512, entry point
     # This is offset 512 bytes: the entry point the bootloader uses after loading the kernel
     .globl   _start
_start:

         # Explicitly enter this as bytes, or the assembler

         # tries to generate a 3-byte jump here, which causes

         # everything else to push off to the wrong offset.

 
         # These bytes are the opcode of a jmp.
         .byte    0xeb     # short (2-byte) jump
         .byte    start_of_setup-1f
……
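As noted above, .header sits at offset 497. The hdr fields shown here occupy 15 bytes, so offset 512 lands exactly on _start, which is the entry point the bootloader jumps to. Incidentally, hdr holds the boot parameters and corresponds to a struct setup_header.
Here _start reaches start_of_setup through a jmp written out as raw opcode bytes, so that the hdr members that follow stay at their fixed offsets.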
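The offset arithmetic can be checked with a small C sketch. The struct below mirrors only the seven fields shown in the listing (the name hdr_prefix is hypothetical; it is not the kernel's full struct setup_header), and the packed attribute assumes a GCC-compatible compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the hdr fields shown in the listing, packed
 * the way the assembler emits them (no padding between fields). */
struct hdr_prefix {
    uint8_t  setup_sects;
    uint16_t root_flags;
    uint32_t syssize;
    uint16_t ram_size;
    uint16_t vid_mode;
    uint16_t root_dev;
    uint16_t boot_flag;
} __attribute__((packed));

enum { HDR_OFFSET = 497 };  /* ". = 497" in setup.ld, i.e. 0x1f1 */

/* Offset of the first byte after boot_flag, which is where _start lands. */
size_t start_offset(void)
{
    return HDR_OFFSET + sizeof(struct hdr_prefix);
}

/* Offset of boot_flag itself within the image. */
size_t boot_flag_offset(void)
{
    return HDR_OFFSET + offsetof(struct hdr_prefix, boot_flag);
}
```

Under these assumptions start_offset() is 512, and boot_flag_offset() is 0x1fe, the position where a boot sector signature is expected.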
Moving on to start_of_setup:
# Entry point after the jump
start_of_setup:
 

# If SAFE_RESET_DISK_CONTROLLER is defined, reset the disk controller

#ifdef SAFE_RESET_DISK_CONTROLLER

# Reset the disk controller.

     movw $0x0000, %ax       # Reset disk controller

     movb $0x80, %dl         # All disks

     int  $0x13
#endif
# Set the es register to the contents of ds

# Force %es = %ds

     movw %ds, %ax

     movw %ax, %es

     cld
# A C function is about to be called, so set up the stack first

# Apparently some ancient versions of LILO invoked the kernel with %ss != %ds,

# which happened to work by accident for the old code.  Recalculate the stack

# pointer if %ss is invalid.  Otherwise leave it alone, LOADLIN sets up the

# stack behind its own code, so we can't blindly put it directly past the heap.

 

     movw %ss, %dx

     cmpw %ax, %dx # %ds == %ss?

     movw %sp, %dx

     je   2f       # -> assume %sp is reasonably set

 
     # Invalid %ss, make up a new stack

     movw $_end, %dx

     testb    $CAN_USE_HEAP, loadflags

     jz   1f

     movw heap_end_ptr, %dx

1:   addw $STACK_SIZE, %dx

     jnc  2f

     xorw %dx, %dx # Prevent wraparound

 

2:   # Now %dx should point to the end of our stack space

     andw $~3, %dx # dword align (might as well...)

     jnz  3f

     movw $0xfffc, %dx  # Make sure we're not zero

3:   movw %ax, %ss

     movzwl   %dx, %esp # Clear upper half of %esp

     sti           # Now we should have a working stack

 

# We will have entered with %cs = %ds+0x20, normalize %cs so

# it is on par with the other segments.

     pushw    %ds
     pushw    $6f
     lretw
6:
 
# Check whether setup_sig equals 0x5a5aaa55; the linker script stores 0x5a5aaa55 at setup_sig

# Check signature at end of setup

     cmpl $0x5a5aaa55, setup_sig
     jne  setup_bad
 
# Clear the BSS

# Zero the bss

     movw $__bss_start, %di

     movw $_end+3, %cx

     xorl %eax, %eax

     subw %di, %cx

     shrw $2, %cx

     rep; stosl
 
# Jump to main

# Jump to C code (should not return)

     calll    main
 

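Once the stack is set up, the code calls main and enters a function written in C, which initializes part of the hardware environment. Note that up to this point the CPU is still running in real mode.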
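The stack selection in start_of_setup is easier to follow when written out in C. The sketch below mirrors labels 1:, 2: and 3: of the assembly with 16-bit arithmetic. The function name and parameters are hypothetical, and the STACK_SIZE value is an assumption for illustration (the real one is defined in the boot code's headers).

```c
#include <stdint.h>

#define STACK_SIZE 512  /* assumed value, for illustration only */

/* Returns the value that ends up in %sp.
 * ss_equals_ds:  result of the "cmpw %ax, %dx" test
 * sp:            the stack pointer handed over by the bootloader
 * end:           the _end symbol of the setup code
 * can_use_heap:  the CAN_USE_HEAP bit of loadflags
 * heap_end_ptr:  the heap_end_ptr header field */
uint16_t pick_stack_top(int ss_equals_ds, uint16_t sp, uint16_t end,
                        int can_use_heap, uint16_t heap_end_ptr)
{
    uint16_t dx = sp;                 /* je 2f: trust the existing %sp */

    if (!ss_equals_ds) {
        uint32_t sum;

        dx = can_use_heap ? heap_end_ptr : end;
        sum = (uint32_t)dx + STACK_SIZE;          /* 1: addw $STACK_SIZE, %dx */
        dx = (sum > 0xffff) ? 0 : (uint16_t)sum;  /* carry: xorw %dx, %dx */
    }

    dx &= (uint16_t)~3u;              /* 2: dword align */
    if (dx == 0)
        dx = 0xfffc;                  /* never leave %sp at zero */
    return dx;
}
```

For example, with a valid %ss an incoming %sp of 0x1235 is only aligned down to 0x1234, while an invalid %ss with _end at 0x4000 and no heap gives 0x4200 under the assumed STACK_SIZE.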

The code of main is as follows:
linux-2.6.25/arch/x86/boot/main.c
void main(void)
{

         /* First, copy the boot header into the "zeropage" */

         copy_boot_params();
 
         /* End of heap check */
         init_heap();
 

         /* Make sure we have all the proper CPU support */

         // Verify that the CPU is supported
         if (validate_cpu()) {

                   puts("Unable to boot - please use a kernel appropriate "
                        "for your CPU.\n");
                   die();
         }
 

         /* Tell the BIOS what CPU mode we intend to run in. */

         // Set the CPU operating mode
         set_bios_mode();
 
         /* Detect memory layout */
         // Call int 0x15 to ask the BIOS for the current memory layout
        detect_memory();
 
         /* Set keyboard repeat rate (why?) */
         keyboard_set_repeat();
 
         /* Query MCA information */
         // Check for the IBM Micro Channel bus
         query_mca();
 
         /* Voyager */

#ifdef CONFIG_X86_VOYAGER

         query_voyager();
#endif
 

         /* Query Intel SpeedStep (IST) information */

         query_ist();
 
         /* Query APM information */

#if defined(CONFIG_APM) || defined(CONFIG_APM_MODULE)

         query_apm_bios();
#endif
 
         /* Query EDD information */

#if defined(CONFIG_EDD) || defined(CONFIG_EDD_MODULE)

         query_edd();
#endif
 
         /* Set the video mode */
         set_video();
 

         /* Do the last things and invoke protected mode */

         // This call switches to protected mode
         go_to_protected_mode();
}
copy_boot_params() copies the contents of hdr into the global variable boot_params, as shown below:

static void copy_boot_params(void)

{
         ……

         memcpy(&boot_params.hdr, &hdr, sizeof hdr);

}
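detect_memory() calls int 0x15 to do an initial probe of memory and stores the result in boot_params.e820_map.
main() finally calls go_to_protected_mode(). As the name suggests, this function switches the CPU to protected mode.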

void go_to_protected_mode(void)

{

         /* Hook before leaving real mode, also disables interrupts */

         // Disable interrupts
         realmode_switch_hook();
 

         /* Move the kernel/setup to their final resting places */

         // Move the kernel to its final location (a small kernel is copied from 0x10000 down to 0x1000)
         move_kernel_around();
 
         /* Enable the A20 gate */
         // Enable the A20 line of the keyboard controller
         if (enable_a20()) {

                   puts("A20 gate not responding, unable to boot...\n");

                   die();
         }
 
         /* Reset coprocessor (IGNNE#) */
         // Reset the coprocessor
         reset_coprocessor();
 
         /* Mask all interrupts in the PIC */
         // Mask all interrupts in the PIC
         mask_all_interrupts();
 

         /* Actual transition to protected mode... */

         // Set up a temporary IDT and GDT
        
         // Load an empty IDT
         setup_idt();
         setup_gdt();
         // Jump to the kernel's 32-bit entry point
         // TODO: for a small kernel (zImage), execution still continues from 0x1000 here and does not jump into a relocated copy
         protected_mode_jump(boot_params.hdr.code32_start,
                                (u32)&boot_params + (ds() << 4));
}
First, setup_idt() and setup_gdt() are called to set up a temporary IDT and GDT. The code is as follows:

static void setup_idt(void)

{

         static const struct gdt_ptr null_idt = {0, 0};

         asm volatile("lidtl %0" : : "m" (null_idt));

}
As you can see, this temporary IDT is empty.

static void setup_gdt(void)

{

         /* There are machines which are known to not boot with the GDT

            being 8-byte unaligned.  Intel recommends 16 byte alignment. */

         static const u64 boot_gdt[] __attribute__((aligned(16))) = {

                   /* CS: code, read/execute, 4 GB, base 0 */

                   [GDT_ENTRY_BOOT_CS] = GDT_ENTRY(0xc09b, 0, 0xfffff),

                   /* DS: data, read/write, 4 GB, base 0 */

                   [GDT_ENTRY_BOOT_DS] = GDT_ENTRY(0xc093, 0, 0xfffff),

                   /* TSS: 32-bit tss, 104 bytes, base 4096 */

                   /* We only have a TSS here to keep Intel VT happy;

                      we don't actually use it for anything. */

                   [GDT_ENTRY_BOOT_TSS] = GDT_ENTRY(0x0089, 4096, 103),

         };

         /* Xen HVM incorrectly stores a pointer to the gdt_ptr, instead

            of the gdt_ptr contents.  Thus, make it static so it will

            stay in memory, at least long enough that we switch to the

            proper kernel GDT. */

         static struct gdt_ptr gdt;
 
         gdt.len = sizeof(boot_gdt)-1;

         gdt.ptr = (u32)&boot_gdt + (ds() << 4);

 

         asm volatile("lgdtl %0" : : "m" (gdt));

}

Here the GDT is initialized with three entries: GDT_ENTRY_BOOT_CS, GDT_ENTRY_BOOT_DS and GDT_ENTRY_BOOT_TSS. GDT_ENTRY_BOOT_CS and GDT_ENTRY_BOOT_DS both have a base address of zero and a segment limit of 4 GB. GDT_ENTRY_BOOT_TSS is not actually used.
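To see why flags 0xc09b with limit 0xfffff means "4 GB, base 0", it helps to pack a descriptor by hand. The macro below is a sketch written from the x86 segment descriptor layout; the kernel's own GDT_ENTRY macro may be written differently, and the two decoding helpers are hypothetical names added for illustration.

```c
#include <stdint.h>

/* Pack flags/base/limit into the 8-byte x86 segment descriptor layout.
 * flags bits 0-7 are the access byte, bits 12-15 the G/D/L/AVL nibble. */
#define GDT_ENTRY(flags, base, limit)                 \
    ((((base)  & 0xff000000ULL) << (56 - 24)) |       \
     (((flags) & 0x0000f0ffULL) << 40) |              \
     (((limit) & 0x000f0000ULL) << (48 - 16)) |       \
     (((base)  & 0x00ffffffULL) << 16) |              \
     (((limit) & 0x0000ffffULL)))

/* Recover the 32-bit base address from a packed descriptor. */
uint32_t desc_base(uint64_t d)
{
    return (uint32_t)(((d >> 16) & 0xffffff) | ((d >> 56) << 24));
}

/* Number of bytes the segment spans, honoring the granularity bit. */
uint64_t desc_span_bytes(uint64_t d)
{
    uint64_t limit = (d & 0xffff) | ((d >> 32) & 0xf0000);
    int granular = (int)((d >> 55) & 1);  /* G bit: limit counted in 4 KiB units */

    return granular ? (limit + 1) << 12 : limit + 1;
}
```

With this packing, the code segment GDT_ENTRY(0xc09b, 0, 0xfffff) becomes 0x00cf9b000000ffff and spans 0x100000000 bytes (4 GB), while the TSS entry GDT_ENTRY(0x0089, 4096, 103) has base 4096 and spans 104 bytes, matching the comments in the listing.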

The actual switch from real mode to protected mode is done in protected_mode_jump. The code is as follows:

linux-2.6.25/arch/x86/boot/pmjump.S

protected_mode_jump:
         # edx holds the second argument, i.e. the boot_params pointer

         movl %edx, %esi                   # Pointer to boot_params table

 

         xorl   %ebx, %ebx

         movw         %cs, %bx

         shll    $4, %ebx

         addl   %ebx, 2f

 
# Set cx -> __BOOT_DS, di -> __BOOT_TSS

         movw         $__BOOT_DS, %cx

         movw         $__BOOT_TSS, %di

 
# Set the PE bit of CR0 to 1, which enables protected mode

         movl %cr0, %edx

         orb    $X86_CR0_PE, %dl    # Protected mode

         movl %edx, %cr0

         jmp   1f                         # Short jump to serialize on 386/486

1:
 
         # Transition to 32-bit mode

         .byte 0x66, 0xea           # ljmpl opcode

2:      .long  in_pm32                       # offset
         .word         __BOOT_CS               # segment
 

         .size  protected_mode_jump, .-protected_mode_jump

 
         .code32

         .type in_pm32, @function

in_pm32:

         # Set up data segments for flat 32-bit mode

         # Load the segment registers

         movl %ecx, %ds

         movl %ecx, %es

         movl %ecx, %fs

         movl %ecx, %gs

         movl %ecx, %ss

         # The 32-bit code sets up its own stack, but this way we do have

         # a valid stack if some debugging hack wants to use it.

         addl   %ebx, %esp

 

         # Set up TR to make Intel VT happy

         ltr      %di
 

         # Clear registers to allow for future extensions to the

         # 32-bit boot protocol
         # Clear the general-purpose registers

         xorl   %ecx, %ecx

         xorl   %edx, %edx

         xorl   %ebx, %ebx

         xorl   %ebp, %ebp

         xorl   %edi, %edi

 

         # Set up LDTR to make Intel VT happy

         lldt    %cx
 
         # Jump to the requested entry point

         jmpl  *%eax                           # Jump to the 32-bit entrypoint

 
First, protected_mode_jump takes its arguments in registers: the first argument is in eax and the second in edx.
The two arguments are:
protected_mode_jump(boot_params.hdr.code32_start,
                                (u32)&boot_params + (ds() << 4));
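The first is the address to jump to after switching to protected mode: 0x1000 for a small kernel (zImage) and 0x100000 for a big kernel (bzImage). The second is the pointer to the boot parameters.
After moving the boot parameters pointer into esi, the function loads the segment registers, clears the general-purpose registers, and then jumps to the given address.
For the kernel proper, that entry point is in arch/x86/kernel/head_32.S.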
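The expression (u32)&boot_params + (ds() << 4) converts a real-mode segment:offset pair into a linear address that is still valid after the switch to the flat 32-bit segments. A minimal sketch of that conversion follows; the function name and the example segment values are hypothetical.

```c
#include <stdint.h>

/* Real-mode address translation: linear = segment * 16 + offset. */
uint32_t real_to_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

For instance, real_to_linear(0x9000, 0x3000) is 0x93000. The same rule explains the "%cs = %ds+0x20" comment in start_of_setup: real_to_linear(0x9020, 0) and real_to_linear(0x9000, 0x200) name the same byte, 512 bytes into the setup image. It also shows why the A20 gate matters: real_to_linear(0xffff, 0xffff) is 0x10ffef, which lies above 1 MB.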
Before analyzing that code, let us first look at its linker script:
linux-2.6.25/arch/x86/kernel/vmlinux_32.lds.S

#define LOAD_OFFSET __PAGE_OFFSET

 

#include <asm-generic/vmlinux.lds.h>

#include <asm/thread_info.h>

#include <asm/page.h>

#include <asm/cache.h>

#include <asm/boot.h>

 

OUTPUT_FORMAT("elf32-i386", "elf32-i386", "elf32-i386")

OUTPUT_ARCH(i386)
ENTRY(phys_startup_32)

jiffies = jiffies_64;

 
PHDRS {
        text PT_LOAD FLAGS(5);  /* R_E */

        data PT_LOAD FLAGS(7);  /* RWE */

        note PT_NOTE FLAGS(0);  /* ___ */
}
SECTIONS
{
  . = LOAD_OFFSET + LOAD_PHYSICAL_ADDR;
  phys_startup_32 = startup_32 - LOAD_OFFSET;
 

  .text.head : AT(ADDR(.text.head) - LOAD_OFFSET) {

        _text = .;                      /* Text and read-only data */

        *(.text.head)
  } :text = 0x9090
……
……

All SECTIONS start at LOAD_OFFSET + LOAD_PHYSICAL_ADDR. LOAD_OFFSET is the familiar PAGE_OFFSET, and LOAD_PHYSICAL_ADDR is 0x100000 by default. This is where the kernel's linear-to-physical address relationship comes from.
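The relationship can be written down in a few lines of C. This is only a sketch: PAGE_OFFSET is assumed to be 0xC0000000 (the usual 3G/1G split), and the helper names are hypothetical stand-ins for the kernel's __pa()/__va() style conversions.

```c
#include <stdint.h>

#define PAGE_OFFSET        0xC0000000u  /* assumed 3G/1G split */
#define LOAD_PHYSICAL_ADDR 0x00100000u  /* 1 MB, as stated in the text */

/* Kernel linear address -> physical address (low memory only). */
uint32_t virt_to_phys_sketch(uint32_t vaddr)
{
    return vaddr - PAGE_OFFSET;
}

/* Physical address -> kernel linear address (low memory only). */
uint32_t phys_to_virt_sketch(uint32_t paddr)
{
    return paddr + PAGE_OFFSET;
}

/* The address at which the linker script starts placing sections. */
uint32_t link_base(void)
{
    return PAGE_OFFSET + LOAD_PHYSICAL_ADDR;
}
```

Under these assumptions the kernel is linked at 0xC0100000 but loaded at physical 0x100000, which is why the early assembly code wraps every symbol in pa() until paging is enabled.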

 
Next, the code of arch/x86/kernel/head_32.S:
The code is long, so it is analyzed in pieces below, with the conditionally compiled parts omitted.
 
# Total number of low pages: (1<<32) / (1<<12)
LOW_PAGES = 1<<(32-PAGE_SHIFT_asm)

/*
 * To preserve the DMA pool in PAGEALLOC kernels, we'll allocate
 * pagetables from above the 16MB DMA limit, so we'll have to set
 * up pagetables 16MB more (worst-case):
 */
#ifdef CONFIG_DEBUG_PAGEALLOC
LOW_PAGES = LOW_PAGES + 0x1000000
#endif

#if PTRS_PER_PMD > 1
# Space taken by the page tables and the PMDs
PAGE_TABLE_SIZE = (LOW_PAGES / PTRS_PER_PMD) + PTRS_PER_PGD
#else
# Number of pages taken by the page tables (each page table occupies one page)
PAGE_TABLE_SIZE = (LOW_PAGES / PTRS_PER_PGD)
#endif
# Size of the page bitmap, one bit per page
BOOTBITMAP_SIZE = LOW_PAGES / 8
ALLOCATOR_SLOP = 4

# Total space required
INIT_MAP_BEYOND_END = BOOTBITMAP_SIZE + (PAGE_TABLE_SIZE + ALLOCATOR_SLOP)*PAGE_SIZE_asm

This part computes the space taken by the page bitmap and by the page tables and PMDs. The PGD does not need to be counted here because its area is assigned at link time and is therefore static.
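Plugging in numbers makes the size concrete. The sketch below assumes the non-PAE case without CONFIG_DEBUG_PAGEALLOC, 4 KB pages and 1024 entries per table; the function names are hypothetical.

```c
#define PAGE_SHIFT     12
#define PAGE_SIZE      (1UL << PAGE_SHIFT)
#define PTRS_PER_PGD   1024UL   /* assumed: non-PAE, 1024 entries per table */
#define ALLOCATOR_SLOP 4UL

/* Number of 4 KB pages in a 4 GB address space. */
unsigned long long low_pages(void)
{
    return 1ULL << (32 - PAGE_SHIFT);
}

/* Pages needed for page tables that cover every low page. */
unsigned long long page_table_pages(void)
{
    return low_pages() / PTRS_PER_PGD;
}

/* Bytes needed for a bitmap with one bit per page. */
unsigned long long bootbitmap_bytes(void)
{
    return low_pages() / 8;
}

/* How far past the page tables the initial mapping must reach. */
unsigned long long init_map_beyond_end(void)
{
    return bootbitmap_bytes() +
           (page_table_pages() + ALLOCATOR_SLOP) * PAGE_SIZE;
}
```

With these assumptions there are 1048576 low pages, the bitmap needs 131072 bytes, the page tables need 1024 pages, and INIT_MAP_BEYOND_END comes to 4341760 bytes, a little over 4 MB.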
 
# Reload the GDT. This is needed because the whole vmlinux is linked at the __PAGE_OFFSET offset
         lgdt pa(boot_gdt_descr)
         movl $(__BOOT_DS),%eax
         movl %eax,%ds
         movl %eax,%es
         movl %eax,%fs
         movl %eax,%gs
2:
 
/*

 * Clear BSS first so that there are no surprises...

 */
 
 # Clear the BSS
         cld
         xorl %eax,%eax
         movl $pa(__bss_start),%edi
         movl $pa(__bss_stop),%ecx
         subl %edi,%ecx
         shrl $2,%ecx
         rep ; stosl
Here the GDT is reloaded and the BSS segment is cleared.
 
         # esi already holds the boot_params pointer
         movl $pa(boot_params),%edi
         movl $(PARAM_SIZE/4),%ecx
         cld
         rep
         # Copy from the address in esi to the address in edi, i.e. into the memory reserved for boot_params
         movsl

         movl pa(boot_params) + NEW_CL_POINTER,%esi

         andl %esi,%esi

         jz 1f                     # No comand line

         movl $pa(boot_command_line),%edi
         movl $(COMMAND_LINE_SIZE/4),%ecx
         rep
         movsl
 
This saves the boot parameters into boot_params and the command line into boot_command_line.
 
# PAE is not configured
# Byte offset, within the page directory, of the entry for __PAGE_OFFSET
page_pde_offset = (__PAGE_OFFSET >> 20);

# pg0: the temporary page tables; the first one maps the first 4 MB of memory
         movl $pa(pg0), %edi
         movl $pa(swapper_pg_dir), %edx
         movl $PTE_ATTR, %eax
10:
         # edi holds the address of pg0; PDE_ATTR(%edi) yields a PDE entry
         leal PDE_ATTR(%edi),%ecx                   /* Create PDE entry */

         # Store the PDE in the current PGD slot (slot 0 on the first pass)
         movl %ecx,(%edx)                          /* Store identity PDE entry */

         # Store the same PDE page_pde_offset bytes further on (PGD index 0x300 on the first pass)
         movl %ecx,page_pde_offset(%edx)                  /* Store kernel PDE entry */

         # Advance edx to the next PGD entry
         addl $4,%edx

         # Next, fill in the page table itself

         # Set the loop count
         movl $1024, %ecx
11:
         # Store eax at the address in edi, which points into pg0
         stosl
         addl $0x1000,%eax                # eax = eax + 0x1000 (0x1000 = 4K)
         loop 11b

         # After the loop, the entries of pg0 are 0x007, 0x1007, 0x2007 ... 0x3FF007
         # Both the first PGD entry from linear address 0 and the first PGD entry from __PAGE_OFFSET can now address the first 4 MB

         /*
          * End condition: we must map up to and including INIT_MAP_BEYOND_END
          * bytes beyond the end of our own page tables; the +0x007 is
          * the attribute bits
          */
         # Note that the mapping must extend all the way to INIT_MAP_BEYOND_END
         leal (INIT_MAP_BEYOND_END+PTE_ATTR)(%edi),%ebp

         # If INIT_MAP_BEYOND_END is not mapped yet, jump back to 10 and map another 4 MB
         cmpl %ebp,%eax
         jb 10b

         # Save the end address of the page tables in init_pg_tables_end
         movl %edi,pa(init_pg_tables_end)

         /* Do early initialization of the fixmap area */
         # Set up the mapping for the fixmap area
         movl $pa(swapper_pg_fixmap)+PDE_ATTR,%eax
         movl %eax,pa(swapper_pg_dir+0xffc)
 
This is where the initial mappings are set up.
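The inner "11:" loop and the page directory indexing can be replayed in C to check the values quoted in the comments. This is a simulation on an ordinary array, not kernel code; PAGE_OFFSET is assumed to be 0xC0000000 and the function names are hypothetical.

```c
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u  /* assumed 3G/1G split */
#define PTE_ATTR    0x007u       /* the 0x007 attribute bits from the listing */
#define PTRS_PER_PT 1024

/* Fill one page table the way the "11:" loop does.
 * Returns the value eax holds afterwards (the next entry to store). */
uint32_t fill_page_table(uint32_t *pt, uint32_t first_entry)
{
    uint32_t eax = first_entry;
    int i;

    for (i = 0; i < PTRS_PER_PT; i++) {
        pt[i] = eax;     /* stosl */
        eax += 0x1000;   /* next 4 KB frame */
    }
    return eax;
}

/* Index of the page directory slot covering a linear address (bits 31..22). */
uint32_t pgd_index_of(uint32_t vaddr)
{
    return vaddr >> 22;
}

/* Byte offset of that slot: each entry is 4 bytes wide. */
uint32_t pgd_byte_offset(uint32_t vaddr)
{
    return (vaddr >> 22) * 4;
}
```

Starting from PTE_ATTR, the first table ends with entry 0x3FF007 and the next table would begin at 0x400007. For PAGE_OFFSET the slot index is 0x300 and the byte offset is 0xC00, which equals __PAGE_OFFSET >> 20.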
 
# Enable paging
         movl $pa(swapper_pg_dir),%eax
         movl %eax,%cr3           /* set the page table pointer.. */
         movl %cr0,%eax

         orl  $X86_CR0_PG,%eax

         movl %eax,%cr0           /* ..and set paging (PG) bit */

         ljmp $__BOOT_CS,$1f        /* Clear prefetch and normalize %eip */

1:
         /* Set up the stack pointer */
         # Set up the kernel-mode stack
         lss stack_start,%esp
 
Here stack_start is loaded as the stack; it belongs to the system's first kernel process.
 
# Set up the IDT
         call setup_idt
……
……
         # Jump to start_kernel
         jmp start_kernel     
 
 
Control then jumps to start_kernel, which completes the first boot stage.





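II: Second boot stage
The second boot stage is the start_kernel() stage, which performs more detailed and comprehensive system initialization. Here we focus on the initialization of memory management, the most important and most intricate part, by picking out the memory-related functions called from start_kernel().
The first function to analyze is setup_arch(), which performs the platform-specific initialization.
The memory-management-related calls in setup_arch() are shown below: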

void __init setup_arch(char **cmdline_p)
{
         // Initialize the ioremap mapping area
         early_ioremap_init();
         ……
         // Sanitize the e820 map obtained from the BIOS, copy it into e820,
         // and print it
         print_memory_map(memory_setup());
         ……
         // max_pfn: the highest page frame number
         find_max_pfn();
         ……
         // Returns the highest page frame number the kernel can map directly
         max_low_pfn = setup_memory();
         ……
         paging_init();
         zone_sizes_init();
}
 
setup_arch() -> early_ioremap_init() is as follows:

void __init early_ioremap_init(void)
{
         pmd_t *pmd;

         if (early_ioremap_debug)
                   printk(KERN_INFO "early_ioremap_init()\n");

         pmd = early_ioremap_pmd(fix_to_virt(FIX_BTMAP_BEGIN));
         /* The page table for the FIX_BTMAP_BEGIN range is fixed to bm_pte here */
         memset(bm_pte, 0, sizeof(bm_pte));
         pmd_populate_kernel(&init_mm, pmd, bm_pte);

         /*
          * The boot-ioremap range spans multiple pmds, for which
          * we are not prepared:
          */
         if (pmd != early_ioremap_pmd(fix_to_virt(FIX_BTMAP_END))) {
                   WARN_ON(1);
                   printk(KERN_WARNING "pmd %p != %p\n",
                          pmd, early_ioremap_pmd(fix_to_virt(FIX_BTMAP_END)));
                   printk(KERN_WARNING "fix_to_virt(FIX_BTMAP_BEGIN): %08lx\n",
                            fix_to_virt(FIX_BTMAP_BEGIN));
                   printk(KERN_WARNING "fix_to_virt(FIX_BTMAP_END):   %08lx\n",
                            fix_to_virt(FIX_BTMAP_END));

                   printk(KERN_WARNING "FIX_BTMAP_END:       %d\n", FIX_BTMAP_END);
                   printk(KERN_WARNING "FIX_BTMAP_BEGIN:     %d\n",
                          FIX_BTMAP_BEGIN);
         }
}
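The code above takes the PMD that covers the address range starting at FIX_BTMAP_BEGIN and points it permanently at bm_pte.
Attentive readers will notice that the range covered by this PMD is the linear address range of the boot-time ioremap fixed mappings.
Normally this range fits within a single PMD; if it spans more than one, a warning is printed.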
 
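setup_arch() -> print_memory_map(memory_setup());
memory_setup(): as described in the first boot stage, the kernel calls int 0x15 to obtain memory information and saves it in boot_params.e820_map. The map supplied by the BIOS is not always correct; for example, some ranges may overlap. This function therefore sanitizes the BIOS data and saves the result in the global variable e820.
The e820 structures are defined as follows: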
struct e820entry {
         // Start address of the memory range
         __u64 addr;         /* start of memory segment */
         // Size of the memory range
         __u64 size;          /* size of memory segment */
         // Type of the memory range
         __u32 type;        /* type of memory segment */
} __attribute__((packed));

struct e820map {
         // Number of entries in the map
         __u32 nr_map;
         // Array of memory entries
         struct e820entry map[E820MAX];
};
print_memory_map() prints the contents of e820. This is the e820 map we see in the boot messages. The code is as follows:

void __init print_memory_map(char *who)
{
         int i;

         for (i = 0; i < e820.nr_map; i++) {
                   printk(" %s: %016Lx - %016Lx ", who,
                            e820.map[i].addr,
                            e820.map[i].addr + e820.map[i].size);
                   switch (e820.map[i].type) {
                   case E820_RAM:         printk("(usable)\n");
                                     break;
                   case E820_RESERVED:
                                     printk("(reserved)\n");
                                     break;
                   case E820_ACPI:
                                     printk("(ACPI data)\n");
                                     break;
                   case E820_NVS:
                                     printk("(ACPI NVS)\n");
                                     break;
                   default:       printk("type %u\n", e820.map[i].type);
                                     break;
                   }
         }
}
 
setup_arch() -> find_max_pfn(): find the highest physical page frame number

void __init find_max_pfn(void)

{
         int i;
 
         max_pfn = 0;
 
         for (i = 0; i < e820.nr_map; i++) {
                   unsigned long start, end;
                   /* RAM? */

                   if (e820.map[i].type != E820_RAM)

                            continue;

                   start = PFN_UP(e820.map[i].addr);

                   end = PFN_DOWN(e820.map[i].addr + e820.map[i].size);

                   if (start >= end)
                            continue;
                   if (end > max_pfn)
                            max_pfn = end;

                   memory_present(0, start, end);

         }
}
This function is simple: it searches the e820 map for the highest page frame number of usable memory.
PFN_UP() and PFN_DOWN() are defined as follows:
#define PFN_UP(x)     (((x) + PAGE_SIZE-1) >> PAGE_SHIFT)
#define PFN_DOWN(x)       ((x) >> PAGE_SHIFT)
The difference between them: PFN_DOWN() rounds an address down to a page frame number, while PFN_UP() rounds it up.
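A short example shows why the start of a range is rounded up and the end rounded down: only pages that lie completely inside the range count. The macros are the ones quoted in the text, with PAGE_SHIFT assumed to be 12; whole_pages() is a hypothetical helper.

```c
#define PAGE_SHIFT  12
#define PAGE_SIZE   (1UL << PAGE_SHIFT)
#define PFN_UP(x)   (((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)

/* Number of whole page frames inside [addr, addr + size). */
unsigned long whole_pages(unsigned long addr, unsigned long size)
{
    unsigned long start = PFN_UP(addr);
    unsigned long end = PFN_DOWN(addr + size);

    return end > start ? end - start : 0;
}
```

A range of 0x9fc00 bytes starting at address 0 contains 159 whole pages, because the partial page at the end is dropped. A range starting at 0x1001 with size 0xffe contains none, which corresponds to the "start >= end" check in find_max_pfn().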
 
setup_arch() -> setup_memory():

static unsigned long __init setup_memory(void)
{
         /*
          * partially used pages are not usable - thus
          * we are rounding upwards:
          */
         // init_pg_tables_end: end address of the initially mapped page tables
         // min_low_pfn: first page frame number after them
         min_low_pfn = PFN_UP(init_pg_tables_end);
         // max_low_pfn: highest page frame number the kernel can see directly
         max_low_pfn = find_max_low_pfn();
#ifdef CONFIG_HIGHMEM
         highstart_pfn = highend_pfn = max_pfn;
         if (max_pfn > max_low_pfn) {
                   highstart_pfn = max_low_pfn;
         }
         printk(KERN_NOTICE "%ldMB HIGHMEM available.\n",
                   pages_to_mb(highend_pfn - highstart_pfn));
         num_physpages = highend_pfn;
         high_memory = (void *) __va(highstart_pfn * PAGE_SIZE - 1) + 1;
#else
         num_physpages = max_low_pfn;
         high_memory = (void *) __va(max_low_pfn * PAGE_SIZE - 1) + 1;
#endif
#ifdef CONFIG_FLATMEM
         max_mapnr = num_physpages;
#endif
         printk(KERN_NOTICE "%ldMB LOWMEM available.\n",
                   pages_to_mb(max_low_pfn));

         setup_bootmem_allocator();

         return max_low_pfn;
}
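We met init_pg_tables_end when analyzing the first boot stage: it is the end address of the initially mapped page tables. Note the following global variables:
num_physpages: total number of physical pages
high_memory: starting linear address of high memory
min_low_pfn:     lowest usable page frame number
max_low_pfn:     highest page frame number of low memory, i.e. the highest page frame the kernel can use directly
find_max_low_pfn() finds the highest page frame number that the kernel maps directly. On 32-bit x86 the kernel directly maps 0~896M; the rest is handled as high memory.
See the following definitions: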
#define MAXMEM_PFN    PFN_DOWN(MAXMEM)
#define MAXMEM                       (-__PAGE_OFFSET-__VMALLOC_RESERVE)
MAXMEM_PFN is the highest physical page frame number the kernel can map directly.
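The 896M figure falls straight out of the MAXMEM definition once it is evaluated in 32-bit unsigned arithmetic. The sketch below assumes PAGE_OFFSET is 0xC0000000 and the vmalloc reserve is 128 MB; both are assumptions about the configuration, and the function names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT      12
#define PAGE_OFFSET     0xC0000000u   /* assumed 3G/1G split */
#define VMALLOC_RESERVE (128u << 20)  /* assumed 128 MB reserve */

/* (-PAGE_OFFSET - VMALLOC_RESERVE) with 32-bit wraparound. */
uint32_t maxmem(void)
{
    return (uint32_t)(0u - PAGE_OFFSET - VMALLOC_RESERVE);
}

/* The same limit expressed as a page frame number. */
uint32_t maxmem_pfn(void)
{
    return maxmem() >> PAGE_SHIFT;
}
```

Negating 0xC0000000 in 32 bits gives 0x40000000 (1 GB of kernel address space); subtracting the reserve leaves 0x38000000, which is 896 MB or page frame 229376.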
 
Next, setup_memory() initializes the boot-time allocator, which is used only while the kernel is initializing.
See the code of setup_bootmem_allocator():

void __init setup_bootmem_allocator(void)

{
         unsigned long bootmap_size;
         /*

          * Initialize the boot-time allocator (with low memory only):

          */
          // Initialize bootmem

         bootmap_size = init_bootmem(min_low_pfn, max_low_pfn);

 
         register_bootmem_low_pages(max_low_pfn);
 
         /*

          * Reserve the bootmem bitmap itself as well. We do this in two

          * steps (first step was init_bootmem()) because this catches

          * the (very unlikely) case of us accidentally initializing the

          * bootmem allocator with an invalid RAM area.
          */
 
         // Reserved memory
         // Reserve the area up to the end of the bootmem bitmap. Note that the bitmap starts at the page boundary after init_pg_tables_end

         reserve_bootmem(__pa_symbol(_text), (PFN_PHYS(min_low_pfn) +

                             bootmap_size + PAGE_SIZE-1) - __pa_symbol(_text),

                             BOOTMEM_DEFAULT);
 
         /*

          * reserve physical page 0 - it's a special BIOS page on many boxes,

          * enabling clean reboots, SMP operation, laptop functions.

          */
          // Reserve the lowest page

         reserve_bootmem(0, PAGE_SIZE, BOOTMEM_DEFAULT);

 

         /* reserve EBDA region, it's a 4K region */

         // Reserve the EBDA
         reserve_ebda_region();
 
         // Reserve a specific region on certain AMD systems

    /* could be an AMD 768MPX chipset. Reserve a page  before VGA to prevent

       PCI prefetch into it (errata #56). Usually the page is reserved anyways,

       unless you have no PS/2 mouse plugged in. */

         if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&

             boot_cpu_data.x86 == 6)

              reserve_bootmem(0xa0000 - 4096, 4096, BOOTMEM_DEFAULT);

 
#ifdef CONFIG_SMP
         /*

          * But first pinch a few for the stack/trampoline stuff

          * FIXME: Don't need the extra page at 4K, but need to fix

          * trampoline before removing it. (see the GDT stuff)

          */
          // On SMP, reserve PAGE_SIZE bytes at address 4K

         reserve_bootmem(PAGE_SIZE, PAGE_SIZE, BOOTMEM_DEFAULT);

#endif

#ifdef CONFIG_ACPI_SLEEP

         /*
          * Reserve low memory region for sleep support.
          */
         acpi_reserve_bootmem();
#endif

#ifdef CONFIG_X86_FIND_SMP_CONFIG

         /*

          * Find and reserve possible boot-time SMP configuration:

          */
         find_smp_config();
#endif

#ifdef CONFIG_BLK_DEV_INITRD

         reserve_initrd();
#endif
         numa_kva_reserve();
         reserve_crashkernel();
}
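init_bootmem() initializes bootmem.
register_bootmem_low_pages() registers the usable physical memory with bootmem.
reserve_bootmem() marks memory as reserved in bootmem, so that it is never handed out by an allocation.
 
The code of init_bootmem() is as follows: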

unsigned long __init init_bootmem(unsigned long start, unsigned long pages)

{
         max_low_pfn = pages;
         min_low_pfn = start;

         return init_bootmem_core(NODE_DATA(0), start, 0, pages);

}
NODE_DATA() looks up the node with the given id. If CONFIG_NUMA is not enabled there is only one node, contig_page_data:

#define NODE_DATA(nid)           (&contig_page_data)

 
init_bootmem_core():

static unsigned long __init init_bootmem_core(pg_data_t *pgdat,
         unsigned long mapstart, unsigned long start, unsigned long end)
{
         bootmem_data_t *bdata = pgdat->bdata;
         unsigned long mapsize;

         // Address of the allocation bitmap
         bdata->node_bootmem_map = phys_to_virt(PFN_PHYS(mapstart));
         // Start address of the memory being managed
         bdata->node_boot_start = PFN_PHYS(start);
         // Highest page frame number
         bdata->node_low_pfn = end;
         // Link bdata into a global list
         link_bootmem(bdata);

         /*
          * Initially all pages are reserved - setup_arch() has to
          * register free RAM areas explicitly.
          */
         // Size of the bitmap
         mapsize = get_mapsize(bdata);
         // Set every bit of the allocation bitmap to 1
         memset(bdata->node_bootmem_map, 0xff, mapsize);

         return mapsize;
}
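Putting these pieces together: during initialization, bootmem places its allocation bitmap at page frame min_low_pfn, and the bitmap bits for all pages from 0 to max_low_pfn are set to 1, meaning that no page is available yet.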
 
Next, register_bootmem_low_pages() is called to register with bootmem the memory it is allowed to hand out. The code is as follows:
// Clear the allocation-bitmap bits of usable memory

void __init register_bootmem_low_pages(unsigned long max_low_pfn)

{
         int i;
 
         for (i = 0; i < e820.nr_map; i++) {

                   unsigned long curr_pfn, last_pfn, size;

                   /*
                    * Reserve usable low memory
                    */

                   if (e820.map[i].type != E820_RAM)

                            continue;
                   /*

                    * We are rounding up the start address of usable memory:

                    */

                   curr_pfn = PFN_UP(e820.map[i].addr);

                   if (curr_pfn >= max_low_pfn)

                            continue;
                   /*

                    * ... and at the end of the usable range downwards:

                    */

                   last_pfn = PFN_DOWN(e820.map[i].addr + e820.map[i].size);

 

                   if (last_pfn > max_low_pfn)

                            last_pfn = max_low_pfn;
 
                   /*
                    * .. finally, did all the rounding and playing
                    * around just make the area go away?
                    */
                   if (last_pfn <= curr_pfn)
                            continue;
 
                   size = last_pfn - curr_pfn;
                   // Free the range curr_pfn -> last_pfn, i.e. clear the corresponding bits in the allocation bitmap

                   free_bootmem(PFN_PHYS(curr_pfn), PFN_PHYS(size));

         }
}
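This function clears to 0 the allocation-bitmap bits of all usable physical memory in the e820 map.
At this point all usable low memory is available to bootmem. However, some memory must be kept back, for example the memory occupied by the kernel image. If that memory were handed out, the system would certainly crash.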
 
reserve_bootmem() marks pages as reserved in bootmem by setting their bits in the allocation bitmap to 1.
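The three steps (everything reserved, free the usable RAM, reserve selected ranges again) can be modeled with a toy bitmap. This is a simplified sketch with hypothetical names and a made-up size of 64 pages; the real bootmem code works on byte ranges and per-node data.

```c
#include <string.h>

#define NPAGES 64                       /* toy size, for illustration only */

static unsigned char bootmap[NPAGES / 8];

/* Step 1: mark every page as reserved (all bits 1). */
void bootmap_init(void)
{
    memset(bootmap, 0xff, sizeof(bootmap));
}

/* Step 2: clear the bits of a range of usable pages. */
void bootmap_free(unsigned long start_pfn, unsigned long npages)
{
    unsigned long p;

    for (p = start_pfn; p < start_pfn + npages && p < NPAGES; p++)
        bootmap[p / 8] &= (unsigned char)~(1u << (p % 8));
}

/* Step 3: set the bits of a range that must not be handed out. */
void bootmap_reserve(unsigned long start_pfn, unsigned long npages)
{
    unsigned long p;

    for (p = start_pfn; p < start_pfn + npages && p < NPAGES; p++)
        bootmap[p / 8] |= (unsigned char)(1u << (p % 8));
}

/* A page is unavailable when its bit is 1. */
int bootmap_is_reserved(unsigned long pfn)
{
    return (bootmap[pfn / 8] >> (pfn % 8)) & 1;
}
```

After bootmap_init(), bootmap_free(1, 10) makes pages 1 to 10 available, and bootmap_reserve(4, 2) takes pages 4 and 5 back out, which is the pattern setup_bootmem_allocator() follows for the kernel image, page 0 and the other reserved regions.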
 
setup_arch() -> paging_init() initializes paging. The first boot stage already set up a small part of the page mappings; here the initialization is completed.

void __init paging_init(void)
{
#ifdef CONFIG_X86_PAE
         set_nx();
         if (nx_enabled)
                   printk(KERN_INFO "NX (Execute Disable) protection: active\n");
#endif
         // Initialize the page tables
         pagetable_init();

         load_cr3(swapper_pg_dir);

         __flush_tlb_all();

         // Initialize the temporary mapping area
         kmap_init();
}
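pagetable_init() clears the mappings for the parts of the 4 GB linear space where no physical memory exists or which the kernel cannot use directly, then builds mappings according to the PAGE_OFFSET relationship, and finally sets up the page tables for high memory.
load_cr3(swapper_pg_dir) then loads swapper_pg_dir into CR3 again so that the updated page mappings take effect.
kmap_init() mainly initializes the temporary kernel mapping area: it obtains the first page table entry of that area, kmap_pte.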
 
setup_arch() -> zone_sizes_init() initializes the zones. The code is as follows:

void __init zone_sizes_init(void)

{
         // Set the end page frame number of each zone

         unsigned long max_zone_pfns[MAX_NR_ZONES];

         memset(max_zone_pfns, 0, sizeof(max_zone_pfns));

         max_zone_pfns[ZONE_DMA] =

                   virt_to_phys((char *)MAX_DMA_ADDRESS) >> PAGE_SHIFT;

         max_zone_pfns[ZONE_NORMAL] = max_low_pfn;

#ifdef CONFIG_HIGHMEM

         max_zone_pfns[ZONE_HIGHMEM] = highend_pfn;

         add_active_range(0, 0, highend_pfn);
#else
         add_active_range(0, 0, max_low_pfn);
#endif
 
         free_area_init_nodes(max_zone_pfns);
}
On a single-CPU (single-node) platform, add_active_range() boils down to:

void __init add_active_range(unsigned int nid, unsigned long start_pfn,

                                                        unsigned long end_pfn)

{
         ……
         ……
         early_node_map[i].nid = nid;

         early_node_map[i].start_pfn = start_pfn;

         early_node_map[i].end_pfn = end_pfn;
         nr_nodemap_entries = i + 1;
        
}
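That is, an entry is added to early_node_map whose start page frame number is 0 and whose end page frame number is the highest physical page frame number.
Next, let us look at how free_area_init_nodes() runs: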

void __init free_area_init_nodes(unsigned long *max_zone_pfn)

{
         unsigned long nid;
         enum zone_type i;
 

         /* Sort early_node_map as initialisation assumes it is sorted */

         sort_node_map();
 

         // The code below fills arch_zone_lowest_possible_pfn[i] ~ arch_zone_highest_possible_pfn[i],

         // the first and last page frame numbers of zone i

         memset(arch_zone_lowest_possible_pfn, 0,

                                     sizeof(arch_zone_lowest_possible_pfn));

         memset(arch_zone_highest_possible_pfn, 0,

                                     sizeof(arch_zone_highest_possible_pfn));

         arch_zone_lowest_possible_pfn[0] = find_min_pfn_with_active_regions();

         arch_zone_highest_possible_pfn[0] = max_zone_pfn[0];

         for (i = 1; i < MAX_NR_ZONES; i++) {
                   if (i == ZONE_MOVABLE)
                            continue;

                   arch_zone_lowest_possible_pfn[i] =

                            arch_zone_highest_possible_pfn[i-1];

                   arch_zone_highest_possible_pfn[i] =

                            max(max_zone_pfn[i], arch_zone_lowest_possible_pfn[i]);

         }

         arch_zone_lowest_possible_pfn[ZONE_MOVABLE] = 0;

         arch_zone_highest_possible_pfn[ZONE_MOVABLE] = 0;

 

         /* Find the PFNs that ZONE_MOVABLE begins at in each node */

         memset(zone_movable_pfn, 0, sizeof(zone_movable_pfn));

         find_zone_movable_pfns_for_nodes(zone_movable_pfn);
 
         /* Print out the zone ranges */
         printk("Zone PFN ranges:\n");
         for (i = 0; i < MAX_NR_ZONES; i++) {
                   if (i == ZONE_MOVABLE)
                            continue;
                   printk("  %-8s %8lu -> %8lu\n",
                                     zone_names[i],
                                     arch_zone_lowest_possible_pfn[i],
                                     arch_zone_highest_possible_pfn[i]);
         }

         /* Print out the PFNs ZONE_MOVABLE begins at in each node */
         printk("Movable zone start PFN for each node\n");
         for (i = 0; i < MAX_NUMNODES; i++) {
                   if (zone_movable_pfn[i])
                            printk("  Node %d: %lu\n", i, zone_movable_pfn[i]);
         }

         /* Print out the early_node_map[] */
         printk("early_node_map[%d] active PFN ranges\n", nr_nodemap_entries);
         for (i = 0; i < nr_nodemap_entries; i++)
                   printk("  %3d: %8lu -> %8lu\n", early_node_map[i].nid,
                                                        early_node_map[i].start_pfn,
                                                        early_node_map[i].end_pfn);
 
         /* Initialise every node */
         setup_nr_node_ids();
         for_each_online_node(nid) {

                   pg_data_t *pgdat = NODE_DATA(nid);

                   free_area_init_node(nid, pgdat, NULL,

                                     find_min_pfn_for_node(nid), NULL);

 
                   /* Any memory on that node */

                   if (pgdat->node_present_pages)

                            node_set_state(nid, N_HIGH_MEMORY);

                   check_for_regular_memory(pgdat);
         }
}
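
The zone-boundary loop at the top of free_area_init_nodes() is easier to follow in isolation. The following is a minimal user-space sketch of it, not kernel code: the enum sketch_zone, struct zone_range and compute_zone_ranges() are names invented for this illustration, and the four-zone layout is an assumption (the real enum zone_type depends on the kernel configuration).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical zone indices for this sketch only. */
enum sketch_zone { SK_DMA, SK_NORMAL, SK_HIGHMEM, SK_MOVABLE, SK_NR_ZONES };

struct zone_range {
    unsigned long lowest;   /* first PFN of the zone */
    unsigned long highest;  /* end PFN of the zone (exclusive) */
};

/*
 * Mirrors the loop in free_area_init_nodes(): zone 0 starts at min_pfn,
 * each later zone starts where the previous one ended, and a zone whose
 * limit lies below its start collapses to an empty range.
 */
void compute_zone_ranges(const unsigned long *max_zone_pfn,
                         unsigned long min_pfn,
                         struct zone_range *out)
{
    int i;

    assert(max_zone_pfn != NULL && out != NULL);

    out[0].lowest = min_pfn;
    out[0].highest = max_zone_pfn[0];

    for (i = 1; i < SK_NR_ZONES; i++) {
        if (i == SK_MOVABLE)
            continue;
        out[i].lowest = out[i - 1].highest;
        out[i].highest = max_zone_pfn[i] > out[i].lowest
                             ? max_zone_pfn[i]
                             : out[i].lowest;
    }

    /* ZONE_MOVABLE gets no architecture-defined range at this stage. */
    out[SK_MOVABLE].lowest = 0;
    out[SK_MOVABLE].highest = 0;
}
```

Called with max_zone_pfn = { 4096, 229376, 262144, 0 } and min_pfn = 0, the sketch yields the ranges 0 to 4096, 4096 to 229376 and 229376 to 262144 for the first three zones. On a machine whose memory ends below the ZONE_NORMAL limit, the ZONE_HIGHMEM range comes out empty because its highest value is clamped to its lowest.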
 
free_area_init_node() is a fairly complex function. Its code is as follows:

void __paginginit free_area_init_node(int nid, struct pglist_data *pgdat,

                   unsigned long *zones_size, unsigned long node_start_pfn,

                   unsigned long *zholes_size)
{
         pgdat->node_id = nid;

         pgdat->node_start_pfn = node_start_pfn;

         // compute the node's total (spanned) page count and its present page count

         calculate_node_totalpages(pgdat, zones_size, zholes_size);

 
         alloc_node_mem_map(pgdat);
 

         free_area_init_core(pgdat, zones_size, zholes_size);

}
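calculate_node_totalpages() computes the number of pages the node spans (holes included) and the number of pages actually present. The results are stored in pgdat->node_spanned_pages and pgdat->node_present_pages respectively.
 
alloc_node_mem_map(pgdat) allocates the page descriptors (the struct page array) for the pages in the node.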

static void __init_refok alloc_node_mem_map(struct pglist_data *pgdat)

{
         /* Skip empty nodes */
         if (!pgdat->node_spanned_pages)
                   return;
 

#ifdef CONFIG_FLAT_NODE_MEM_MAP

         /* ia64 gets its own node_mem_map, before this, without bootmem */

         if (!pgdat->node_mem_map) {

                   unsigned long size, start, end;

                   struct page *map;
 
                   /*

                    * The zone's endpoints aren't required to be MAX_ORDER

                    * aligned but the node_mem_map endpoints must be in order

                    * for the buddy allocator to function correctly.

                    */

                   start = pgdat->node_start_pfn & ~(MAX_ORDER_NR_PAGES - 1);

                   end = pgdat->node_start_pfn + pgdat->node_spanned_pages;

                   end = ALIGN(end, MAX_ORDER_NR_PAGES);

                   size =  (end - start) * sizeof(struct page);

                   map = alloc_remap(pgdat->node_id, size);

                   if (!map)

                            map = alloc_bootmem_node(pgdat, size);

 
                   // the first struct page that belongs to this pgdat

                   pgdat->node_mem_map = map + (pgdat->node_start_pfn - start);

         }

#ifndef CONFIG_NEED_MULTIPLE_NODES

         /*

          * With no DISCONTIG, the global mem_map is just set as node 0's

          */
         if (pgdat == NODE_DATA(0)) {

                   mem_map = NODE_DATA(0)->node_mem_map;

#ifdef CONFIG_ARCH_POPULATES_NODE_MAP

                   if (page_to_pfn(mem_map) != pgdat->node_start_pfn)

                            mem_map -= (pgdat->node_start_pfn - ARCH_PFN_OFFSET);

#endif /* CONFIG_ARCH_POPULATES_NODE_MAP */

         }
#endif

#endif /* CONFIG_FLAT_NODE_MEM_MAP */

}
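It allocates a struct page array sized according to the number of pages the node spans. For the first node (node 0), the array is also assigned to mem_map. This is where mem_map comes from.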
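
The size calculation in alloc_node_mem_map() can be tried out in user space. The sketch below reproduces only the rounding arithmetic. The helper name mem_map_entries() is invented here, and SK_MAX_ORDER_NR_PAGES = 1024 is an assumption that corresponds to MAX_ORDER = 11; the kernel's actual value depends on the configuration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for this sketch: 1 << (MAX_ORDER - 1) with MAX_ORDER = 11. */
#define SK_MAX_ORDER_NR_PAGES 1024UL

/*
 * Returns how many struct page entries the node's array must hold once
 * both ends are aligned to SK_MAX_ORDER_NR_PAGES, and stores in
 * *first_index the offset of the node's first real page in that array
 * (the "map + (node_start_pfn - start)" adjustment in the kernel code).
 */
unsigned long mem_map_entries(unsigned long node_start_pfn,
                              unsigned long spanned_pages,
                              unsigned long *first_index)
{
    unsigned long start, end;

    assert(first_index != NULL);

    /* Round the start down to a MAX_ORDER boundary. */
    start = node_start_pfn & ~(SK_MAX_ORDER_NR_PAGES - 1);

    /* Round the end up to a MAX_ORDER boundary. */
    end = node_start_pfn + spanned_pages;
    end = (end + SK_MAX_ORDER_NR_PAGES - 1) & ~(SK_MAX_ORDER_NR_PAGES - 1);

    *first_index = node_start_pfn - start;
    return end - start;
}
```

For a node that starts at PFN 1500 and spans 3000 pages, the aligned range is 1024 to 5120, so the array needs 4096 entries and the node's first page sits at index 476. The kernel multiplies the entry count by sizeof(struct page) to obtain the allocation size.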
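Next, free_area_init_core() is called for further initialization. It performs a series of initializations on the zones in the node; here we focus on how the struct page of each page frame is initialized.
free_area_init_core() -> init_currently_empty_zone() -> memmap_init():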

#define memmap_init(size, nid, zone, start_pfn) \
         memmap_init_zone((size), (nid), (zone), (start_pfn), MEMMAP_EARLY)

void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,

                   unsigned long start_pfn, enum memmap_context context)

{
         struct page *page;

         unsigned long end_pfn = start_pfn + size;

         unsigned long pfn;
 

         for (pfn = start_pfn; pfn < end_pfn; pfn++) {

                   /*
                    * There can be holes in boot-time mem_map[]s
                    * handed to this function.  They do not
                    * exist on hotplugged memory.
                    */

                   if (context == MEMMAP_EARLY) {

                            if (!early_pfn_valid(pfn))

                                     continue;

                            if (!early_pfn_in_nid(pfn, nid))

                                     continue;
                   }
                   page = pfn_to_page(pfn);

                   set_page_links(page, zone, nid, pfn);

                   init_page_count(page);
                   reset_page_mapcount(page);
                   SetPageReserved(page);
 
                   /*

                    * Mark the block movable so that blocks are reserved for

                    * movable at startup. This will force kernel allocations

                    * to reserve their blocks rather than leaking throughout

                    * the address space during boot when many long-lived

                    * kernel allocations are made. Later some blocks near

                    * the start are marked MIGRATE_RESERVE by
                    * setup_zone_migrate_reserve()
                    */

                   if ((pfn & (pageblock_nr_pages-1)))

                            set_pageblock_migratetype(page, MIGRATE_MOVABLE);

 
                   INIT_LIST_HEAD(&page->lru);

#ifdef WANT_PAGE_VIRTUAL

                   /* The shift won't overflow because ZONE_NORMAL is below 4G. */

                   if (!is_highmem_idx(zone))

                            set_page_address(page, __va(pfn << PAGE_SHIFT));

#endif
         }
}
 
As we can see, every page goes through the following initialization:

page = pfn_to_page(pfn);

set_page_links(page, zone, nid, pfn);

init_page_count(page);
reset_page_mapcount(page);
SetPageReserved(page);
 
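pfn_to_page() converts a page frame number into the corresponding struct page.
set_page_links() records the node, zone and section the page belongs to.
init_page_count()/reset_page_mapcount() initialize the page's reference counters.
SetPageReserved() marks the page as reserved.
Incidentally, init_currently_empty_zone() also calls zone_init_free_lists() to initialize the zone's free_area lists.
By this point, the page structures have been set up and each zone's buddy system has been initialized. However, every page is still marked reserved, and the buddy system's free_area lists do not hold any pages yet. Let us continue with the rest of the initialization.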
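
To make the per-page initialization concrete, here is a small user-space model of it. It is only a sketch: struct sk_page and sk_memmap_init() are invented names, the flat "mem_map + (pfn - offset)" lookup assumes the flat memory model, and the values 1 and -1 are what init_page_count() and reset_page_mapcount() are understood to store in this kernel version.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct page; it keeps only the fields discussed above. */
struct sk_page {
    int count;           /* set to 1, as init_page_count() does */
    int mapcount;        /* set to -1, as reset_page_mapcount() does */
    unsigned reserved;   /* 1 models SetPageReserved() */
};

/*
 * Initializes the descriptors for PFNs [start_pfn, start_pfn + size).
 * pfn_offset plays the role of ARCH_PFN_OFFSET in a flat mem_map.
 */
void sk_memmap_init(struct sk_page *mem_map, unsigned long pfn_offset,
                    unsigned long start_pfn, unsigned long size)
{
    unsigned long pfn;

    assert(mem_map != NULL && start_pfn >= pfn_offset);

    for (pfn = start_pfn; pfn < start_pfn + size; pfn++) {
        /* Flat-model pfn_to_page(): index into the descriptor array. */
        struct sk_page *page = mem_map + (pfn - pfn_offset);

        page->count = 1;
        page->mapcount = -1;
        page->reserved = 1;   /* stays reserved until released to the buddy system */
    }
}
```

After the call, every descriptor in the range is marked reserved, which matches the state described above: the pages exist and are described, but none of them is available for allocation yet.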
 
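After the hard work done by setup_arch(), memory management has begun to take shape. It is not finished yet, though, and the more important part is still to come. Continue with start_kernel() -> build_all_zonelists():
This function initializes each node's zonelists. As we saw in the analysis of the buddy system, allocation requests fall back across zones in a fixed order. For example, a request for memory from ZONE_HIGHMEM falls back to ZONE_NORMAL when ZONE_HIGHMEM has no free memory, and then to ZONE_DMA if ZONE_NORMAL has none either. This order is controlled by the zonelist.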

void build_all_zonelists(void)

{
         ……
         __build_all_zonelists(NULL);
         ……
}
Reading further down:

__build_all_zonelists() -> build_zonelists():
static void build_zonelists(pg_data_t *pgdat)

{
         ……
         for (i = 0; i < MAX_NR_ZONES; i++) {
                   struct zonelist *zonelist;

                   zonelist = pgdat->node_zonelists + i;

                   j = build_zonelists_node(pgdat, zonelist, 0, i);
                   ……
         }
         ……
}
build_zonelists_node():

static int build_zonelists_node(pg_data_t *pgdat, struct zonelist *zonelist,

                                     int nr_zones, enum zone_type zone_type)

{
         struct zone *zone;
 
         BUG_ON(zone_type >= MAX_NR_ZONES);
         zone_type++;
 
         do {
                   zone_type--;

                   zone = pgdat->node_zones + zone_type;

                   if (populated_zone(zone)) {

                            zonelist->zones[nr_zones++] = zone;

                            check_highest_zone(zone_type);
                   }
 
         } while (zone_type);
         return nr_zones;
}
As shown above, one zonelist is built in each node for every zone type. The zonelist records the order in which zones are tried when pages are allocated.
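
The downward walk in build_zonelists_node() can be modelled in a few lines of user-space C. This is a sketch under stated assumptions: sk_build_fallback() is an invented name, the three-zone layout (0 = DMA, 1 = NORMAL, 2 = HIGHMEM) is assumed, and a plain present-page count stands in for populated_zone().

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout for this sketch: 0 = DMA, 1 = NORMAL, 2 = HIGHMEM. */
#define SK_ZONES 3

/*
 * Fills order[] with the zone indices to try, starting at zone_type and
 * walking down to zone 0, skipping zones that have no pages.
 * Returns the number of entries written.
 */
int sk_build_fallback(const unsigned long *present_pages, int zone_type,
                      int *order)
{
    int n = 0;

    assert(present_pages != NULL && order != NULL);
    assert(zone_type >= 0 && zone_type < SK_ZONES);

    zone_type++;
    do {
        zone_type--;
        if (present_pages[zone_type] > 0)   /* models populated_zone() */
            order[n++] = zone_type;
    } while (zone_type);

    return n;
}
```

With all three zones populated, a request type of 2 produces the order 2, 1, 0, which is the HIGHMEM, NORMAL, DMA fallback described earlier. If the HIGHMEM zone is empty, the same request produces 1, 0.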
 
At this point the buddy system is fully initialized and is only waiting to be filled with free pages. That happens in mem_init():

void __init mem_init(void)

{
         ……
         totalram_pages += free_all_bootmem();
         ……
         set_highmem_pages_init(bad_ppro);
         ……
}
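free_all_bootmem(): releases all free pages held by bootmem to the buddy system, and also frees the memory occupied by the bootmem bitmap itself.

set_highmem_pages_init(bad_ppro): releases the high-memory pages to the buddy system. This is done separately because the pages managed by bootmem are all pages that the kernel can address directly.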
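
The hand-over from bootmem to the buddy system is driven by the bootmem bitmap, in which a set bit marks a reserved page and a clear bit marks a free one. The sketch below only counts the pages that would be released. sk_count_free_pages() is an invented helper, and the real free_all_bootmem() additionally clears the reserved flag on each page and frees pages in larger batches where it can.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Walks a bootmem-style bitmap covering nr_pages pages and returns how
 * many pages are free (bit clear). Each such page is one the kernel
 * would hand to the buddy allocator.
 */
unsigned long sk_count_free_pages(const unsigned long *bitmap,
                                  unsigned long nr_pages)
{
    unsigned long pfn, freed = 0;

    assert(bitmap != NULL);

    for (pfn = 0; pfn < nr_pages; pfn++) {
        unsigned long word = bitmap[pfn / SK_BITS_PER_LONG];

        if (!((word >> (pfn % SK_BITS_PER_LONG)) & 1UL))
            freed++;
    }
    return freed;
}
```

The return value corresponds to the count that mem_init() adds to totalram_pages in the code above.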

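At this point bootmem has served its purpose and must not be used any more, while the buddy system is now ready for use. The last step of memory initialization is to initialize the slab allocator.
This is done in kmem_cache_init(), which initializes cache_cache and several general-purpose caches.
 
OK. With that, memory initialization is complete. ^_^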



 
