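This article walks through the second stage of the ARM Linux boot process: the assembly code that runs before start_kernel, in kernel version 3.12.35. Using my Raspberry Pi B (ARM11) as the example platform, it analyzes the main initialization work the kernel performs between self-decompression and the jump to start_kernel: validating the boot parameters, creating the initial page tables, and initializing the MMU.
Files analyzed: arch/arm/kernel/head.S, arch/arm/kernel/head-common.S, arch/arm/mm/proc-v6.S
Board: Raspberry Pi B
Once self-decompression finishes, execution jumps to the address the kernel was decompressed to, which is 0x00008000 in my environment; the kernel proper then starts and performs its initialization.
(Figure: overall flow of the assembly part of kernel startup.)
Kernel entry point: see arch/arm/kernel/vmlinux.lds (generated from arch/arm/kernel/vmlinux.lds.S).
arch/arm/kernel/vmlinux.lds: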
ENTRY(stext)
jiffies = jiffies_64;
SECTIONS
{
......
. = 0xC0000000 + 0x00008000;
.head.text : {
_text = .;
*(.head.text)
}
.text : { /* Real text segment */
_stext = .; /* Text and read-only data */
The value 0x00008000 here is TEXT_OFFSET, the offset of the kernel image from the start of RAM. It is defined in arch/arm/Makefile:
textofs-y := 0x00008000
......
# The byte offset of the kernel image in RAM from the start of RAM.
TEXT_OFFSET := $(textofs-y)
PAGE_OFFSET is the start address of the kernel's virtual address space. It is defined in arch/arm/include/asm/memory.h:
#ifdef CONFIG_MMU
/*
* PAGE_OFFSET - the virtual address of the start of the kernel image
* TASK_SIZE - the maximum size of a user space task.
* TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area
*/
#define PAGE_OFFSET UL(CONFIG_PAGE_OFFSET)
CONFIG_PAGE_OFFSET is defined in arch/arm/Kconfig; the default value 0xC0000000 is used here:
config PAGE_OFFSET
hex
default 0x40000000 if VMSPLIT_1G
default 0x80000000 if VMSPLIT_2G
default 0xC0000000
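So the kernel is linked at a virtual address, 0xC0008000.
The kernel entry code lives in linux/arch/arm/kernel/head.S. head.S defines several important symbols, which are worth looking at before analyzing the code: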
/*
* swapper_pg_dir is the virtual address of the initial page table.
* We place the page tables 16K below KERNEL_RAM_VADDR. Therefore, we must
* make sure that KERNEL_RAM_VADDR is correctly set. Currently, we expect
* the least significant 16 bits to be 0x8000, but we could probably
* relax this restriction to KERNEL_RAM_VADDR >= PAGE_OFFSET + 0x4000.
*/
#define KERNEL_RAM_VADDR (PAGE_OFFSET + TEXT_OFFSET)
#if (KERNEL_RAM_VADDR & 0xffff) != 0x8000
#error KERNEL_RAM_VADDR must start at 0xXXXX8000
#endif
#ifdef CONFIG_ARM_LPAE
/* LPAE requires an additional page for the PGD */
#define PG_DIR_SIZE 0x5000
#define PMD_ORDER 3
#else
#define PG_DIR_SIZE 0x4000
#define PMD_ORDER 2
#endif
.globl swapper_pg_dir
.equ swapper_pg_dir, KERNEL_RAM_VADDR - PG_DIR_SIZE
.macro pgtbl, rd, phys
add \rd, \phys, #TEXT_OFFSET - PG_DIR_SIZE
.endm
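KERNEL_RAM_VADDR is the virtual address at which the kernel starts, i.e. the link address 0xC0008000 seen above. The kernel also requires the low 16 bits of this address to be 0x8000.
Since ARM LPAE is not configured, the initial page table uses a single-level (section) mapping: the table is 16KB in size and each entry maps a 1MB section.
Finally, swapper_pg_dir is the start address of the initial page table. Its value is the kernel's starting virtual address minus the page table size, 0xC0008000 - 0x4000 = 0xC0004000 (the 16KB directly below the kernel image hold the page table). (Figure: virtual address space layout.)
One point needs explaining: in my environment the kernel is decompressed to 0x00008000, while the entry point is linked at the virtual address 0xC0008000. The two addresses differ, and because the MMU is not yet enabled no virtual-to-physical translation can take place, so the code that runs before the MMU is turned on has to be position independent.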
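The address arithmetic above can be sketched in C. This is only an illustrative model: the function names are made up for this example, and the constants are the values quoted from the Makefile, Kconfig and head.S for a non-LPAE build.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u /* CONFIG_PAGE_OFFSET default */
#define TEXT_OFFSET 0x00008000u /* textofs-y in arch/arm/Makefile */
#define PG_DIR_SIZE 0x00004000u /* non-LPAE page directory size */

/* Virtual address the kernel image is linked at (KERNEL_RAM_VADDR). */
uint32_t kernel_ram_vaddr(void)
{
    uint32_t vaddr = PAGE_OFFSET + TEXT_OFFSET;
    /* Same restriction as the #error check in head.S. */
    assert((vaddr & 0xffffu) == 0x8000u);
    return vaddr;
}

/* Virtual address of the initial page table (swapper_pg_dir). */
uint32_t swapper_pg_dir_vaddr(void)
{
    return kernel_ram_vaddr() - PG_DIR_SIZE;
}

/* Physical address of the page table, as the pgtbl macro computes it. */
uint32_t pgtbl_phys(uint32_t phys_offset)
{
    return phys_offset + TEXT_OFFSET - PG_DIR_SIZE;
}
```

With RAM starting at physical address 0, pgtbl_phys(0) yields 0x00004000, the physical counterpart of the virtual address 0xC0004000.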
With the entry point known, let's look at the state of the hardware and registers at this moment:
/*
* Kernel startup entry point.
* ---------------------------
*
* This is normally called from the decompressor code. The requirements
* are: MMU = off, D-cache = off, I-cache = dont care, r0 = 0,
* r1 = machine nr, r2 = atags or dtb pointer.
*
* This code is mostly position independent, so if you link the kernel at
* 0xc0008000, you call this at __pa(0xc0008000).
*
* See linux/arch/arm/tools/mach-types for the complete list of machine
* numbers for r1.
*
* We're trying to keep crap to a minimum; DO NOT add any machine specific
* crap here - that's what the boot loader (or in extreme, well justified
* circumstances, zImage) is for.
*/
.arm
__HEAD
ENTRY(stext)
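The comment states the entry conditions: MMU off, D-cache off, r0 = 0, r1 = machine number, r2 = address of the boot parameters (ATAGs or DTB; my environment uses ATAGs). The machine numbers the kernel supports are listed in linux/arch/arm/tools/mach-types. The entry for my Raspberry Pi is: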
bcm2708 MACH_BCM2708 BCM2708 3138
Now let's go through the code line by line:
THUMB( adr r9, BSYM(1f) ) @ Kernel is always entered in ARM.
THUMB( bx r9 ) @ If this is a Thumb-2 kernel,
THUMB( .thumb ) @ switch to Thumb now.
THUMB(1: )
#ifdef CONFIG_ARM_VIRT_EXT
bl __hyp_stub_install
#endif
@ ensure svc mode and all interrupts masked
safe_svcmode_maskall r9
mrc p15, 0, r9, c0, c0 @ get processor id
bl __lookup_processor_type @ r5=procinfo r9=cpuid
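safe_svcmode_maskall is a macro defined in arch/arm/include/asm/assembler.h. It makes sure the CPU is in SVC mode with all interrupts masked (interrupts are masked because the exception vector table has not been set up yet, so the kernel cannot service them).
The processor ID is then read from CP15 into r9, and the code calls __lookup_processor_type to find the proc_info entry matching that ID. __lookup_processor_type is defined in arch/arm/kernel/head-common.S: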
/*
* Read processor ID register (CP#15, CR0), and look up in the linker-built
* supported processor list. Note that we can't use the absolute addresses
* for the __proc_info lists since we aren't running with the MMU on
* (and therefore, we are not in the correct address space). We have to
* calculate the offset.
*
* r9 = cpuid
* Returns:
* r3, r4, r6 corrupted
* r5 = proc_info pointer in physical address space
* r9 = cpuid (preserved)
*/
__lookup_processor_type:
adr r3, __lookup_processor_type_data
ldmia r3, {r4 - r6}
sub r3, r3, r4 @ get offset between virt&phys
add r5, r5, r3 @ convert virt addresses to
add r6, r6, r3 @ physical address space
1: ldmia r5, {r3, r4} @ value, mask
and r4, r4, r9 @ mask wanted bits
teq r3, r4
beq 2f
add r5, r5, #PROC_INFO_SZ @ sizeof(proc_info_list)
cmp r5, r6
blo 1b
mov r5, #0 @ unknown processor
2: mov pc, lr
ENDPROC(__lookup_processor_type)
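First, the run-time address of the __lookup_processor_type_data table is obtained with adr and saved in r3. The kernel keeps the information for every supported processor in a table of proc_info_list structures, defined as follows (asm/procinfo.h):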
/*
* Note! struct processor is always defined if we're
* using MULTI_CPU, otherwise this entry is unused,
* but still exists.
*
* NOTE! The following structure is defined by assembly
* language, NOT C code. For more information, check:
* arch/arm/mm/proc-*.S and arch/arm/kernel/head.S
*/
struct proc_info_list {
unsigned int cpu_val;
unsigned int cpu_mask;
unsigned long __cpu_mm_mmu_flags; /* used by head.S */
unsigned long __cpu_io_mmu_flags; /* used by head.S */
unsigned long __cpu_flush; /* used by head.S */
const char *arch_name;
const char *elf_name;
unsigned int elf_hwcap;
const char *cpu_name;
struct processor *proc;
struct cpu_tlb_fns *tlb;
struct cpu_user_fns *user;
struct cpu_cache_fns *cache;
};
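The structure describes the CPU. Its __cpu_mm_mmu_flags, __cpu_io_mmu_flags and __cpu_flush fields are used in head.S. The lookup data below records the bounds of the table, and the linker script collects every entry into the .init.proc.info section: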
/*
* Look in <asm/procinfo.h> for information about the __proc_info structure.
*/
.align 2
.type __lookup_processor_type_data, %object
__lookup_processor_type_data:
.long .
.long __proc_info_begin
.long __proc_info_end
.size __lookup_processor_type_data, . - __lookup_processor_type_data
vmlinux.lds:
.init.proc.info : {
. = ALIGN(4); __proc_info_begin = .; *(.proc.info.init) __proc_info_end = .;
}
The entry for each processor type is defined in arch/arm/mm/proc-*.S; for my environment it is in proc-v6.S:
.section ".proc.info.init", #alloc, #execinstr
/*
* Match any ARMv6 processor core.
*/
.type __v6_proc_info, #object
__v6_proc_info:
.long 0x0007b000
.long 0x0007f000
ALT_SMP(.long \
PMD_TYPE_SECT | \
PMD_SECT_AP_WRITE | \
PMD_SECT_AP_READ | \
PMD_FLAGS_SMP)
ALT_UP(.long \
PMD_TYPE_SECT | \
PMD_SECT_AP_WRITE | \
PMD_SECT_AP_READ | \
PMD_FLAGS_UP)
.long PMD_TYPE_SECT | \
PMD_SECT_XN | \
PMD_SECT_AP_WRITE | \
PMD_SECT_AP_READ
b __v6_setup
......
Back in __lookup_processor_type: the ldmia loads the link (virtual) addresses of __lookup_processor_type_data, __proc_info_begin and __proc_info_end into r4, r5 and r6. Then r3 = r3 - r4 gives the offset between the run-time and link addresses, and adding it to r5 and r6 converts them to the run-time addresses of __proc_info_begin and __proc_info_end.
Next, the cpu_val and cpu_mask fields are read from the current proc_info_list entry, and the processor ID in r9 is masked with cpu_mask and compared against cpu_val. On a match, r5 holds the run-time address of that processor's proc_info_list entry and the function returns. Otherwise r5 = r5 + PROC_INFO_SZ moves on to the next entry and the comparison is repeated. If no entry matches, r5 is set to 0.
movs r10, r5 @ invalid processor (r5=0)?
THUMB( it eq ) @ force fixup-able long branch encoding
beq __error_p @ yes, error 'p'
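Back in the caller, the return value is copied to r10 and tested for zero. Zero means no matching processor entry was found, in which case __error_p prints an error message (if CONFIG_DEBUG_LL is enabled) and enters an infinite loop, and the boot fails.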
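The matching loop in __lookup_processor_type can be modeled in a few lines of C. This is a sketch under assumptions: struct proc_entry is a hypothetical, trimmed-down stand-in for proc_info_list that keeps only the two fields the loop reads, and the function takes ordinary pointers instead of converting link addresses to run-time addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of one proc_info_list entry. */
struct proc_entry {
    uint32_t cpu_val;
    uint32_t cpu_mask;
};

/* Return the first entry with (cpuid & cpu_mask) == cpu_val, or NULL. */
const struct proc_entry *lookup_processor_type(const struct proc_entry *begin,
                                               const struct proc_entry *end,
                                               uint32_t cpuid)
{
    assert(begin <= end);
    for (const struct proc_entry *p = begin; p < end; p++) {
        if ((cpuid & p->cpu_mask) == p->cpu_val)
            return p;   /* corresponds to "beq 2f" */
    }
    return NULL;        /* corresponds to "mov r5, #0" */
}
```

A NULL result plays the role of r5 = 0, which the caller treats as an unknown processor.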
#ifdef CONFIG_ARM_LPAE
mrc p15, 0, r3, c0, c1, 4 @ read ID_MMFR0
and r3, r3, #0xf @ extract VMSA support
cmp r3, #5 @ long-descriptor translation table format?
THUMB( it lo ) @ force fixup-able long branch encoding
blo __error_p @ only classic page table format
#endif
ARM_LPAE is the Large Physical Address Extension. It is not enabled in my configuration, so it is ignored here.
#ifndef CONFIG_XIP_KERNEL
adr r3, 2f
ldmia r3, {r4, r8}
sub r4, r3, r4 @ (PHYS_OFFSET - PAGE_OFFSET)
add r8, r8, r4 @ PHYS_OFFSET
#else
ldr r8, =PHYS_OFFSET @ always constant in this case
#endif
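This computes the physical start address of RAM and saves it in r8, using the same technique as the proc_info lookup above. The code takes the run-time address and the link address of label 2 (0x00008070 and 0xC0008070 in my environment, according to the disassembly) and subtracts the link address from the run-time address. The result is PHYS_OFFSET - PAGE_OFFSET, which is 0x40000000 in 32-bit arithmetic (that is, -0xC0000000). Adding this difference to PAGE_OFFSET (0xC0000000) gives the physical start address of RAM, PHYS_OFFSET (0x00000000).
The disassembly makes this clearer: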
c0008040: e28f3028 add r3, pc, #40 ; 0x28
c0008044: e8930110 ldm r3, {r4, r8}
c0008048: e0434004 sub r4, r3, r4
c000804c: e0888004 add r8, r8, r4
......
c0008070: c0008070 andgt r8, r0, r0, ror r0
c0008074: c0000000 andgt r0, r0, r0
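Here r3 = 0x00008040 + 0x8 + 0x28 = 0x00008070, r4 = 0xC0008070 and r8 = 0xC0000000. After the offset is applied, r8 becomes 0x00000000, which is PHYS_OFFSET, the physical address at which RAM starts. (One open question: if the kernel were not running at 0x00008000, wouldn't the computed RAM start address be wrong? The calculation does rely on the kernel being loaded TEXT_OFFSET bytes above the start of RAM; the value it produces is simply the load address minus TEXT_OFFSET.)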
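The same computation can be written as a small C function, which makes the 32-bit wrap-around explicit. The function and parameter names are invented for this sketch; the three inputs correspond to r3 after the adr and to the two words loaded by ldmia.

```c
#include <assert.h>
#include <stdint.h>

/*
 * run_addr:    address of label 2 as seen by the running code (adr r3, 2f)
 * link_addr:   first word stored at label 2 (the label's link address)
 * page_offset: second word stored at label 2 (PAGE_OFFSET)
 */
uint32_t compute_phys_offset(uint32_t run_addr, uint32_t link_addr,
                             uint32_t page_offset)
{
    /* Both values describe the same word, so their low bits must agree. */
    assert(((run_addr ^ link_addr) & 0x3u) == 0);

    /* PHYS_OFFSET - PAGE_OFFSET, computed modulo 2^32. */
    uint32_t delta = run_addr - link_addr;

    /* PAGE_OFFSET + delta wraps around to PHYS_OFFSET. */
    return page_offset + delta;
}
```

Calling it with 0x00008070, 0xC0008070 and 0xC0000000 reproduces the result for this board. A kernel loaded at 0x80008000 on a hypothetical machine whose RAM starts at 0x80000000 would get 0x80000000 from the same code.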
/*
* r1 = machine no, r2 = atags or dtb,
* r8 = phys_offset, r9 = cpuid, r10 = procinfo
*/
bl __vet_atags
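Let's recap what the registers hold at this point:
r1: machine number
r2: address of the ATAGs or DTB
r8: physical start address of RAM (phys_offset)
r9: CPU ID
r10: address of the processor's proc_info structure
__vet_atags is then called to validate the address in r2: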
/* Determine validity of the r2 atags pointer. The heuristic requires
* that the pointer be aligned, in the first 16k of physical RAM and
* that the ATAG_CORE marker is first and present. If CONFIG_OF_FLATTREE
* is selected, then it will also accept a dtb pointer. Future revisions
* of this function may be more lenient with the physical address and
* may also be able to move the ATAGS block if necessary.
*
* Returns:
* r2 either valid atags pointer, valid dtb pointer, or zero
* r5, r6 corrupted
*/
__vet_atags:
tst r2, #0x3 @ aligned?
bne 1f
ldr r5, [r2, #0]
#ifdef CONFIG_OF_FLATTREE
ldr r6, =OF_DT_MAGIC @ is it a DTB?
cmp r5, r6
beq 2f
#endif
cmp r5, #ATAG_CORE_SIZE @ is first tag ATAG_CORE?
cmpne r5, #ATAG_CORE_SIZE_EMPTY
bne 1f
ldr r5, [r2, #4]
ldr r6, =ATAG_CORE
cmp r5, r6
bne 1f
2: mov pc, lr @ atag/dtb pointer is ok
1: mov r2, #0
mov pc, lr
ENDPROC(__vet_atags)
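First the address is checked for 4-byte alignment; if it is not aligned, r2 is cleared and the function returns. Then the word at the address in r2 is loaded into r5. If CONFIG_OF_FLATTREE is enabled, the word is compared against the DTB magic number; that option is not set in my configuration.
Next comes the ATAG check. For an ATAG list, the word at r2 is the size field of a tag_header (arch/arm/include/uapi/asm/setup.h), and the kernel requires the first tag in the list to be of type ATAG_CORE: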
struct tag_header {
__u32 size;
__u32 tag;
};
......
struct tag_core {
__u32 flags; /* bit 0 = read-only */
__u32 pagesize;
__u32 rootdev;
};
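The size of the CORE tag is (sizeof(struct tag_header) + sizeof(struct tag_core)) >> 2, i.e. 5 words, which is exactly ATAG_CORE_SIZE:
#define ATAG_CORE_SIZE ((2*4 + 3*4) >> 2)
After the size check (an empty CORE tag of size ATAG_CORE_SIZE_EMPTY is also accepted), the tag field is read from offset 4 and compared with ATAG_CORE. If it matches, validation succeeds and the code branches to label 2 and returns.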
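The ATAG branch of __vet_atags can be sketched in C as follows. The DTB check is left out because it is not configured here, and the function name and the NULL precondition are choices made for this illustration (the real code would simply read from physical address 0). The ATAG_CORE value 0x54410001 and ATAG_CORE_SIZE_EMPTY are taken from the kernel's setup.h as I recall them, so treat them as assumptions to verify against your tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ATAG_CORE            0x54410001u
#define ATAG_CORE_SIZE       ((2 * 4 + 3 * 4) >> 2) /* header + tag_core, in words */
#define ATAG_CORE_SIZE_EMPTY ((2 * 4) >> 2)         /* header only, in words */

/* Return the list pointer if it passes the checks, otherwise NULL (r2 = 0). */
const uint32_t *vet_atags(const void *ptr)
{
    /* The C model cannot safely read address 0, unlike the real code. */
    assert(ptr != NULL);

    if (((uintptr_t)ptr & 0x3u) != 0)   /* tst r2, #0x3 */
        return NULL;

    const uint32_t *tag = (const uint32_t *)ptr;
    uint32_t size = tag[0];             /* tag_header.size */
    if (size != ATAG_CORE_SIZE && size != ATAG_CORE_SIZE_EMPTY)
        return NULL;
    if (tag[1] != ATAG_CORE)            /* tag_header.tag */
        return NULL;
    return tag;
}
```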
#ifdef CONFIG_SMP_ON_UP
bl __fixup_smp
#endif
#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
bl __fixup_pv_table
#endif
bl __create_page_tables
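CONFIG_SMP_ON_UP and CONFIG_ARM_PATCH_PHYS_VIRT are not enabled in my configuration (their Kconfig descriptions are "Allow booting SMP kernel on uniprocessor systems" and "Patch physical to virtual translations at runtime" respectively), so the next step is the call to __create_page_tables, which creates the initial page tables.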
/*
* Setup the initial page tables. We only setup the barest
* amount which are required to get the kernel running, which
* generally means mapping in the kernel code.
*
* r8 = phys_offset, r9 = cpuid, r10 = procinfo
*
* Returns:
* r0, r3, r5-r7 corrupted
* r4 = page table (see ARCH_PGD_SHIFT in asm/memory.h)
*/
__create_page_tables:
pgtbl r4, r8 @ page table address
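As the comment says, only the bare minimum needed to get the kernel running is mapped, which essentially means the kernel code.
pgtbl r4, r8 puts the run-time (physical) address of the page table into r4. In the disassembly it becomes: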
c0008078 <__create_page_tables>:
c0008078: e2884901 add r4, r8, #16384 ; 0x4000
With r8 = 0 this gives r4 = 0x00004000, which is the physical start address of the page table.
/*
* Clear the swapper page table
*/
mov r0, r4
mov r3, #0
add r6, r0, #PG_DIR_SIZE
1: str r3, [r0], #4
str r3, [r0], #4
str r3, [r0], #4
str r3, [r0], #4
teq r0, r6
bne 1b
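The page table memory is then zeroed: everything from 0x00004000 up to (but not including) 0x00008000 is cleared, four words per loop iteration.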
#ifdef CONFIG_ARM_LPAE
/*
* Build the PGD table (first level) to point to the PMD table. A PGD
* entry is 64-bit wide.
*/
mov r0, r4
add r3, r4, #0x1000 @ first PMD table address
orr r3, r3, #3 @ PGD block type
mov r6, #4 @ PTRS_PER_PGD
mov r7, #1 << (55 - 32) @ L_PGD_SWAPPER
1:
#ifdef CONFIG_CPU_ENDIAN_BE8
str r7, [r0], #4 @ set top PGD entry bits
str r3, [r0], #4 @ set bottom PGD entry bits
#else
str r3, [r0], #4 @ set bottom PGD entry bits
str r7, [r0], #4 @ set top PGD entry bits
#endif
add r3, r3, #0x1000 @ next PMD table
subs r6, r6, #1
bne 1b
add r4, r4, #0x1000 @ point to the PMD tables
#ifdef CONFIG_CPU_ENDIAN_BE8
add r4, r4, #4 @ we only write the bottom word
#endif
#endif
ARM_LPAE is not configured, so this part is skipped for now. Moving on:
ldr r7, [r10, #PROCINFO_MM_MMUFLAGS] @ mm_mmuflags
/*
* Create identity mapping to cater for __enable_mmu.
* This identity mapping will be removed by paging_init().
*/
adr r0, __turn_mmu_on_loc
ldmia r0, {r3, r5, r6}
sub r0, r0, r3 @ virt->phys offset
add r5, r5, r0 @ phys __turn_mmu_on
add r6, r6, r0 @ phys __turn_mmu_on_end
mov r5, r5, lsr #SECTION_SHIFT
mov r6, r6, lsr #SECTION_SHIFT
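This creates a special (identity) mapping needed for turning on the MMU; it is removed later, when paging_init() runs during kernel initialization.
First the __cpu_mm_mmu_flags value is loaded from the processor's procinfo structure into r7. Then the run-time address of the label __turn_mmu_on_loc is placed in r0, and the same technique as before yields the physical addresses of __turn_mmu_on and __turn_mmu_on_end in r5 and r6.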
__turn_mmu_on_loc:
.long .
.long __turn_mmu_on
.long __turn_mmu_on_end
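Since LPAE is not enabled in my environment, a single-level mapping with 1MB sections is used, so SECTION_SHIFT is 20. Shifting r5 and r6 right by 20 bits gives the section numbers (physical address bits [31:20]) of __turn_mmu_on and __turn_mmu_on_end.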
1: orr r3, r7, r5, lsl #SECTION_SHIFT @ flags + kernel base
str r3, [r4, r5, lsl #PMD_ORDER] @ identity mapping
cmp r5, r6
addlo r5, r5, #1 @ next section
blo 1b
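r5 is shifted left by 20 bits and ORed with the flag bits in r7 to form the first-level descriptor stored in the page table (see "ARM Linux Boot Process Analysis: Kernel Self-Decompression Stage"), and that descriptor is written to the corresponding page table entry.
The region between __turn_mmu_on and __turn_mmu_on_end is mapped 1:1 (virtual address equals physical address), so the entry is located as follows:
entry offset within the page table = section number << PMD_ORDER (2)
For example, in my environment __turn_mmu_on is linked at 0xc0433398, so with PHYS_OFFSET = 0 its physical address is 0x00433398 and its section number is 0x004. The entry offset is 0x004 << 2 = 0x10, so the descriptor is written at 0x00004000 + 0x10 = 0x00004010, and the base address field of that descriptor is also 0x004. A virtual address in this section therefore translates to the identical physical address (the translation procedure is described in "ARM Linux Boot Process Analysis: Kernel Self-Decompression Stage").
The loop repeats until all of the __turn_mmu_on code is mapped. (Figure: address space after the identity mapping.)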
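As a rough C model of the identity-mapping loop, the page table can be treated as an array of 4096 32-bit entries indexed by section number. The function name is invented for this sketch, and the flag value used by a caller is arbitrary; the real flags come from __cpu_mm_mmu_flags.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_SHIFT 20 /* 1MB sections */

/* Identity-map every 1MB section touched by [phys_start, phys_end]. */
void map_identity(uint32_t *pgd, uint32_t phys_start, uint32_t phys_end,
                  uint32_t mmuflags)
{
    assert(phys_start <= phys_end);

    uint32_t first = phys_start >> SECTION_SHIFT; /* r5 */
    uint32_t last  = phys_end >> SECTION_SHIFT;   /* r6 */

    /* Entry index == section number, so virtual == physical. */
    for (uint32_t s = first; s <= last; s++)
        pgd[s] = (s << SECTION_SHIFT) | mmuflags;
}
```

Indexing a uint32_t array by section number is equivalent to the byte offset section number << PMD_ORDER used by the assembly.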
/*
* Map our RAM from the start to the end of the kernel .bss section.
*/
add r0, r4, #PAGE_OFFSET >> (SECTION_SHIFT - PMD_ORDER)
ldr r6, =(_end - 1)
orr r3, r8, r7
add r6, r4, r6, lsr #(SECTION_SHIFT - PMD_ORDER)
1: str r3, [r0], #1 << PMD_ORDER
add r3, r3, #1 << SECTION_SHIFT
cmp r0, r6
bls 1b
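With the MMU-enabling code mapped, the kernel itself is mapped next.
First PAGE_OFFSET (0xC0000000) is shifted right by (20 - 2) bits and added to r4 (the physical base of the page table). This gives the physical address of the page table entry for the start of the kernel's virtual address range, 0x00004000 + 0x3000 = 0x00007000 here, which is saved in r0.
Then the virtual address of the end of the kernel image (including the .bss section) is loaded into r6. _end is defined in vmlinux.lds: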
_edata_loc = __data_loc + SIZEOF(.data);
. = ALIGN(0); __bss_start = .; . = ALIGN(0); .sbss : AT(ADDR(.sbss) - 0) { *(.sbss) *(.scommon) } . = ALIGN(0); .bss : AT(ADDR(.bss) - 0) { *(.bss..page_aligned) *(.dynbss) *(.bss) *(COMMON) } . = ALIGN(0); __bss_stop = .;
_end = .;
.stab 0 : { *(.stab) } .stabstr 0 : { *(.stabstr) } .stab.excl 0 : { *(.stab.excl) } .stab.exclstr 0 : { *(.stab.exclstr) } .stab.index 0 : { *(.stab.index) } .stab.indexstr 0 : { *(.stab.indexstr) } .comment 0 : { *(.comment) }
.comment 0 : { *(.comment) }
}
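In my environment its value is 0xc116c517.
Next r7 is ORed with r8 to form the first-level descriptor in r3 (its upper 12 bits are zero because PHYS_OFFSET is 0), and the physical address of the page table entry for the kernel's end virtual address is computed into r6. The loop then fills every entry from r0 through r6 with a first-level descriptor, adding 1 << SECTION_SHIFT (the 1MB base address increment) to r3 on each iteration. (Figure: memory mapping after the kernel is mapped.)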
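The kernel mapping loop can be sketched the same way. Again this is an illustrative model with invented names; end_vaddr stands for _end - 1, and the flag value a caller passes is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_SHIFT 20 /* 1MB sections */

/* Map [page_offset, end_vaddr] linearly onto RAM starting at phys_offset. */
void map_kernel(uint32_t *pgd, uint32_t page_offset, uint32_t phys_offset,
                uint32_t end_vaddr, uint32_t mmuflags)
{
    assert(end_vaddr >= page_offset);

    uint32_t idx  = page_offset >> SECTION_SHIFT; /* entry r0 points at */
    uint32_t last = end_vaddr >> SECTION_SHIFT;   /* entry r6 points at */
    uint32_t desc = phys_offset | mmuflags;       /* r3 */

    for (; idx <= last; idx++) {
        pgd[idx] = desc;
        desc += 1u << SECTION_SHIFT;              /* next 1MB of RAM */
    }
}
```

For the values in this article (end address 0xc116c517, PHYS_OFFSET 0) the loop fills entries 0xc00 through 0xc11, that is 18 sections.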
/*
* Then map boot params address in r2 if specified.
* We map 2 sections in case the ATAGs/DTB crosses a section boundary.
*/
mov r0, r2, lsr #SECTION_SHIFT
movs r0, r0, lsl #SECTION_SHIFT
subne r3, r0, r8
addne r3, r3, #PAGE_OFFSET
addne r3, r4, r3, lsr #(SECTION_SHIFT - PMD_ORDER)
orrne r6, r7, r0
strne r6, [r3], #1 << PMD_ORDER
addne r6, r6, #1 << SECTION_SHIFT
strne r6, [r3]
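After the kernel, the boot parameters need to be mapped; their address is in r2. The earlier code has already validated this address, and if it was invalid r2 is now 0 and nothing is mapped.
The first two instructions round r2 down to a 1MB section boundary and test the result for zero; the mapping is performed only if it is non-zero.
The offset of that section from the start of physical RAM is computed into r3, and PAGE_OFFSET (0xc0000000) is added to obtain the virtual address it should be mapped to. The corresponding page table entry is then located, the first-level descriptor is built, and two consecutive entries are written to map 2 sections. In other words, regardless of how large the boot parameters are, only 2MB are mapped here.
mov pc, lr
Once the three regions are mapped, and after filtering out the conditional code that is not enabled in my configuration, the function simply returns; the initial page table is complete.
In my environment the boot parameters are at 0x00000100. That address lies in section 0, so the rounded-down value is zero and the conditional instructions are skipped; the ATAGs are reachable through the kernel mapping of the first 1MB of RAM. (Figure: resulting memory mapping.)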
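A C sketch of the boot-parameter mapping shows why an address inside the first 1MB is skipped. The function name, the return value and the bounds assertion are additions for this illustration and are not part of the kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_SHIFT 20 /* 1MB sections */

/* Map 2 sections around the boot parameters; returns sections mapped. */
int map_boot_params(uint32_t *pgd, uint32_t params_phys, uint32_t phys_offset,
                    uint32_t page_offset, uint32_t mmuflags)
{
    /* Round down to a section boundary (the lsr/lsl pair). */
    uint32_t base = (params_phys >> SECTION_SHIFT) << SECTION_SHIFT;
    if (base == 0)
        return 0;       /* movs set Z: all the ...ne instructions are skipped */

    uint32_t vaddr = base - phys_offset + page_offset;
    uint32_t idx = vaddr >> SECTION_SHIFT;
    assert(idx + 1 < 4096u);    /* the second entry must fit in the table */

    uint32_t desc = base | mmuflags;
    pgd[idx] = desc;
    pgd[idx + 1] = desc + (1u << SECTION_SHIFT);
    return 2;
}
```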
/*
* The following calls CPU specific code in a position independent
* manner. See arch/arm/mm/proc-*.S for details. r10 = base of
* xxx_proc_info structure selected by __lookup_processor_type
* above. On return, the CPU will be ready for the MMU to be
* turned on, and r0 will hold the CPU control register value.
*/
ldr r13, =__mmap_switched @ address to jump to after
@ mmu has been enabled
adr lr, BSYM(1f) @ return (PIC) address
mov r8, r4 @ set TTBR1 to swapper_pg_dir
ARM( add pc, r10, #PROCINFO_INITFUNC )
THUMB( add r12, r10, #PROCINFO_INITFUNC )
THUMB( mov pc, r12 )
1: b __enable_mmu
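First the link (virtual) address of __mmap_switched is saved in r13; it is the first virtual address to be jumped to once the MMU is on.
Then lr is set to the address of label 1 below, which holds b __enable_mmu, and the physical address of the page table in r4 is copied to r8. Finally the code jumps to the architecture-specific processor initialization function, which prepares for enabling the MMU. When that function finishes it returns to the address in lr, and the MMU is turned on.
PROCINFO_INITFUNC is defined as 16, so pc is set to r10 + 16, which is the address of the __cpu_flush field: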
struct proc_info_list {
unsigned int cpu_val;
unsigned int cpu_mask;
unsigned long __cpu_mm_mmu_flags; /* used by head.S */
unsigned long __cpu_io_mmu_flags; /* used by head.S */
unsigned long __cpu_flush; /* used by head.S */
const char *arch_name;
__v6_proc_info:
.long 0x0007b000
.long 0x0007f000
ALT_SMP(.long \
PMD_TYPE_SECT | \
PMD_SECT_AP_WRITE | \
PMD_SECT_AP_READ | \
PMD_FLAGS_SMP)
ALT_UP(.long \
PMD_TYPE_SECT | \
PMD_SECT_AP_WRITE | \
PMD_SECT_AP_READ | \
PMD_FLAGS_UP)
.long PMD_TYPE_SECT | \
PMD_SECT_XN | \
PMD_SECT_AP_WRITE | \
PMD_SECT_AP_READ
b __v6_setup
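The value 16 can be checked with offsetof on a 32-bit model of the structure. The struct below is a hypothetical stand-in that uses uint32_t for every field, since unsigned int and unsigned long are both 4 bytes on 32-bit ARM; the function name is invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit model of the first fields of struct proc_info_list. */
struct proc_info_head {
    uint32_t cpu_val;
    uint32_t cpu_mask;
    uint32_t cpu_mm_mmu_flags;
    uint32_t cpu_io_mmu_flags;
    uint32_t cpu_flush; /* holds the instruction "b __v6_setup" */
};

/* Address that "add pc, r10, #PROCINFO_INITFUNC" jumps to. */
uint32_t initfunc_address(uint32_t procinfo_phys)
{
    size_t off = offsetof(struct proc_info_head, cpu_flush);
    assert(off == 16);  /* PROCINFO_INITFUNC */
    return procinfo_phys + (uint32_t)off;
}
```

Because the slot contains a branch instruction rather than a function pointer, jumping to the field's address executes that branch directly, which keeps the call position independent.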
As seen earlier, in my environment proc-v6.S places the instruction b __v6_setup in the __cpu_flush slot, so __v6_setup is executed:
/*
* __v6_setup
*
* Initialise TLB, Caches, and MMU state ready to switch the MMU
* on. Return in r0 the new CP15 C1 control register setting.
*
* We automatically detect if we have a Harvard cache, and use the
* Harvard cache control instructions insead of the unified cache
* control instructions.
*
* This should be able to cover all ARMv6 cores.
*
* It is assumed that:
* - cache type register is implemented
*/
__v6_setup:
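__v6_setup mainly configures CPU registers and is not analyzed in detail here. It initializes the TLB, the caches and the state required before the MMU can be turned on (for example, loading the physical address of the page table into the TTB, the Translation Table Base register), and returns the new value for the CP15 c1 control register in r0.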
/*
* Setup common bits before finally enabling the MMU. Essentially
* this is just loading the page table pointer and domain access
* registers.
*
* r0 = cp#15 control register
* r1 = machine ID
* r2 = atags or dtb pointer
* r4 = page table (see ARCH_PGD_SHIFT in asm/memory.h)
* r9 = processor ID
* r13 = *virtual* address to jump to upon completion
*/
__enable_mmu:
#if defined(CONFIG_ALIGNMENT_TRAP) && __LINUX_ARM_ARCH__ < 6
orr r0, r0, #CR_A
#else
bic r0, r0, #CR_A
#endif
#ifdef CONFIG_CPU_DCACHE_DISABLE
bic r0, r0, #CR_C
#endif
#ifdef CONFIG_CPU_BPREDICT_DISABLE
bic r0, r0, #CR_Z
#endif
#ifdef CONFIG_CPU_ICACHE_DISABLE
bic r0, r0, #CR_I
#endif
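With the preparation above complete, the MMU is about to be enabled. Essentially this routine just loads the page table pointer and the domain access control register.
First the CP15 c1 control register value returned in r0 is adjusted according to the kernel configuration. The macros are defined in arch/arm/include/asm/cp15.h and describe each bit of the register; some of them are: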
#define CR_M (1 << 0) /* MMU enable */
#define CR_A (1 << 1) /* Alignment abort enable */
#define CR_C (1 << 2) /* Dcache enable */
#define CR_W (1 << 3) /* Write buffer enable */
#define CR_P (1 << 4) /* 32-bit exception handler */
#define CR_D (1 << 5) /* 32-bit data address range */
#define CR_L (1 << 6) /* Implementation defined */
#define CR_B (1 << 7) /* Big endian */
#define CR_S (1 << 8) /* System MMU protection */
#define CR_R (1 << 9) /* ROM MMU protection */
#define CR_F (1 << 10) /* Implementation defined */
#define CR_Z (1 << 11) /* Implementation defined */
#define CR_I (1 << 12) /* Icache enable */
#define CR_V (1 << 13) /* Vectors relocated to 0xffff0000 */
#define CR_RR (1 << 14) /* Round Robin cache replacement */
#define CR_L4 (1 << 15) /* LDR pc can set T bit */
#define CR_DT (1 << 16)
......
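For example, if the kernel is configured with CONFIG_CPU_DCACHE_DISABLE, the CR_C bit is cleared here to disable the D-cache, and likewise for the other options.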
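The bit manipulation done by the #ifdef blocks in __enable_mmu can be sketched in C. This is only an illustration: the struct boot_config and the function adjust_control are hypothetical names standing in for the CONFIG_* options and for the assembly sequence, and only the four bits touched by __enable_mmu are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions taken from the cp15.h excerpt quoted above. */
#define CR_A (1u << 1)   /* Alignment abort enable */
#define CR_C (1u << 2)   /* Dcache enable */
#define CR_Z (1u << 11)  /* Tested by CONFIG_CPU_BPREDICT_DISABLE */
#define CR_I (1u << 12)  /* Icache enable */

/* Hypothetical stand-in for the compile-time options tested by the #ifdefs. */
struct boot_config {
    int alignment_trap;    /* CONFIG_ALIGNMENT_TRAP */
    int arm_arch;          /* __LINUX_ARM_ARCH__ */
    int dcache_disable;    /* CONFIG_CPU_DCACHE_DISABLE */
    int bpredict_disable;  /* CONFIG_CPU_BPREDICT_DISABLE */
    int icache_disable;    /* CONFIG_CPU_ICACHE_DISABLE */
};

/* Mirrors the orr/bic sequence applied to r0 before the MMU is enabled. */
uint32_t adjust_control(uint32_t r0, const struct boot_config *cfg)
{
    assert(cfg != NULL);
    if (cfg->alignment_trap && cfg->arm_arch < 6)
        r0 |= CR_A;        /* orr r0, r0, #CR_A */
    else
        r0 &= ~CR_A;       /* bic r0, r0, #CR_A */
    if (cfg->dcache_disable)
        r0 &= ~CR_C;
    if (cfg->bpredict_disable)
        r0 &= ~CR_Z;
    if (cfg->icache_disable)
        r0 &= ~CR_I;
    return r0;
}
```

In the kernel these decisions are made at build time by the preprocessor; the run-time struct is used here only so that different configurations can be tried in one program.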
#ifndef CONFIG_ARM_LPAE
mov r5, #(domain_val(DOMAIN_USER, DOMAIN_MANAGER) | \
domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | \
domain_val(DOMAIN_IO, DOMAIN_CLIENT))
mcr p15, 0, r5, c3, c0, 0 @ load domain access register
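Next, the domain access control register is set. ARM processors use domains to manage memory access permissions: the virtual address space is divided into several domains, and each domain is assigned an access permission for protection and control.
According to the manual, bits [8:5] of an L1 page descriptor identify the domain it belongs to (4 bits can represent at most 16 domains). The domain access control register, CP15 c3, is a 32-bit register.
It holds 16 fields, D0 to D15, of two bits each, one per domain. A field value of 0b00 means no access, 0b01 means client (accesses are checked against the permission bits in the page table), 0b10 is reserved, and 0b11 means manager (accesses are not checked).
The domain_val macro is defined as follows (arch/arm/include/asm/domain.h):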
#define DOMAIN_KERNEL 0
#define DOMAIN_TABLE 0
#define DOMAIN_USER 1
#define DOMAIN_IO 2
......
#define DOMAIN_NOACCESS 0
#define DOMAIN_CLIENT 1
#ifdef CONFIG_CPU_USE_DOMAINS
#define DOMAIN_MANAGER 3
#else
#define DOMAIN_MANAGER 1
#endif
......
#define domain_val(dom,type) ((type) << (2*(dom)))
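Because CPU_USE_DOMAINS is not configured, the Manager permission is not used. After macro expansion, the value of r5, and therefore of the c3 register, is 0x00000015. The correspondence is:
D0 : DOMAIN_KERNEL / DOMAIN_TABLE : DOMAIN_CLIENT
D1 : DOMAIN_USER : DOMAIN_CLIENT
D2 : DOMAIN_IO : DOMAIN_CLIENT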
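The value 0x00000015 can be checked by expanding the macros by hand or with a few lines of C. The sketch below copies the definitions from domain.h for the case where CONFIG_CPU_USE_DOMAINS is not set; boot_dacr and dacr_field are hypothetical helper names introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DOMAIN_KERNEL   0
#define DOMAIN_TABLE    0
#define DOMAIN_USER     1
#define DOMAIN_IO       2

#define DOMAIN_NOACCESS 0
#define DOMAIN_CLIENT   1
#define DOMAIN_MANAGER  1  /* value used when CONFIG_CPU_USE_DOMAINS is not set */

#define domain_val(dom, type) ((uint32_t)(type) << (2 * (dom)))

/* The value that __enable_mmu loads into r5 and then into CP15 c3. */
uint32_t boot_dacr(void)
{
    return domain_val(DOMAIN_USER,   DOMAIN_MANAGER) |
           domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) |
           domain_val(DOMAIN_TABLE,  DOMAIN_MANAGER) |
           domain_val(DOMAIN_IO,     DOMAIN_CLIENT);
}

/* Extract the two-bit access field Dx for domain number dom (0..15). */
unsigned dacr_field(uint32_t dacr, unsigned dom)
{
    assert(dom < 16);
    return (dacr >> (2 * dom)) & 3u;
}
```

Domains 0, 1 and 2 each contribute the two-bit value 01, giving binary 01 01 01 = 0x15; every other domain is left at 00 (no access).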
mcr p15, 0, r4, c2, c0, 0 @ load page table pointer
#endif
b __turn_mmu_on
ENDPROC(__enable_mmu)
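Next, the physical address of the page table held in r4 is written to the translation table base register (TTB) in CP15 c2. This step was in fact already performed once in __v6_setup and is repeated here. Note that the low 5 bits of the page table physical address in r4 have already been processed (the page table address must be 32-byte aligned); they are used to set fields such as PD0 and PD1 in the c2 register.
Once the TTB is set, execution jumps to __turn_mmu_on to enable the MMU.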
ENTRY(__turn_mmu_on)
mov r0, r0
instr_sync
mcr p15, 0, r0, c1, c0, 0 @ write control reg
mrc p15, 0, r3, c0, c0, 0 @ read id reg
instr_sync
mov r3, r3
mov r3, r13
mov pc, r3
__turn_mmu_on_end:
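Here the r0 value configured earlier is first written to the CP15 c1 control register, which turns the MMU on. The link address (a virtual address) of __mmap_switched, saved earlier in r13, is then copied into r3, and finally execution jumps to __mmap_switched.
Note: once line 461 (mcr p15, 0, r0, c1, c0, 0) has executed, the MMU is on and every instruction fetch uses a virtual address that must be translated through the page table. The PC, however, simply continues in sequence from its previous value (in my environment the PC is 0x004333a0+8 when line 461 executes). Therefore, unless __turn_mmu_on is covered by a linear 1:1 (identity) mapping, addresses of the form 0x00XXXXXX cannot be translated and the program cannot continue.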
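The need for the identity mapping can be illustrated with a toy model of a first-level table made of 1 MB sections. This is a simplified sketch, not the kernel's page table format: section_base, table_init, map_section, translate and UNMAPPED are all hypothetical names, and the kernel mapping from 0xC0000000 to physical 0x00000000 assumes that RAM starts at physical address 0.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_SHIFT 20                       /* 1 MB sections */
#define SECTION_MASK  ((1u << SECTION_SHIFT) - 1u)
#define NUM_SECTIONS  4096                     /* 4 GB / 1 MB */
#define UNMAPPED      0xFFFFFFFFu              /* marker: no translation exists */

/* Toy first-level table: index = virtual section number, value = physical base. */
static uint32_t section_base[NUM_SECTIONS];

void table_init(void)
{
    for (unsigned i = 0; i < NUM_SECTIONS; i++)
        section_base[i] = UNMAPPED;
}

/* Map the 1 MB section containing va onto the section starting at pa. */
void map_section(uint32_t va, uint32_t pa)
{
    assert((pa & SECTION_MASK) == 0);          /* physical base must be 1 MB aligned */
    section_base[va >> SECTION_SHIFT] = pa;
}

/* What the MMU does on every fetch once it is on; UNMAPPED models a fault. */
uint32_t translate(uint32_t va)
{
    uint32_t base = section_base[va >> SECTION_SHIFT];
    if (base == UNMAPPED)
        return UNMAPPED;
    return base | (va & SECTION_MASK);
}
```

With only the kernel mapping installed, translate(0xC0008000) succeeds but translate(0x004333a8) does not, which is the situation the PC would be in right after the mcr. Adding map_section(0x00400000, 0x00400000) makes the second lookup return the same address it was given, so the instructions following the mcr can still be fetched.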
/*
* The following fragment of code is executed with the MMU on in MMU mode,
* and uses absolute addresses; this is not position independent.
*
* r0 = cp#15 control register
* r1 = machine ID
* r2 = atags/dtb pointer
* r9 = processor ID
*/
__INIT
__mmap_switched:
adr r3, __mmap_switched_data
ldmia r3!, {r4, r5, r6, r7}
cmp r4, r5 @ Copy data segment if needed
1: cmpne r5, r6
ldrne fp, [r4], #4
strne fp, [r5], #4
bne 1b
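After the jump to __mmap_switched, the page tables are established and the MMU is active. Code now runs at absolute addresses and is no longer position independent, so from this point on there is no need to distinguish between link addresses and the physical addresses at which the code actually runs.
First, the address of __mmap_switched_data is placed in r3; then the addresses of __data_loc, _sdata, __bss_start and _end are loaded into r4, r5, r6 and r7 respectively. Note the exclamation mark after r3: it requests write-back, so r3 is advanced past the four words just loaded.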
.align 2
.type __mmap_switched_data, %object
__mmap_switched_data:
.long __data_loc @ r4
.long _sdata @ r5
.long __bss_start @ r6
.long _end @ r7
.long processor_id @ r4
.long __machine_arch_type @ r5
.long __atags_pointer @ r6
#ifdef CONFIG_CPU_CP15
.long cr_alignment @ r7
#else
.long 0 @ r7
#endif
.long init_thread_union + THREAD_START_SP @ sp
.size __mmap_switched_data, . - __mmap_switched_data
In my environment their values are as follows:
c05ab2ac <__mmap_switched_data>:
c05ab2ac: c1080000 mrsgt r0, (UNDEF: 8)
c05ab2b0: c1080000 mrsgt r0, (UNDEF: 8)
c05ab2b4: c10bccec smlattgt fp, ip, ip, ip
c05ab2b8: c116c518 tstgt r6, r8, lsl r5
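Next, the values of __data_loc and _sdata are compared; if they differ, the data segment must be copied. __data_loc is the location where the data segment is stored in the kernel image; with CONFIG_XIP_KERNEL enabled it differs from _sdata. It is defined in vmlinux.lds.S as follows: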
#ifdef CONFIG_XIP_KERNEL
__data_loc = ALIGN(4); /* location in binary */
. = PAGE_OFFSET + TEXT_OFFSET;
#else
__init_end = .;
. = ALIGN(THREAD_SIZE);
__data_loc = .;
#endif
.data : AT(__data_loc) {
_data = .; /* address in memory */
_sdata = .;
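_sdata, by contrast, is the link address of the data segment. With CONFIG_XIP_KERNEL enabled the two values differ, so the data segment must be copied to its link address (in RAM). In my environment the two values are equal and no copy is needed.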
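The conditional copy loop at the start of __mmap_switched is compact because every instruction after the first cmp is conditional. A rough C equivalent is sketched below; copy_data_if_needed is a hypothetical name, and the three parameters stand in for r4, r5 and r6.

```c
#include <assert.h>
#include <stdint.h>

/*
 * data_loc  ~ r4: where the data segment is stored in the image
 * sdata     ~ r5: where the data segment is linked to run (in RAM)
 * bss_start ~ r6: end of the data segment, start of BSS
 */
void copy_data_if_needed(const uint32_t *data_loc, uint32_t *sdata,
                         const uint32_t *bss_start)
{
    assert(sdata <= bss_start);      /* otherwise the loop would never end */
    if (data_loc == sdata)           /* cmp r4, r5: same place, nothing to do */
        return;
    while (sdata != bss_start)       /* cmpne r5, r6 */
        *sdata++ = *data_loc++;      /* ldrne fp, [r4], #4 / strne fp, [r5], #4 */
}
```

The copy proceeds one 32-bit word at a time and stops when the destination pointer reaches __bss_start, so exactly the data segment is copied.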
mov fp, #0 @ Clear BSS (and zero fp)
1: cmp r6, r7
strcc fp, [r6],#4
bcc 1b
ARM( ldmia r3, {r4, r5, r6, r7, sp})
THUMB( ldmia r3, {r4, r5, r6, r7} )
THUMB( ldr sp, [r3, #16] )
str r9, [r4] @ Save processor ID
str r1, [r5] @ Save machine type
str r2, [r6] @ Save atags pointer
cmp r7, #0
bicne r4, r0, #CR_A @ Clear 'A' bit
stmneia r7, {r0, r4} @ Save control register values
b start_kernel
ENDPROC(__mmap_switched)
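Next, the BSS segment is cleared, and then the addresses of processor_id, __machine_arch_type, __atags_pointer and cr_alignment, together with the value init_thread_union + THREAD_START_SP, are loaded into r4, r5, r6, r7 and sp in turn. processor_id, __machine_arch_type and __atags_pointer are global variables defined in arch/arm/kernel/setup.c; they hold the processor ID, the machine ID passed in by the bootloader, and the address of the boot parameters, respectively.
The corresponding values in r9, r1 and r2 are then stored into these global variables so that code running after start_kernel can use them.
Next, r7 is tested against 0. Because CONFIG_CPU_CP15 is configured in my environment, the value is nonzero: it is the address of the global variable cr_alignment. cr_alignment is defined in arch/arm/kernel/entry-armv.S together with cr_no_alignment. They hold the CP15 c1 register values with the "A bit" enabled and disabled, respectively.
The A bit controls alignment checking of memory accesses: when it is set, an unaligned memory access raises a data abort. The stmneia instruction stores r0, the control register value as configured, to cr_alignment, and r4, a copy of r0 with the A bit cleared, to cr_no_alignment, for later use.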
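The BSS clearing loop and the final store of the two control register values can be sketched in C as follows. The names clear_bss, save_control and struct cr_pair are hypothetical; the struct only models the assumption that cr_no_alignment sits in the word immediately after cr_alignment, which is what the two-register stm relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CR_A (1u << 1)   /* Alignment abort enable */

/* mov fp, #0 ; 1: cmp r6, r7 ; strcc fp, [r6], #4 ; bcc 1b */
void clear_bss(uint32_t *bss_start, const uint32_t *end)
{
    while (bss_start < end)          /* "cc" = unsigned lower than */
        *bss_start++ = 0;
}

/* Two adjacent words, modelling cr_alignment followed by cr_no_alignment. */
struct cr_pair {
    uint32_t cr_alignment;
    uint32_t cr_no_alignment;
};

/* cmp r7, #0 ; bicne r4, r0, #CR_A ; stmneia r7, {r0, r4} */
void save_control(struct cr_pair *r7, uint32_t r0)
{
    if (r7 != NULL) {
        r7->cr_alignment = r0;
        r7->cr_no_alignment = r0 & ~CR_A;
    }
}
```

When CONFIG_CPU_CP15 is not set, the table entry for r7 is 0, which corresponds to passing NULL here: nothing is stored.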
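Finally, execution jumps to start_kernel for further initialization, which ends the assembly part of kernel startup. To summarize, this stage mainly performs the following initialization:
(1) validating the processor ID, the kernel boot parameter address, and so on;
(2) creating the initial page tables, which map three regions: the kernel code, the boot parameters, and the code that turns on the MMU;
(3) enabling the MMU and saving the parameters.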
References:
1. 《ARM Linux內核源碼剖析》 (ARM Linux Kernel Source Code Analysis)
2. 《ARM11 數據手冊》 (ARM11 data sheet)