Study Notes on the Windows PE Authoritative Guide (《WINDOWSPE權威指南》), Part 2: PE File Structure and Field Descriptions

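After finishing Win32asm programming, I noticed that the interface of the small tool given in chapter 2 of the PE book could be made richer without much difficulty, and I plan to rework it later. Today I read most of chapter 3 and stopped just before the PE programming part. My notes are posted below, together with some annotations from MSDN, which should make things easier to understand.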

[Figure: diagram of the PE file structure, from MSDN]
1
	1.5
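		First look at PE files
			Purpose: let EXE files run under different CPU instruction sets (this depends on the build environment); this is probably not about portability
			In other words, compiled files share one uniform format
		Most exe and dll files are PE files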
		
		Static program (layout on disk)
			Offset	Content
			0x0000	PE header
			0x0400	Code section
			0x0600	Import section
			0x0800	Data section
3
	3.1
		How PE organizes its data
			Header + body
			Structures + raw byte data
	3.2
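		Addresses
			VA: virtual address
				Base address of the process image + RVA
			RVA: relative virtual address
				Offset from the base address at which the image is loaded
			FOA: file offset address
				Unrelated to memory; the static offset within the file on disk
			Special addresses
				Counted from some specific position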
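
The relation between VA and RVA noted above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for these notes, and the base address 0x400000 is just the usual default for executables.

```python
def rva_to_va(image_base: int, rva: int) -> int:
    """VA = base address at which the image is loaded + RVA."""
    return image_base + rva


def va_to_rva(image_base: int, va: int) -> int:
    """Inverse conversion; the VA must lie at or above the image base."""
    if va < image_base:
        raise ValueError("VA is below the image base")
    return va - image_base


# Example: an entry point with RVA 0x1000 in an image loaded at 0x400000.
print(hex(rva_to_va(0x400000, 0x1000)))  # 0x401000
```

A file offset (FOA) cannot be derived this way, because it also depends on where each section is stored in the file.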
		Pointers
			A field that stores an address is a pointer
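		Alignment
			Memory alignment
				In memory, sections are aligned to at least one page: 4 KB (1000h) on x86 and x64, 8 KB (2000h) on IA-64 (Itanium)
			File alignment
				On disk, sections are aligned to one sector, 512 B (200h)
			Resource data alignment
				4 bytes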
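
Rounding a size up to an alignment boundary is the arithmetic behind all three alignment rules above. A minimal sketch, assuming the alignment is a power of two (true for the values 4, 200h, and 1000h mentioned here); `align_up` is a name chosen for these notes:

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (value + alignment - 1) & ~(alignment - 1)


# 0x35A bytes of section data occupy 0x400 bytes in the file (200h alignment)
# and one full page, 0x1000 bytes, in memory (1000h alignment).
print(hex(align_up(0x35A, 0x200)), hex(align_up(0x35A, 0x1000)))
```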
		Unicode strings
			Capacity
			Where used: resource files
			Termination: not necessarily terminated with \0
	3.3-3.4 
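		DOS MZ header
			Defined by IMAGE_DOS_HEADER, generated automatically by the compiler/linker (64 B); parameters used under DOS
		DOS stub
			Code executable under DOS, generated automatically by the compiler/linker; you can modify it yourself (try it in a virtual machine)
		PE header
			The e_lfanew field of IMAGE_DOS_HEADER (the last field of the DOS MZ header, a DWORD) gives the position of the PE header
			
			PE signature (4 B): PE\0\0
			Standard PE header (20 B)
				Defined by IMAGE_FILE_HEADER; holds global information
			Extended (optional) PE header
				Defined by IMAGE_OPTIONAL_HEADER32; holds more detailed information
				Its last field, an array of IMAGE_DATA_DIRECTORY, gives the RVA and size of each kind of data
		Section table (40 B × n)
			Defined by IMAGE_SECTION_HEADER; describes each section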
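
The chain described above (MZ header, e_lfanew, PE signature, standard header, optional header, section table) can be walked with Python's `struct` module. This is a rough sketch for a raw file image held in memory, not a full parser; `locate_pe_headers` and the keys of the returned dictionary are names invented for these notes.

```python
import struct


def locate_pe_headers(data: bytes) -> dict:
    """Return the file offsets of the main PE headers in a raw file image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew is the last DWORD of the 64-byte IMAGE_DOS_HEADER (offset 0x3C).
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    file_header = e_lfanew + 4                      # IMAGE_FILE_HEADER, 20 bytes
    (number_of_sections,) = struct.unpack_from("<H", data, file_header + 2)
    (size_of_optional_header,) = struct.unpack_from("<H", data, file_header + 16)
    optional_header = file_header + 20
    section_table = optional_header + size_of_optional_header
    return {
        "file_header": file_header,
        "optional_header": optional_header,
        "section_table": section_table,
        "number_of_sections": number_of_sections,
        "end_of_section_table": section_table + 40 * number_of_sections,
    }
```

To try it on a real file, read the file with `open(path, "rb").read()` and pass the bytes in; the section table offset comes from SizeOfOptionalHeader rather than from a fixed structure size, so the same code works for PE32 and PE32+.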
	3.5
		Signature: PE\0\0 // if it is overwritten the system cannot load the file; this can be used to stop a virus from auto-starting
		
		IMAGE_FILE_HEADER
			WORD Machine // on a machine-type mismatch, Windows reports that the file is not a valid Win32 application
				0x14d Intel i860 
				0x14c Intel I386 (same ID used for 486 and 586) 
				0x162 MIPS R3000 
				0x166 MIPS R4000 
				0x183 DEC Alpha AXP 
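			WORD NumberOfSections // must not be less than 1 or greater than 96; unless the file has no sections it must match the actual count, otherwise Windows reports that the file is not a valid Win32 application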
				The number of sections in the file. 
			DWORD TimeDateStamp // of little practical use; unrelated to the file's created, modified, and accessed times
				The time that the linker (or compiler for an OBJ file) produced this file. 
				This field holds the number of seconds since December 31st, 1969, at 4:00 P.M. (Pacific time, that is, January 1, 1970, 00:00 UTC). 
			DWORD PointerToSymbolTable // this value is 0
				The file offset of the COFF symbol table. This field is only used in OBJ files and PE files with COFF debug information.
				PE files support multiple debug formats, so debuggers should refer to the IMAGE_DIRECTORY_ENTRY_DEBUG entry in the data directory (defined later). 
			DWORD NumberOfSymbols // this value is 0
				The number of symbols in the COFF symbol table. See above. 
			WORD SizeOfOptionalHeader // 00E0h for 32-bit, 00F0h for 64-bit; independent of the CPU, determined by how the program is built
				The size of IMAGE_OPTIONAL_HEADER32. 
				In OBJs, the field is 0. 
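			WORD Characteristics // file attributes; each bit carries one piece of information
				Executable file: 010Fh // executable image, relocation info stripped, symbol and line-number info stripped, 32-bit machine
				DLL file: 210Eh // DLL, executable image, relocation info kept (bit 0001h is clear), symbol and line-number info stripped, 32-bit machine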
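
To make the 20-byte IMAGE_FILE_HEADER above concrete, here is a small sketch that unpacks it and decodes the two typical Characteristics values. The flag values are the ones defined in WINNT.H, limited to the bits these notes mention; the function names and the extra dictionary keys `TimeDateStampUtc` and `Flags` are made up for illustration.

```python
import struct
from datetime import datetime, timezone

# Characteristics bits from WINNT.H (only those discussed in these notes).
FILE_FLAGS = {
    0x0001: "IMAGE_FILE_RELOCS_STRIPPED",
    0x0002: "IMAGE_FILE_EXECUTABLE_IMAGE",
    0x0004: "IMAGE_FILE_LINE_NUMS_STRIPPED",
    0x0008: "IMAGE_FILE_LOCAL_SYMS_STRIPPED",
    0x0100: "IMAGE_FILE_32BIT_MACHINE",
    0x2000: "IMAGE_FILE_DLL",
}

FIELD_NAMES = ("Machine", "NumberOfSections", "TimeDateStamp",
               "PointerToSymbolTable", "NumberOfSymbols",
               "SizeOfOptionalHeader", "Characteristics")


def decode_file_flags(characteristics: int) -> list:
    """List the known flag names set in a Characteristics value."""
    return [name for bit, name in sorted(FILE_FLAGS.items())
            if characteristics & bit]


def parse_file_header(raw: bytes) -> dict:
    """Unpack the 20 bytes that follow the PE signature."""
    header = dict(zip(FIELD_NAMES, struct.unpack_from("<HHIIIHH", raw, 0)))
    # TimeDateStamp counts seconds since 1970-01-01 00:00 UTC.
    header["TimeDateStampUtc"] = datetime.fromtimestamp(
        header["TimeDateStamp"], tz=timezone.utc)
    header["Flags"] = decode_file_flags(header["Characteristics"])
    return header
```

Decoding 010Fh yields the relocs-stripped bit while 210Eh does not, which is the difference between the two typical values listed above.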
				
		IMAGE_OPTIONAL_HEADER32
			WORD Magic // image type
				PE32: 0x010B
				ROM image: 0x0107
				PE32+: 0x020B // 64-bit
			BYTE MajorLinkerVersion 
			BYTE MinorLinkerVersion 
				The version of the linker that produced this file. 
				The numbers should be displayed as decimal values, rather than as hex. 
				A typical linker version is 2.23. 
			DWORD SizeOfCode // total size of the code sections after file alignment, a multiple of 512 B
				The combined and rounded-up size of all the code sections. 
				Usually, most files only have one code section, so this field matches the size of the .text section. 
			DWORD SizeOfInitializedData 
				This is supposedly the total size of all the sections that are composed of initialized data (not including code segments.)
				However, it doesn't seem to be consistent with what appears in the file. 
			DWORD SizeOfUninitializedData 
				The size of the sections that the loader commits space for in the virtual address space, 
				but that don't take up any space in the disk file. 
				These sections don't need to have specific values at program startup, hence the term uninitialized data. 
				Uninitialized data usually goes into a section called .bss. 
			DWORD AddressOfEntryPoint // entry point; an RVA relative to the image base; viruses, encryptors, and patchers hijack this value
				The address where the loader will begin execution.
				This is an RVA, and can usually be found in the .text section. 
			DWORD BaseOfCode // start address (RVA) of the .text code section
				The RVA where the file's code sections begin.
				The code sections typically come before the data sections and after the PE header in memory.
				This RVA is usually 0x1000 in Microsoft Linker-produced EXEs. 
				Borland's TLINK32 looks like it adds the image base to the RVA of the first code section and stores the result in this field. 
			DWORD BaseOfData // start address (RVA) of the .data data section
				The RVA where the file's data sections begin.
				The data sections typically come last in memory, after the PE header and the code sections. 
			DWORD ImageBase // preferred load address; no relocation is needed if the image loads there; 0x00400000 for executables, 0x10000000 for DLLs
				When the linker creates an executable, it assumes that the file will be memory-mapped to a specific location in memory. 
				That address is stored in this field, assuming a load address allows linker optimizations to take place. 
				If the file really is memory-mapped to that address by the loader, the code doesn't need any patching before it can be run.
				In executables produced for Windows NT, the default image base is 0x10000. For DLLs, the default is 0x400000. 
				In Windows 95, the address 0x10000 can't be used to load 32-bit EXEs because it lies within a linear address region shared by all processes. 
				Because of this, Microsoft has changed the default base address for Win32 executables to 0x400000. 
				Older programs that were linked assuming a base address of 0x10000 will take longer to load under Windows 95 
				because the loader needs to apply the base relocations. 
			DWORD SectionAlignment // alignment of sections in memory; 0x1000 on x86 and x64, 0x2000 on IA-64
				When mapped into memory, each section is guaranteed to start at a virtual address that's a multiple of this value. 
				For paging purposes, the default section alignment is 0x1000. 
			DWORD FileAlignment // alignment of section data in the file; 0x0200 = 512 B, the size of a sector
				In the PE file, the raw data that comprises each section is guaranteed to start at a multiple of this value. 
				The default value is 0x200 bytes, 
				probably to ensure that sections always start at the beginning of a disk sector (which are also 0x200 bytes in length).
				This field is equivalent to the segment/resource alignment size in NE files. 
				Unlike NE files, PE files typically don't have hundreds of sections, 
				so the space wasted by aligning the file sections is almost always very small. 
			WORD MajorOperatingSystemVersion // skipped
			WORD MinorOperatingSystemVersion // skipped
				The minimum version of the operating system required to use this executable.
				This field is somewhat ambiguous since the subsystem fields (a few fields later) appear to serve a similar purpose. 
				This field defaults to 1.0 in all Win32 EXEs to date. 
			WORD MajorImageVersion // skipped
			WORD MinorImageVersion // skipped
				A user-definable field. 
				This allows you to have different versions of an EXE or DLL. You set these fields via the linker /VERSION switch. 
				For example, "LINK /VERSION:2.0 myobj.obj". 
			WORD MajorSubsystemVersion // skipped
			WORD MinorSubsystemVersion // skipped
				Contains the minimum subsystem version required to run the executable.
				A typical value for this field is 3.10 (meaning Windows NT 3.1). 
			DWORD Reserved1 // skipped
				Seems to always be 0. 
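			DWORD SizeOfImage // size of the image once mapped into memory; for example 1000h for the headers + 1000h × number of sections when every section fits in one page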
				This appears to be the total size of the portions of the image that the loader has to worry about. 
				It is the size of the region starting at the image base up to the end of the last section. 
				The end of the last section is rounded up to the nearest multiple of the section alignment. 
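
The rule just quoted (the end of the last section, rounded up to the section alignment) can be checked with a short calculation. This is a sketch under the assumption that the section list is non-empty and that each section is given as a (VirtualAddress, VirtualSize) pair; the function names and the sample RVAs and sizes are invented for illustration.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of a power-of-two alignment."""
    return (value + alignment - 1) & ~(alignment - 1)


def size_of_image(sections, section_alignment: int = 0x1000) -> int:
    """Estimate SizeOfImage from (virtual_address, virtual_size) pairs."""
    end_of_last_section = max(va + size for va, size in sections)
    return align_up(end_of_last_section, section_alignment)


# Three small sections at RVAs 0x1000, 0x2000, 0x3000: the result is
# 1000h for the headers plus 1000h per section.
print(hex(size_of_image([(0x1000, 0x26), (0x2000, 0x92), (0x3000, 0x10)])))  # 0x4000
```

A section larger than one page takes several pages, so the "1000h per section" shortcut only holds for small programs.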
			DWORD SizeOfHeaders // size of all headers plus the section table after file alignment, a multiple of 200h
				The size of the PE header and the section (object) table. 
				The raw data for the sections starts immediately after all the header components. 
			DWORD CheckSum // checksum; usually 0 for ordinary PE files, nonzero for kernel drivers and system DLLs
				Supposedly a CRC checksum of the file. As in other Microsoft executable formats, this field is ignored and set to 0. 
				The one exception to this rule is for trusted services and these EXEs must have a valid checksum. 
			WORD Subsystem // user-interface subsystem
				The type of subsystem that this executable uses for its user interface.
				WINNT.H defines the following values: 
				NATIVE  1 Doesn't require a subsystem (such as a device driver) 
				WINDOWS_GUI  2 Runs in the Windows GUI subsystem 
				WINDOWS_CUI  3 Runs in the Windows character subsystem (a console app) 
				OS2_CUI  5 Runs in the OS/2 character subsystem (OS/2 1.x apps only) 
				POSIX_CUI  7 Runs in the Posix character subsystem 
			WORD DllCharacteristics // load-time attributes of the file
				A set of flags indicating under which circumstances a DLL's initialization function (such as DllMain) will be called. 
				This value appears to always be set to 0, yet the operating system still calls the DLL initialization function for all four events. 
					The following values are defined:
					1  Call when DLL is first loaded into a process's address space 
					2  Call when a thread terminates 
					4  Call when a thread starts up 
					8  Call when DLL exits 
			DWORD SizeOfStackReserve // size reserved when the stack is initialized, 1 MB
				The amount of virtual memory to reserve for the initial thread's stack.
				Not all of this memory is committed, however (see the next field). 
				This field defaults to 0x100000 (1MB). 
				If you specify 0 as the stack size to CreateThread, the resulting thread will also have a stack of this same size. 
			DWORD SizeOfStackCommit // size committed when the stack is initialized, 4 KB
				The amount of memory initially committed for the initial thread's stack.
				This field defaults to 0x1000 bytes (1 page) for the Microsoft Linker while TLINK32 makes it two pages. 
			DWORD SizeOfHeapReserve // size reserved when the heap is initialized
				The amount of virtual memory to reserve for the initial process heap.
				This heap's handle can be obtained by calling GetProcessHeap. Not all of this memory is committed (see the next field). 
			DWORD SizeOfHeapCommit // size committed when the heap is initialized
				The amount of memory initially committed in the process heap. The default is one page. 
			DWORD LoaderFlags // debugging support, usually 0
				From WINNT.H, these appear to be fields related to debugging support.
				I've never seen an executable with either of these bits enabled, nor is it clear how to get the linker to set them.
				The following values are defined: 
					1. Invoke a breakpoint instruction before starting the process 
					2. Invoke a debugger on the process after it's been loaded 
			DWORD NumberOfRvaAndSizes // number of entries in the data directory, usually 0010h
				The number of entries in the DataDirectory array (below). This value is always set to 16 by the current tools. 
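			IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES] // array of data directory entries
				0 Export table, in .edata; exported functions and data
				1 Import table, in .idata; imported symbols
				2 Resource table, in .rsrc; addresses of the various resources, organized as a multi-level sorted tree
				3 Exception table, in .pdata; array of exception-handler function table entries
				4 Attribute certificate table; attribute certificate entries
				5 Base relocation table, in .reloc; relocation information
				6 Debug data, in .debug; array of IMAGE_DEBUG_DIRECTORY structures
				7 Architecture; reserved, must be 0
				8 GlobalPtr; value of the global pointer register
				9 TLS table (thread-local storage)
				10 Load configuration table (used for SEH)
				11 Bound import table
				12 Import address table (IAT)
				13 Delay-load import table
				14 CLR data table, in .cormeta; used by the .NET Framework
				15 Reserved by the system, undefined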
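
Each data directory entry is just two DWORDs, an RVA and a size, so the whole array can be read in a loop. A minimal sketch: the short names in `DIRECTORY_NAMES` follow the IMAGE_DIRECTORY_ENTRY_xxx constants in WINNT.H with the prefix dropped (the name for index 15 is made up, since that entry is reserved), and `parse_data_directories` is a hypothetical helper that expects the bytes starting at the DataDirectory field.

```python
import struct

# Index order of the IMAGE_DIRECTORY_ENTRY_xxx constants in WINNT.H.
DIRECTORY_NAMES = [
    "EXPORT", "IMPORT", "RESOURCE", "EXCEPTION",
    "SECURITY", "BASERELOC", "DEBUG", "ARCHITECTURE",
    "GLOBALPTR", "TLS", "LOAD_CONFIG", "BOUND_IMPORT",
    "IAT", "DELAY_IMPORT", "COM_DESCRIPTOR", "RESERVED",
]


def parse_data_directories(raw: bytes, count: int = 16) -> list:
    """Return (name, rva, size) for each IMAGE_DATA_DIRECTORY entry.

    count should be the NumberOfRvaAndSizes value from the optional header.
    """
    entries = []
    for index in range(min(count, len(DIRECTORY_NAMES))):
        rva, size = struct.unpack_from("<II", raw, index * 8)
        entries.append((DIRECTORY_NAMES[index], rva, size))
    return entries
```

An entry whose RVA and size are both 0 means the file does not contain that kind of data.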
				
		IMAGE_SECTION_HEADER
			BYTE Name[IMAGE_SIZEOF_SHORT_NAME] // section name
				This is an 8-byte ANSI name (not UNICODE) that names the section. 
				Most section names start with a . (such as ".text"), but this is not a requirement, as some PE documentation would have you believe.
				You can name your own sections with either the segment directive in assembly language, 
				or with "#pragma data_seg" and "#pragma code_seg" in the Microsoft C/C++ compiler. 
				It's important to note that if the section name takes up the full 8 bytes, there's no NULL terminator byte. 
				If you're a printf devotee, you can use %.8s to avoid copying the name string to another buffer where you can NULL-terminate it. 
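
The same caveat applies outside C: the 8-byte name has no terminator when all 8 bytes are used. A small Python sketch of the equivalent of `%.8s` (the function name is made up, and bytes outside ASCII are replaced rather than rejected, which is an assumption of this example):

```python
def section_name(raw8: bytes) -> str:
    """Decode the 8-byte Name field of IMAGE_SECTION_HEADER."""
    if len(raw8) != 8:
        raise ValueError("the Name field is exactly 8 bytes long")
    # Cut at the first NUL if there is one; otherwise keep all 8 bytes.
    return raw8.split(b"\0", 1)[0].decode("ascii", errors="replace")


print(section_name(b".text\0\0\0"))  # .text
print(section_name(b".textbss"))     # .textbss
```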
			union { DWORD PhysicalAddress; DWORD VirtualSize; } Misc; // actual size of the section before alignment
				This field has different meanings, in EXEs or OBJs.
				In an EXE, it holds the actual size of the code or data. 
				This is the size before rounding up to the nearest file alignment multiple. 
				The SizeOfRawData field (seems a bit of a misnomer) later on in the structure holds the rounded up value. 
				The Borland linker reverses the meaning of these two fields and appears to be correct. 
				For OBJ files, this field indicates the physical address of the section. 
				The first section starts at address 0. 
				To find the physical address in an OBJ file of the next section, add the SizeOfRawData value to the physical address of the current section. 
			DWORD VirtualAddress // RVA of the section
				In EXEs, this field holds the RVA to where the loader should map the section.
				To calculate the real starting address of a given section in memory,
				add the base address of the image to the section's VirtualAddress stored in this field. 
				With Microsoft tools, the first section defaults to an RVA of 0x1000. In OBJs, this field is meaningless and is set to 0. 
			DWORD SizeOfRawData // size of the section after file alignment
				In EXEs, this field contains the size of the section after it's been rounded up to the file alignment size. 
				For example, assume a file alignment size of 0x200. 
				If the VirtualSize field from above says that the section is 0x35A bytes in length, 
				this field will say that the section is 0x400 bytes long. 
				In OBJs, this field contains the exact size of the section emitted by the compiler or assembler. 
				In other words, for OBJs, it's equivalent to the VirtualSize field in EXEs. 
			DWORD PointerToRawData // file offset of the section's file-aligned data
				This is the file-based offset of where the raw data emitted by the compiler or assembler can be found. 
				If your program memory maps a PE or COFF file itself (rather than letting the operating system load it), 
				this field is more important than the VirtualAddress field. 
				You'll have a completely linear file mapping in this situation, so you'll find the data for the sections at this offset, 
				rather than at the RVA specified in the VirtualAddress field. 
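
When working on a linear mapping of the file, an RVA therefore has to be translated through the section table: find the section whose range contains the RVA, then add the distance from VirtualAddress to PointerToRawData. A hedged sketch follows; the `Section` tuple and `rva_to_file_offset` are names invented here, and the sample values reuse the file offsets 0x400/0x600/0x800 from the static layout in chapter 1 with assumed RVAs and sizes.

```python
from typing import NamedTuple, Optional, Sequence


class Section(NamedTuple):
    name: str
    virtual_address: int      # RVA of the section
    virtual_size: int         # size before alignment
    pointer_to_raw_data: int  # file offset of the section data
    size_of_raw_data: int     # size after file alignment


def rva_to_file_offset(rva: int, sections: Sequence[Section],
                       size_of_headers: int = 0x400) -> Optional[int]:
    """Translate an RVA to a file offset, or None if no file data backs it."""
    if rva < size_of_headers:
        return rva  # the headers sit at the same offset in file and image
    for s in sections:
        # Only the raw data actually stored in the file can be addressed.
        if s.virtual_address <= rva < s.virtual_address + s.size_of_raw_data:
            return s.pointer_to_raw_data + (rva - s.virtual_address)
    return None


sections = [
    Section(".text", 0x1000, 0x26, 0x400, 0x200),
    Section(".rdata", 0x2000, 0x92, 0x600, 0x200),
    Section(".data", 0x3000, 0x10, 0x800, 0x200),
]
print(hex(rva_to_file_offset(0x2010, sections)))  # 0x610
```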
			DWORD PointerToRelocations // pointer to the relocation table; 0 in executables
				In OBJs, this is the file-based offset to the relocation information for this section. 
				The relocation information for each OBJ section immediately follows the raw data for that section. 
				In EXEs, this field (and the subsequent field) are meaningless, and set to 0. 
				When the linker creates the EXE, it resolves most of the fixups, 
				leaving only base address relocations and imported functions to be resolved at load time. 
				The information about base relocations and imported functions is kept in their own sections, 
				so there's no need for an EXE to have per-section relocation data following the raw section data. 
			DWORD PointerToLinenumbers // pointer to the line-number table, used for debugging
				This is the file-based offset of the line number table. 
				A line number table correlates source file line numbers to the addresses of the code generated for a given line.
				In modern debug formats like the CodeView format, line number information is stored as part of the debug information. 
				In the COFF debug format, however, the line number information is stored separately from the symbolic name/type information. 
				Usually, only code sections (such as .text) have line numbers. 
				In EXE files, the line numbers are collected towards the end of the file, after the raw data for the sections. 
				In OBJ files, the line number table for a section comes after the raw section data and the relocation table for that section. 
			WORD NumberOfRelocations // number of entries in the relocation table
				The number of relocations in the relocation table for this section (the PointerToRelocations field from above). 
				This field seems relevant only for OBJ files. 
			WORD NumberOfLinenumbers // number of line numbers
				The number of line numbers in the line number table for this section (the PointerToLinenumbers field from above). 
			DWORD Characteristics // section attributes
				What most programmers call flags, the COFF/PE format calls characteristics. 
				This field is a set of flags that indicate the section's attributes (such as code/data, readable, or writeable).
				For a complete list of all possible section attributes, see the IMAGE_SCN_XXX_XXX #defines in WINNT.H. 
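				Code sections are usually 0x60000020: executable, readable, contains code
				Data sections are usually 0xC0000040: readable, writable, contains initialized data
				Constant (read-only data) sections are usually 0x40000040: readable, contains initialized data
				Resource sections use the same value as constant sections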
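
The typical values above are combinations of a few IMAGE_SCN_xxx bits from WINNT.H. A small sketch that splits a value back into flag names; only the six bits relevant to these notes are listed, and `decode_section_flags` is a name chosen for illustration.

```python
# A subset of the IMAGE_SCN_xxx flags from WINNT.H.
SECTION_FLAGS = {
    0x00000020: "IMAGE_SCN_CNT_CODE",
    0x00000040: "IMAGE_SCN_CNT_INITIALIZED_DATA",
    0x00000080: "IMAGE_SCN_CNT_UNINITIALIZED_DATA",
    0x20000000: "IMAGE_SCN_MEM_EXECUTE",
    0x40000000: "IMAGE_SCN_MEM_READ",
    0x80000000: "IMAGE_SCN_MEM_WRITE",
}


def decode_section_flags(characteristics: int) -> list:
    """List the known flag names set in a section's Characteristics."""
    return [name for bit, name in sorted(SECTION_FLAGS.items())
            if characteristics & bit]


for value in (0x60000020, 0xC0000040, 0x40000040):
    print(hex(value), decode_section_flags(value))
```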

