9.3. ELF Header


The first part of any ELF file (including object files such as foo.o) is the ELF header. There are several ways to look at the header. First, we’ll use a program that dumps a file’s raw data in hexadecimal and ASCII (a text representation) to see whether there is anything we can recognize.


penguin> hexdump -C foo.o | head

00000000  7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|

00000010  01 00 03 00 01 00 00 00 00 00 00 00 00 00 00 00 |................|

00000020  58 03 00 00 00 00 00 00 34 00 00 00 00 00 28 00 |X.......4.....(.|

00000030  12 00 0f 00 55 89 e5 83 ec 08 c7 45 fc 00 00 00 |....U......E....|

00000040  00 83 ec 0c ff 75 08 e8 fc ff ff ff 83 c4 10 03 |.....u..........|

00000050  05 00 00 00 00 89 45 fc 8b 15 04 00 00 00 8d 45 |......E........E|

00000060  fc 01 10 8d 45 fc 83 00 05 8b 45 fc c9 c3 55 89 |....E.....E...U.|

00000070  e5 83 ec 08 83 ec 0c ff 75 08 e8 b5 ff ff ff 83 |........u.......|

00000080  c4 10 83 ec 0c 68 20 00 00 00 e8 fc ff ff ff 83 |.....h .........|

00000090  c4 10 b8 00 00 00 00 c9 c3 90 55 89 e5 83 ec 08 |..........U.....|

 

 

                                                

Note: hexdump is used in this chapter to show a raw hex dump of ELF files. The od tool can also be used.


 

At first glance, the only recognizable thing is the “ELF” text at the beginning of the file in the ASCII part of the output. This confirms visually that the file is an ELF file, but to understand the rest, we need to look at the structure of the ELF header.

The structure of the ELF header is documented in various papers on the ELF specification as well as in the /usr/include/elf.h file on Linux. The structure listed here is for 32-bit ELF files (refer to the elf.h header file to see the 64-bit version):

#define EI_NIDENT      16

 

typedef struct {

        unsigned char  e_ident[EI_NIDENT]; /* ident bytes */

        Elf32_Half     e_type;             /* file type */

        Elf32_Half     e_machine;          /* target machine */

        Elf32_Word     e_version;          /* file version */

        Elf32_Addr     e_entry;            /* start address */

        Elf32_Off      e_phoff;            /* phdr file offset */

        Elf32_Off      e_shoff;            /* shdr file offset */

        Elf32_Word     e_flags;            /* file flags */

        Elf32_Half     e_ehsize;           /* sizeof ehdr */

        Elf32_Half     e_phentsize;        /* sizeof phdr */

        Elf32_Half     e_phnum;            /* number phdrs */

        Elf32_Half     e_shentsize;        /* sizeof shdr */

        Elf32_Half     e_shnum;            /* number shdrs */

        Elf32_Half     e_shstrndx;         /* shdr string index */

} Elf32_Ehdr;
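As a quick cross-check of this layout, the following Python sketch mirrors the structure with the standard struct module and computes its size. The format string is an assumption based on the usual 32-bit ELF type widths (Elf32_Half as 2 bytes; Elf32_Word, Elf32_Addr, and Elf32_Off as 4 bytes each); it is not taken from elf.h itself.

```python
import struct

# Little-endian layout mirroring Elf32_Ehdr (assumed type widths):
#   16s    e_ident[EI_NIDENT]
#   H H    e_type, e_machine                    (2 bytes each)
#   5I     e_version, e_entry, e_phoff,
#          e_shoff, e_flags                     (4 bytes each)
#   6H     e_ehsize, e_phentsize, e_phnum,
#          e_shentsize, e_shnum, e_shstrndx     (2 bytes each)
ELF32_EHDR_FORMAT = "<16sHH5I6H"

# "<" disables alignment padding, so the size is the plain sum of the fields.
ehdr_size = struct.calcsize(ELF32_EHDR_FORMAT)
print(ehdr_size)
```

Under these assumptions the size works out to 52 bytes, which is the value a 32-bit ELF file records in its e_ehsize field.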

 

If we map this structure to the raw output from the hex dump, we see that the first 16 bytes are the e_ident field, and the first four bytes include the text “ELF.” In fact, every ELF file begins with the four bytes 0x7f, E, L, and F to identify the file type as ELF. This is called a magic number. Magic numbers are used in many file formats, and the command file foo.o (referenced earlier in the chapter) uses this magic number to identify the object file as an ELF file (see the /etc/magic file or read the man page for magic for more information on magic numbers).
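The magic number check is simple enough to sketch in a few lines of Python. The function name looks_like_elf is hypothetical, and the sample bytes are the first 16 bytes of foo.o copied from the hex dump; the real file command consults a much larger database of signatures.

```python
ELF_MAGIC = b"\x7fELF"  # 0x7f followed by the ASCII letters E, L, F


def looks_like_elf(data: bytes) -> bool:
    """Return True if the buffer starts with the four ELF magic bytes."""
    return data[:4] == ELF_MAGIC


# First 16 bytes of foo.o, copied from the hex dump.
foo_o_ident = bytes.fromhex("7f454c46010101000000000000000000")

print(looks_like_elf(foo_o_ident))     # True
print(looks_like_elf(b"#!/bin/sh\n"))  # False
```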

Here are the fields in the e_ident array (one byte per field) of the ELF header:

0. 0x7f

1. E

2. L

3. F

4. EI_CLASS       : ELF Class

5. EI_DATA        : Data encoding: big or little endian

6. EI_VERSION     : Must be EV_CURRENT (value of 1)

7. EI_OSABI       : Application binary interface (ABI) type

8. EI_ABIVERSION  : ABI version

9. EI_PAD         : Start of padding bytes (continues until end of array)

 

From /usr/include/elf.h, we have:

#define ELFCLASSNONE   0           /* EI_CLASS */

#define ELFCLASS32     1

#define ELFCLASS64     2

#define ELFCLASSNUM    3

 

#define ELFDATANONE    0           /* e_ident[EI_DATA] */

#define ELFDATA2LSB    1

#define ELFDATA2MSB    2

 

#define EV_NONE        0           /* e_version, EI_VERSION */

#define EV_CURRENT     1

 

Let’s use the data from a hex dump to map these three values from the ELF file foo.o:



penguin> hexdump -C foo.o | head -1

00000000  7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|

 

According to the hex dump, the “class” field byte at offset 4 (starting from 0x0) is 1, the data encoding field (byte at offset 5) is 1, and the version (byte at offset 6) is 1.


Thus, the class is 32-bit (ELFCLASS32), the data encoding is LSB (least significant byte first, ELFDATA2LSB) or “little endian,” and the version is EV_CURRENT (which it must be).
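The same lookup can be written as a small Python sketch. The index values and constant names come from the e_ident list and the elf.h excerpt; the function name describe_ident and the dictionary layout are illustrative assumptions.

```python
# Positions of the three fields inside e_ident.
EI_CLASS, EI_DATA, EI_VERSION = 4, 5, 6

# Value-to-name tables, following the elf.h excerpt.
CLASS_NAMES = {0: "ELFCLASSNONE", 1: "ELFCLASS32", 2: "ELFCLASS64"}
DATA_NAMES = {0: "ELFDATANONE", 1: "ELFDATA2LSB", 2: "ELFDATA2MSB"}
VERSION_NAMES = {0: "EV_NONE", 1: "EV_CURRENT"}


def describe_ident(e_ident: bytes) -> dict:
    """Translate the class, data encoding, and version bytes into names."""
    return {
        "class": CLASS_NAMES.get(e_ident[EI_CLASS], "unknown"),
        "data": DATA_NAMES.get(e_ident[EI_DATA], "unknown"),
        "version": VERSION_NAMES.get(e_ident[EI_VERSION], "unknown"),
    }


# First 16 bytes of foo.o, copied from the hex dump.
e_ident = bytes.fromhex("7f454c46010101000000000000000000")
print(describe_ident(e_ident))
```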

We can also map the next couple of fields in the ELF header structure using the raw output. The next two fields, e_type and e_machine, are 16 (EI_NIDENT) and 18 (EI_NIDENT + 2) bytes past the beginning of the file, that is, at offsets 0x10 and 0x12 in hexadecimal:

penguin> hexdump -C foo.o |head -2

00000000 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|

00000010 01 00 03 00 01 00 00 00 00 00 00 00 00 00 00 00 |................|

 

From the output, the e_type field is 0x1 (elf.h: ET_REL), which is used for relocatable files. The e_machine field is 0x3 (elf.h: EM_386), which identifies files built for the x86 architecture.
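To read these two fields programmatically rather than by eye, a sketch along the following lines could be used. It assumes a little-endian 32-bit file and hard-codes the first 32 bytes of foo.o from the hex dump; the name tables list only the two values discussed here, although elf.h defines many more.

```python
import struct

EI_NIDENT = 16

# Only the values discussed in the text; elf.h defines many more.
TYPE_NAMES = {1: "ET_REL"}
MACHINE_NAMES = {3: "EM_386"}

# First 32 bytes of foo.o, copied from the two hex dump lines.
header = bytes.fromhex(
    "7f454c46010101000000000000000000"
    "01000300010000000000000000000000"
)

# "<HH" reads two little-endian 16-bit values starting at offset 16 (0x10).
e_type, e_machine = struct.unpack_from("<HH", header, EI_NIDENT)

print(hex(e_type), TYPE_NAMES.get(e_type, "unknown"))
print(hex(e_machine), MACHINE_NAMES.get(e_machine, "unknown"))
```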

Note: Because this platform is little endian, the bytes of each multi-byte value must be reversed to read it in big endian order, the order that humans are generally more comfortable with. For little endian, a hex dumped value of 01 00 is actually 0001 or 0x1 (little endian is covered in more detail in the GDB chapter of this book).
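The effect of byte order can be illustrated with Python's built-in int.from_bytes, which interprets the same two dumped bytes either way. This is only an illustration of the rule in the note, not part of any ELF tool.

```python
raw = bytes.fromhex("0100")  # the two bytes as they appear in the hex dump

as_little = int.from_bytes(raw, "little")  # least significant byte first
as_big = int.from_bytes(raw, "big")        # most significant byte first

print(hex(as_little))  # 0x1
print(hex(as_big))     # 0x100
```

Reading the bytes as little endian gives 0x1, the ET_REL value; reading them as big endian would give 0x100, which is why the encoding byte in e_ident has to be consulted first.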

 

Mapping the ELF structure to the raw hex and ASCII output certainly works, and it shows that there is no real magic or mystery behind the ELF object types, but it is inconvenient. Fortunately, Linux provides a much easier way to display the ELF header:

penguin> readelf -h foo.o

ELF Header:

  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00

  Class:                             ELF32

  Data:                              2's complement, little endian

  Version:                           1 (current)

  OS/ABI:                            UNIX - System V

  ABI Version:                       0

  Type:                              REL (Relocatable file)

  Machine:                           Intel 80386

  Version:                           0x1

  Entry point address:               0x0

  Start of program headers:          0 (bytes into file)

  Start of section headers:          856 (bytes into file)

  Flags:                             0x0

  Size of this header:               52 (bytes)

  Size of program headers:           0 (bytes)

  Number of program headers:         0

  Size of section headers:           40 (bytes)

  Number of section headers:         18

  Section header string table index: 15

 

The last 10 values in the output correspond directly to the last 10 fields in the ELF header structure but without the work of having to find and format the information by hand.

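For comparison, here is a minimal Python sketch that unpacks the whole header in one step and arrives at the same numbers as readelf. It assumes a little-endian 32-bit file and uses the first 52 bytes of foo.o as they appear in the hex dump at the start of this section; the Elf32Ehdr namedtuple is a convenience introduced for this example.

```python
import struct
from collections import namedtuple

# Field names follow the Elf32_Ehdr structure; the tuple type is illustrative.
Elf32Ehdr = namedtuple(
    "Elf32Ehdr",
    "e_ident e_type e_machine e_version e_entry e_phoff e_shoff e_flags "
    "e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx",
)
ELF32_EHDR_FORMAT = "<16sHH5I6H"  # little endian, no padding, 52 bytes

# First 52 bytes of foo.o, copied from the hex dump.
raw = bytes.fromhex(
    "7f454c46010101000000000000000000"
    "01000300010000000000000000000000"
    "58030000000000003400000000002800"
    "12000f00"
)

ehdr = Elf32Ehdr._make(struct.unpack(ELF32_EHDR_FORMAT, raw))

print("Start of section headers:", ehdr.e_shoff)
print("Size of this header:", ehdr.e_ehsize)
print("Size of section headers:", ehdr.e_shentsize)
print("Number of section headers:", ehdr.e_shnum)
print("Section header string table index:", ehdr.e_shstrndx)
```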

Next, let’s take a look at how the ELF header differs between ELF file types. We’ll look at object files (which we just looked at), shared libraries, executables, and core files.

Here is the ELF header for an executable:


 penguin> readelf -h foo

ELF Header:

  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00

  Class:                             ELF32

  Data:                              2's complement, little endian

  Version:                           1 (current)

  OS/ABI:                            UNIX - System V

  ABI Version:                       0

  Type:                              EXEC (Executable file)

  Machine:                           Intel 80386

  Version:                           0x1

  Entry point address:               0x8048540

  Start of program headers:          52 (bytes into file)

  Start of section headers:          9292 (bytes into file)

  Flags:                             0x0

  Size of this header:               52 (bytes)

  Size of program headers:           32 (bytes)

  Number of program headers:         7

  Size of section headers:           40 (bytes)

  Number of section headers:         35

  Section header string table index: 32

 

Besides the obvious difference that the e_type is EXEC and not REL, as it was for the object file, the e_entry and e_phoff fields are also defined for the executable. This is the information needed to load the executable into memory and start it running. This information is missing from object files, which is one of the reasons they cannot be run directly.


The e_entry (entry point) field contains the virtual address of the starting function for an ELF file. This field is usually only used for executable files. For executable files on Linux, this field contains the address of the _start() function, which runs before main() and ensures the proper startup of the executable. Eventually, _start() calls main() to hand control over to the user-written code. Using the nm utility, we can display the symbol table and confirm that the _start function is at 0x08048540. When the executable first starts up, this is the first function that is called in the executable.

penguin> ls -l foo

-rwxr-xr-x    1 wilding  build    12609 Jan  9 11:30 foo

penguin> nm foo | egrep ' _start$'

08048540 T _start

Note: _start is a special function that initializes a new running process. It is run before main().


 

One thing worth noting is that the “offset” (first field) in the nm output is larger than the file itself. The foo executable is only 12609 (0x3141) bytes, yet nm reports _start() at 0x08048540. The reason is that the value nm displays is a virtual address rather than a file offset: ELF provides the ability to specify a load address for a segment of an ELF file. The load address is the address where a segment should be loaded into memory. On Linux (x86 architecture), the load address for the segment that contains the machine instructions of a 32-bit executable is 0x08048000. This address is platform-specific and defined as part of the ABI (application binary interface). This value is added to the offsets of the symbols to provide the value displayed by nm. For more information on the load address, refer to the heading, “Segments and the Program Header Table,” later in the chapter.
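The arithmetic can be sketched as follows. The sketch assumes that the segment holding the machine instructions starts at file offset 0, which is typical for this kind of executable but is an assumption here; the function name vaddr_to_file_offset is hypothetical.

```python
LOAD_ADDRESS = 0x08048000  # load address of the segment, as given in the text
SEGMENT_FILE_OFFSET = 0x0  # assumed: the segment starts at the top of the file
FILE_SIZE = 12609          # size of foo as reported by ls -l


def vaddr_to_file_offset(vaddr: int) -> int:
    """Convert a virtual address inside the segment to an offset in the file."""
    return vaddr - LOAD_ADDRESS + SEGMENT_FILE_OFFSET


start_offset = vaddr_to_file_offset(0x08048540)  # address of _start from nm

print(hex(start_offset))
print(start_offset < FILE_SIZE)
```

Under that assumption, _start sits 0x540 bytes into the file, comfortably inside its 12609 bytes.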

The e_phoff (“start of program headers”) field contains the file offset for the program header table. The program header table is required for executables and shared libraries and defines the various segments in an ELF file. A segment is a contiguous part or range of an ELF object and has specific memory attributes such as read, write, and execute. A segment is meant to be loaded into memory with the corresponding memory attributes. The e_phentsize (“size of program headers”) field defines the size of an entry in the program header table. The e_phnum (“number of program headers”) field defines the number of entries in the program header table. All entries in the program header table have the same fixed size.

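Because every entry has the same size, the file offset of each program header follows directly from these three fields. The sketch below uses the values that readelf -h reported for foo (e_phoff 52, e_phentsize 32, e_phnum 7); the function name program_header_offsets is an illustrative assumption.

```python
def program_header_offsets(e_phoff: int, e_phentsize: int, e_phnum: int) -> list:
    """Return the file offset of each entry in the program header table."""
    return [e_phoff + index * e_phentsize for index in range(e_phnum)]


# Values reported by readelf -h for the foo executable.
E_PHOFF, E_PHENTSIZE, E_PHNUM = 52, 32, 7

offsets = program_header_offsets(E_PHOFF, E_PHENTSIZE, E_PHNUM)
table_end = E_PHOFF + E_PHENTSIZE * E_PHNUM  # first byte after the table

print(offsets)
print(table_end, hex(table_end))
```

The table occupies 7 * 32 = 224 (0xe0) bytes, starting immediately after the 52-byte ELF header.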

Note: The only part of an ELF file that has a fixed location is the ELF header. All other parts of an ELF file are referenced by offset, starting with the offsets listed in the ELF header.

 

The ELF header for a shared library is similar to that of an executable, and in fact, the two file types are almost identical. A core file, on the other hand, has some significant differences. A core file is the memory image of a once-running process. Because there is no need to execute it, there is no need for sections of the core file to contain machine instructions. There is a need, however, to load parts of a core file into memory (for example, when using a debugger), and thus there are some program headers (segments).


penguin> ls -l core

-rw-------    1 wilding build    184320 Oct 14 16:36 core

penguin> file core

core: ELF 32-bit LSB core file of  'excp' (signal 6), Intel 80386,

version 1 (SYSV), from 'excp'

penguin> readelf -h core |tail

  Entry point address:               0x0

  Start of program headers:          52 (bytes into file)

  Start of section headers:          0 (bytes into file)

  Flags:                             0x0

  Size of this header:               52 (bytes)

  Size of program headers:           32 (bytes)

  Number of program headers:         17

  Size of section headers:           0 (bytes)

  Number of section headers:         0

  Section header string table index: 0

 

Notice that there is no entry point and no section headers. Sections and segments are two different types of ELF file parts and really deserve a good explanation.


9.4. Overview of Segments and Sections

An ELF file can be interpreted in two ways: as a set of segments or as a set of sections. Sections are smaller pieces of an ELF file that contain very specific information, such as the machine instructions or the symbol table. Segments are larger groupings of one or more sections, all of which have the same memory attributes.


Using an analogy of a car, the “sections” of the car would be the undeniable features of that car such as seats, the glove compartment, the gas pedal, the steering wheel, the rear window, and the dashboard controls. Regardless of how these are grouped, they exist and can be separated from the car if needed. Segments, on the other hand, are not as concrete or real but rather are more like a grouping of sections. For example, we could have front and back segments. The front segment would contain the steering wheel, the front seats, and so on. The back segment would contain the rear window, the back seat, etc. We could also split the car into left and right segments. Or we could create overlapping segments such as a front segment and a left segment. In fact, one segment could completely contain another segment. Regardless of how we group the “sections” of the car into segments, the sections remain the same. The location of the sections in the car is important, however; the car wouldn’t be very practical with the steering wheel in the back seat!

The grouping of sections into segments for executable foo is shown in the following command:



penguin> readelf -l foo

 

Elf file type is EXEC (Executable file)

Entry point 0x8048540

There are 7 program headers, starting at offset 52

 

Program Headers:

  Type     Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align

  PHDR     0x000034 0x08048034 0x08048034 0x000e0 0x000e0 R E 0x4

  INTERP   0x000114 0x08048114 0x08048114 0x00013 0x00013 R   0x1

      [Requesting program interpreter: /lib/ld-linux.so.2]

  LOAD     0x000000 0x08048000 0x08048000 0x0076c 0x0076c R E 0x1000

  LOAD     0x00076c 0x0804976c 0x0804976c 0x001d4 0x001dc RW  0x1000

  DYNAMIC  0x000810 0x08049810 0x08049810 0x000f0 0x000f0 RW  0x4

  NOTE     0x000128 0x08048128 0x08048128 0x00020 0x00020 R   0x4

  GNU_EH_FRAME 0x000748 0x08048748 0x08048748 0x00024 0x00024 R 0x4

 

Section to Segment mapping:

 Segment Sections...

 00

 01      .interp

 02      .interp .note.ABI-tag .hash .dynsym .dynstr .gnu.version .gnu.version_r  .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame_hdr

 03      .data .eh_frame .dynamic .ctors .dtors .jcr .got .bss

 04      .dynamic

 05      .note.ABI-tag

 06      .eh_frame_hdr

 

The second part of the output shows which sections are contained in which segments. Notice that the “.interp” section is contained by both segment 1 and segment 2. The first part of the output, “Program Headers,” will be explained in more detail next.

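A section such as .interp can appear under two segments because one segment's file range can lie entirely inside another's. The following sketch checks this for the seven program headers, using the Offset and FileSiz columns copied from the readelf -l output; the helper name contains and the tuple layout are assumptions made for this example.

```python
# (type, file offset, file size) for segments 00 through 06, from readelf -l foo.
SEGMENTS = [
    ("PHDR", 0x000034, 0x000e0),
    ("INTERP", 0x000114, 0x00013),
    ("LOAD", 0x000000, 0x0076c),
    ("LOAD", 0x00076c, 0x001d4),
    ("DYNAMIC", 0x000810, 0x000f0),
    ("NOTE", 0x000128, 0x00020),
    ("GNU_EH_FRAME", 0x000748, 0x00024),
]


def contains(outer, inner) -> bool:
    """True if inner's file range lies entirely within outer's file range."""
    _, outer_off, outer_size = outer
    _, inner_off, inner_size = inner
    return outer_off <= inner_off and inner_off + inner_size <= outer_off + outer_size


# For each non-LOAD segment, list the LOAD segments that fully contain it.
nesting = {}
for index, segment in enumerate(SEGMENTS):
    if segment[0] == "LOAD":
        continue
    nesting[index] = [
        other_index
        for other_index, other in enumerate(SEGMENTS)
        if other[0] == "LOAD" and contains(other, segment)
    ]

for index, parents in nesting.items():
    print(f"segment {index:02d} ({SEGMENTS[index][0]}) lies inside LOAD segment(s) {parents}")
```

The INTERP segment (01) falls inside the first LOAD segment (02), which is consistent with .interp being listed under both.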

 
