9.9. Program Interpreter


The term “program interpreter” comes from the ELF standard. On Linux, the program interpreter is ld.so (/lib/ld-linux.so.2 on the system shown here), the run time linker/loader. It is responsible for bringing up an executable and getting it running. The kernel invokes it and passes it a special array of information called the auxiliary vector. The contents of this vector can be displayed by setting the special environment variable LD_SHOW_AUXV:


penguin> export LD_SHOW_AUXV=true
penguin> foo
AT_HWCAP:    fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 19 21 22 mmx osfxsr xmm xmm2 27 28 29
AT_PAGESZ:      4096
AT_CLKTCK:      100
AT_PHDR:        0x8048034
AT_PHENT:       32
AT_PHNUM:       7
AT_BASE:        0x40000000
AT_FLAGS:       0x0
AT_ENTRY:       0x8048540
AT_UID:         7903
AT_EUID:        7903
AT_GID:         200
AT_EGID:        200
AT_PLATFORM:    i686
This is a printf format string in baz
This is a printf format string in main

 

Brief definitions of the various fields can be found in /usr/include/elf.h:



/* Legal values for a_type (entry type). */

#define AT_NULL      0        /* End of vector */
#define AT_IGNORE    1        /* Entry should be ignored */
#define AT_EXECFD    2        /* File descriptor of program */
#define AT_PHDR      3        /* Program headers for program */
#define AT_PHENT     4        /* Size of program header entry */
#define AT_PHNUM     5        /* Number of program headers */
#define AT_PAGESZ    6        /* System page size */
#define AT_BASE      7        /* Base address of interpreter */
#define AT_FLAGS     8        /* Flags */
#define AT_ENTRY     9        /* Entry point of program */
#define AT_NOTELF   10        /* Program is not ELF */
#define AT_UID      11        /* Real uid */
#define AT_EUID     12        /* Effective uid */
#define AT_GID      13        /* Real gid */
#define AT_EGID     14        /* Effective gid */
#define AT_CLKTCK   17        /* Frequency of times() */

/* Some more special a_type values describing the hardware. */
#define AT_PLATFORM 15        /* String identifying platform. */
#define AT_HWCAP    16        /* Machine dependent hints about processor capabilities.  */

                             

 

Here is some additional information about the various fields:


AT_PAGESZ:

The standard page size used on this operating system for normal memory regions. Other memory regions (such as shared memory) can have larger page sizes. It is assumed that normal memory regions are used when loading ELF objects into memory.

AT_PHDR:

The address of the program header table for the executable.

AT_PHENT:

The size of an entry in the program header table.

AT_PHNUM:

The number of entries in the program header table. Note that with AT_PHDR, AT_PHENT, and AT_PHNUM, the program interpreter can find all of the loadable segments of an ELF object file.

AT_BASE:

The address of the program interpreter (/lib/ld-linux.so.2 in this case) itself.

AT_ENTRY:

Entry point of the program. This is the address in the executable that the run time loader hands control to after finishing program initialization. This is usually the _start function.

AT_PLATFORM:

The current hardware platform.

AT_UID, AT_EUID, AT_GID, AT_EGID:

The real user ID, effective user ID, real group ID, and effective group ID of the process, respectively.
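
How AT_PHDR, AT_PHENT, and AT_PHNUM combine to locate the loadable segments can be sketched as follows. This assumes glibc's getauxval() and a process started normally by the kernel, so that the three entries describe the executable itself. The function name count_load_segments is made up for illustration.

```c
#include <link.h>       /* ElfW() and, through <elf.h>, Phdr and PT_LOAD */
#include <stddef.h>
#include <sys/auxv.h>   /* getauxval() (glibc 2.16+) */

/* Count the PT_LOAD entries in the program header table that the
   auxiliary vector points at (hypothetical helper). */
size_t count_load_segments(void)
{
    const unsigned char *table = (const unsigned char *)getauxval(AT_PHDR);
    unsigned long entry_size = getauxval(AT_PHENT);
    unsigned long entry_count = getauxval(AT_PHNUM);
    size_t loadable = 0;

    if (table == NULL)
        return 0;

    /* Step through the table using the entry size from AT_PHENT
       rather than assuming sizeof(ElfW(Phdr)). */
    for (unsigned long i = 0; i < entry_count; i++) {
        const ElfW(Phdr) *ph =
            (const ElfW(Phdr) *)(table + i * entry_size);
        if (ph->p_type == PT_LOAD)
            loadable++;
    }
    return loadable;
}
```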

 

The kernel on some platforms that support ELF may choose not to load the program but instead pass an open file descriptor to the run time loader/linker so that it can load the program on its own. In this case, the auxiliary vector will include another field called AT_EXECFD.


It is the run time loader/linker’s responsibility to load up the program if needed and perform all initialization. The initialization includes finding all required libraries, calling initialization functions, performing required relocations, and so on. However, before it initializes the program, it needs first to initialize itself. This is actually a fairly complex process that is beyond the scope of this chapter. The reason for its complexity is that the run time linker has to do this manually because the regular methods rely on some basic setup that does not exist when the run time linker starts.
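
One common source of the initialization functions mentioned above is code marked with the constructor attribute. The sketch below assumes GCC or Clang, since __attribute__((constructor)) is a compiler extension rather than standard C, and all names in it are made up for illustration. A function marked this way runs during program startup, before main is entered.

```c
#include <stdio.h>

/* Set by the constructor below, which runs before main(). */
static int initialized_before_main = 0;

/* GCC/Clang extension: ask for this function to be run during
   startup, before control reaches main(). */
__attribute__((constructor))
static void early_init(void)
{
    initialized_before_main = 1;
}

/* Report whether the constructor has already run (hypothetical helper). */
int init_ran_before_main(void)
{
    return initialized_before_main;
}
```

If main calls init_ran_before_main(), it should see 1, because the flag was set before main started.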

The run time linker/loader ld.so also supports a special environment variable, LD_DEBUG, to help debug it. Setting it instructs ld.so to show all the main activity while it brings up a program. In other words, it acts as a trace of the run time linker/loader. Here is an example of this special debug mode in action:



penguin> export LD_DEBUG=all
penguin> foo
27080:
27080:  file=libfoo.so; needed by foo
27080:  find library=libfoo.so; searching
27080:   search path=./i686/mmx:./i686:./mmx:. (RPATH from file foo)
27080:    trying file=./i686/mmx/libfoo.so
27080:    trying file=./i686/libfoo.so
27080:    trying file=./mmx/libfoo.so
27080:    trying file=./libfoo.so
27080:
27080:  file=libfoo.so; generating link map
27080:    dynamic: 0x40015c48 base: 0x40014000   size: 0x00001d84
27080:      entry: 0x400147e0 phdr: 0x40014034  phnum:          4
27080:
27080:
27080:  file=libstdc++.so.5; needed by foo
27080:  find library=libstdc++.so.5; searching
27080:   search path=./i686/mmx:./i686:./mmx:. (RPATH from file foo)
27080:    trying file=./i686/mmx/libstdc++.so.5
27080:    trying file=./i686/libstdc++.so.5
27080:    trying file=./mmx/libstdc++.so.5
27080:    trying file=./libstdc++.so.5
27080:   search path=/usr/lib/i686/mmx:/usr/lib/i686:/usr/lib/mmx:/usr/lib (system search path)
27080:    trying file=/usr/lib/i686/mmx/libstdc++.so.5
27080:    trying file=/usr/lib/i686/libstdc++.so.5
27080:    trying file=/usr/lib/mmx/libstdc++.so.5
27080:    trying file=/usr/lib/libstdc++.so.5
27080:
27080:  file=libstdc++.so.5; generating link map
27080:    dynamic: 0x400c246c base: 0x40016000   size: 0x000b23c0
27080:      entry: 0x40050700 phdr: 0x40016034  phnum:          4
<...>

 

This first part is called “loading” and involves finding and loading all required shared libraries. The run time linker tries candidate directories taken from the RPATH recorded in the executable, from LD_LIBRARY_PATH when it is set, and from the system search path. Note the text “generating link map” in the output; it is described in more detail shortly. Let’s see what else is in this debug output:


<...>

27080:
27080: calling init: ./libfoo.so
27080:
<...>

 

Here we see the init function in libfoo.so being called. This is before control has officially been handed over to the executable. The output continues...


<...>

27080:
27080: initialize program: foo
27080:
27080:
27080: transferring control: foo
27080:
<...>

 

This is where control is officially handed over to the executable foo. After this point, the contents of the debug output are for “late” or “lazy” binding:



<...>

27080:  symbol=_Z3bazi;  lookup in file=foo
27080:  symbol=_Z3bazi;  lookup in file=./libfoo.so
27080:  binding file foo to ./libfoo.so: normal symbol '_Z3bazi'
27080:  symbol=_Z3fooi;  lookup in file=foo
27080:  symbol=_Z3fooi;  lookup in file=./libfoo.so
27080:  symbol=_Z3fooi;  lookup in file=/usr/lib/libstdc++.so.5
27080:  symbol=_Z3fooi;  lookup in file=/lib/libm.so.6
27080:  symbol=_Z3fooi;  lookup in file=/lib/libgcc_s.so.1
27080:  symbol=_Z3fooi;  lookup in file=/lib/libc.so.6
27080:  symbol=_Z3fooi;  lookup in file=/lib/ld-linux.so.2
27080:  binding file ./libfoo.so to ./libfoo.so: normal symbol '_Z3fooi'
27080:  symbol=printf;  lookup in file=foo
27080:  symbol=printf;  lookup in file=./libfoo.so
27080:  symbol=printf;  lookup in file=/usr/lib/libstdc++.so.5
27080:  symbol=printf;  lookup in file=/lib/libm.so.6
27080:  symbol=printf;  lookup in file=/lib/libgcc_s.so.1
27080:  symbol=printf;  lookup in file=/lib/libc.so.6
27080:  binding file ./libfoo.so to /lib/libc.so.6: normal symbol 'printf' [GLIBC_2.0]
<...>

 

These binding actions are driven by the _dl_runtime_resolve function described back in the “.plt” section of this chapter.


9.9.1. Link Map

Remember that text, “generating link map,” from the output from this special debug mode? A link map contains information about a shared library that has been loaded into the address space.


There is a special variable called _dl_main_searchlist that has the following structure:


struct
{
  /* Array of maps for the scope. */
  struct link_map **r_list;
  /* Number of entries in the scope. */
  unsigned int r_nlist;
};

 

From within GDB (the process has to be running for this to be useful), we can see the values of the two structure members:


(gdb) x/2x _dl_main_searchlist
0x400130c8:     0x40223030      0x00000007

 

The first value is the address of the list, and the second value is the number of elements in the list. Looking at the seven values in memory, we get the following:


(gdb) x/7 0x40223030
0x40223030:    0x40012fd0    0x40013590    0x400137f8    0x400139e8
0x40223040:    0x40013bd0    0x40013dc0    0x40012d80

 

Each of these values is a pointer to the following structure:


struct link_map
  {
    /* These first few members are part of the protocol with the debugger.
       This is the same format used in SVR4.  */

    ElfW(Addr) l_addr; /* Base address shared object is loaded at.*/
    char *l_name;      /* Absolute file name object was found in. */
    ElfW(Dyn) *l_ld;   /* Dynamic section of the shared object.   */
    struct link_map *l_next, *l_prev; /* Chain of loaded objects.*/
  };

 

This is the link map that the ld.so output referred to. Let’s look at the second address in the list:


(gdb) x/5x 0x40013590
0x40013590:  0x40014000   0x40013580   0x40015c48   0x400137f8
0x400135a0:  0x40012fd0

 

According to the link_map structure, the second value should be the path of a loaded library, confirmed below:


(gdb) x/s 0x40013580
0x40013580:      "./libfoo.so"

 

The next link_map value is another library:

(gdb) x/5x 0x400137f8
0x400137f8:   0x40016000   0x400137e0   0x400c246c  0x400139e8
0x40013808:   0x40013590
(gdb) x/s 0x400137e0
0x400137e0:   "/usr/lib/libstdc++.so.5"

 

Also notice that the fourth value in each link_map structure (l_next) points to the next link_map structure, and the fifth value (l_prev) points to the previous one. The structures are therefore reachable both through a linked list and through an array of pointers.


The list of loaded libraries is used by the run time linker to keep track of the loaded libraries for a process.
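
The same chain can be walked from inside a program instead of from GDB. The sketch below assumes a dynamically linked program on glibc, where <link.h> declares the debugger interface structure _r_debug, whose r_map member points at the first link_map. The function name print_link_maps is made up for illustration.

```c
#include <link.h>    /* struct link_map, struct r_debug, _r_debug */
#include <stdio.h>

/* Follow l_next from the head of the chain, printing the base address
   and name of every loaded object (hypothetical helper).
   Returns the number of link maps visited. */
int print_link_maps(FILE *out)
{
    int count = 0;

    for (const struct link_map *map = _r_debug.r_map;
         map != NULL;
         map = map->l_next) {
        /* Some entries, typically the main program, have an empty name. */
        const char *name = (map->l_name != NULL && map->l_name[0] != '\0')
                               ? map->l_name
                               : "(unnamed)";
        fprintf(out, "%#lx  %s\n", (unsigned long)map->l_addr, name);
        count++;
    }
    return count;
}
```

Each printed line corresponds to the l_addr and l_name members examined with GDB above.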


 
