9.9. Program Interpreter
The term “program interpreter” comes from the ELF standard. On Linux, the program interpreter is ld.so (/lib/ld-linux.so), the run time linker/loader. It is responsible for bringing up an executable and getting it running. The kernel calls it and passes it a special array of information called the auxiliary vector, which can be displayed by setting the special environment variable LD_SHOW_AUXV:
penguin> export LD_SHOW_AUXV=true
penguin> foo
AT_HWCAP: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 19 21 22 mmx osfxsr xmm xmm2 27 28 29
AT_PAGESZ: 4096
AT_CLKTCK: 100
AT_PHDR: 0x8048034
AT_PHENT: 32
AT_PHNUM: 7
AT_BASE: 0x40000000
AT_FLAGS: 0x0
AT_ENTRY: 0x8048540
AT_UID: 7903
AT_EUID: 7903
AT_GID: 200
AT_EGID: 200
AT_PLATFORM: i686
This is a printf format string in baz
This is a printf format string in main
Brief definitions of the various fields can be found in /usr/include/elf.h:
/* Legal values for a_type (entry type). */
#define AT_NULL 0 /* End of vector */
#define AT_IGNORE 1 /* Entry should be ignored */
#define AT_EXECFD 2 /* File descriptor of program */
#define AT_PHDR 3 /* Program headers for program */
#define AT_PHENT 4 /* Size of program header entry */
#define AT_PHNUM 5 /* Number of program headers */
#define AT_PAGESZ 6 /* System page size */
#define AT_BASE 7 /* Base address of interpreter */
#define AT_FLAGS 8 /* Flags */
#define AT_ENTRY 9 /* Entry point of program */
#define AT_NOTELF 10 /* Program is not ELF */
#define AT_UID 11 /* Real uid */
#define AT_EUID 12 /* Effective uid */
#define AT_GID 13 /* Real gid */
#define AT_EGID 14 /* Effective gid */
#define AT_CLKTCK 17 /* Frequency of times() */
/* Some more special a_type values describing the hardware. */
#define AT_PLATFORM 15 /* String identifying platform. */
#define AT_HWCAP 16 /* Machine dependent hints about processor capabilities. */
Here is some additional information about the various fields:
The kernel on some platforms that support ELF may choose not to load the program but instead pass an open file descriptor to the run time loader/linker so that it can load the program on its own. In this case, the auxiliary vector will include another field called AT_EXECFD.
It is the run time loader/linker’s responsibility to load the program if needed and to perform all initialization. The initialization includes finding all required libraries, calling initialization functions, performing required relocations, and so on. Before it initializes the program, however, it first needs to initialize itself. This is a fairly complex process that is beyond the scope of this chapter. It is complex because the run time linker has to do this work manually: the regular methods rely on basic setup that does not yet exist when the run time linker starts.
The run time linker/loader ld.so also has a special environment variable, LD_DEBUG, to help debug it. Setting it instructs ld.so to show all the main activity while it brings up a program. In other words, it acts like a trace of the run time linker/loader. Here is an example of this special debug mode in action:
penguin> export LD_DEBUG=all
penguin> foo
27080:
27080: file=libfoo.so; needed by foo
27080: find library=libfoo.so; searching
27080: search path=./i686/mmx:./i686:./mmx:. (RPATH from file foo)
27080: trying file=./i686/mmx/libfoo.so
27080: trying file=./i686/libfoo.so
27080: trying file=./mmx/libfoo.so
27080: trying file=./libfoo.so
27080:
27080: file=libfoo.so; generating link map
27080: dynamic: 0x40015c48 base: 0x40014000 size: 0x00001d84
27080: entry: 0x400147e0 phdr: 0x40014034 phnum: 4
27080:
27080:
27080: file=libstdc++.so.5; needed by foo
27080: find library=libstdc++.so.5; searching
27080: search path=./i686/mmx:./i686:./mmx:. (RPATH from file foo)
27080: trying file=./i686/mmx/libstdc++.so.5
27080: trying file=./i686/libstdc++.so.5
27080: trying file=./mmx/libstdc++.so.5
27080: trying file=./libstdc++.so.5
27080: search path=/usr/lib/i686/mmx:/usr/lib/i686:/usr/lib/mmx:/usr/lib (system search path)
27080: trying file=/usr/lib/i686/mmx/libstdc++.so.5
27080: trying file=/usr/lib/i686/libstdc++.so.5
27080: trying file=/usr/lib/mmx/libstdc++.so.5
27080: trying file=/usr/lib/libstdc++.so.5
27080:
27080: file=libstdc++.so.5; generating link map
27080: dynamic: 0x400c246c base: 0x40016000 size: 0x000b23c0
27080: entry: 0x40050700 phdr: 0x40016034 phnum: 4
<...>
The first part of the output is called “loading” and involves finding and loading all required shared libraries. The directories in the RPATH, in the search path (LD_LIBRARY_PATH), and in the system search path are tried as potential locations for each library. Note the text “generating link map” in the output; it is described in more detail shortly. Let’s see what else is in this debug output:
<...>
27080:
27080: calling init: ./libfoo.so
27080:
<...>
Here we see the init function in libfoo.so being called. This is before control has officially been handed over to the executable. The output continues...
<...>
27080:
27080: initialize program: foo
27080:
27080:
27080: transferring control: foo
27080:
<...>
The “transferring control” line marks where control is officially handed over to the executable foo. After this point, the debug output covers “late” or “lazy” binding:
<...>
27080: symbol=_Z3bazi; lookup in file=foo
27080: symbol=_Z3bazi; lookup in file=./libfoo.so
27080: binding file foo to ./libfoo.so: normal symbol '_Z3bazi'
27080: symbol=_Z3fooi; lookup in file=foo
27080: symbol=_Z3fooi; lookup in file=./libfoo.so
27080: symbol=_Z3fooi; lookup in file=/usr/lib/libstdc++.so.5
27080: symbol=_Z3fooi; lookup in file=/lib/libm.so.6
27080: symbol=_Z3fooi; lookup in file=/lib/libgcc_s.so.1
27080: symbol=_Z3fooi; lookup in file=/lib/libc.so.6
27080: symbol=_Z3fooi; lookup in file=/lib/ld-linux.so.2
27080: binding file ./libfoo.so to ./libfoo.so: normal symbol '_Z3fooi'
27080: symbol=printf; lookup in file=foo
27080: symbol=printf; lookup in file=./libfoo.so
27080: symbol=printf; lookup in file=/usr/lib/libstdc++.so.5
27080: symbol=printf; lookup in file=/lib/libm.so.6
27080: symbol=printf; lookup in file=/lib/libgcc_s.so.1
27080: symbol=printf; lookup in file=/lib/libc.so.6
27080: binding file ./libfoo.so to /lib/libc.so.6: normal symbol 'printf' [GLIBC_2.0]
<...>
These binding actions are driven by the _dl_runtime_resolve function described back in the “.plt” section of this chapter.
Remember that text, “generating link map,” from the output from this special debug mode? A link map contains information about a shared library that has been loaded into the address space.
There is a special variable called _dl_main_searchlist that has the following structure:
struct
{
    /* Array of maps for the scope. */
    struct link_map **r_list;
    /* Number of entries in the scope. */
    unsigned int r_nlist;
};
From within GDB (the process has to be running for this to be useful), we can see the values of the two structure members:
(gdb) x/2x _dl_main_searchlist
0x400130c8: 0x40223030 0x00000007
The first value is the address of the list, and the second value is the number of elements in the list. Looking at the seven values in memory, we get the following:
(gdb) x/7 0x40223030
0x40223030: 0x40012fd0 0x40013590 0x400137f8 0x400139e8
0x40223040: 0x40013bd0 0x40013dc0 0x40012d80
Each of these values is a pointer to the following structure:
struct link_map
{
    /* These first few members are part of the protocol with the
       debugger. This is the same format used in SVR4. */
    ElfW(Addr) l_addr;                /* Base address shared object is loaded at. */
    char *l_name;                     /* Absolute file name object was found in. */
    ElfW(Dyn) *l_ld;                  /* Dynamic section of the shared object. */
    struct link_map *l_next, *l_prev; /* Chain of loaded objects. */
};
This is the link map that the ld.so output referred to. Let’s look at the second address in the list:
(gdb) x/5x 0x40013590
0x40013590: 0x40014000 0x40013580 0x40015c48 0x400137f8
0x400135a0: 0x40012fd0
According to the link_map structure, the second value (l_name) should point to the path of a loaded library, which is confirmed below:
(gdb) x/s 0x40013580
0x40013580: "./libfoo.so"
The next link_map value is another library:
(gdb) x/5x 0x400137f8
0x400137f8: 0x40016000 0x400137e0 0x400c246c 0x400139e8
0x40013808: 0x40013590
(gdb) x/s 0x400137e0
0x400137e0: "/usr/lib/libstdc++.so.5"
Also notice that the fourth value in each link_map structure (l_next) points to the next link_map structure, and the fifth value (l_prev) points to the previous one. The link maps are therefore reachable both through a linked list and through an array of pointers to these structures.
The list of loaded libraries is used by the run time linker to keep track of the loaded libraries for a process.