6.5. Examining Data, Memory, and Registers


At this point, we’ve covered all of the ways to get a process (or a memory image of a process) into GDB. In this section, we discuss how to examine data, memory, and registers. The GDB commands introduced here work whether you are debugging a live process or performing a post-mortem analysis with a core file. Where there is a difference, a note indicates it.

6.5.1. Memory Map

Strictly speaking, viewing the memory map for a process is not part of GDB but is still very important to understand, which is why it is covered briefly here in the context of GDB and in more detail in Chapter 3, “The /proc Filesystem.” There must be a live process for there to be a corresponding “maps” file under /proc. You cannot get a memory map using /proc for a process image that is loaded in GDB as a core file.


The memory map is the list of memory segments (aka regions) that a process has in its address space. There is a memory segment for every type of memory that a process is using, including the process heap, the process stack, memory that stores the contents of an executable, memory for the shared libraries, and so on. Memory segments also have different attributes such as read, write, and execute. These attributes will depend on the purpose of the memory segment. Shared libraries, for example, will have a large read/execute segment that cannot be written to. This is for the machine code (the actual code that gets run) in the shared library.


The memory map is important for a few reasons:

  • It tells you which shared libraries are loaded and at which addresses.
  • It tells you where each memory segment is. Memory accessed outside of the valid memory segments will cause a segmentation violation.
  • The memory map will tell you a bit more about an address. If the address is in the memory map for a shared library, it is probably a global variable or function in that shared library.
  • You can tell if the heap or stack collided with another memory segment (for example, there is no space between the heap or stack and the next segment).

The best way to look at the memory map for a live process in GDB is to combine two commands: info program, which reports the process ID, and shell, which runs a program outside of GDB:


(gdb) info program
Using the running image of attached process 11702.
Program stopped at 0x4019fd01.
It stopped with signal SIGSTOP, Stopped (signal).
(gdb) shell cat /proc/11702/maps
08048000-08049000 r-xp 00000000 08:13 3647282  /home/wilding/src/Linuxbook/hang2
08049000-0804a000 rw-p 00000000 08:13 3647282  /home/wilding/src/Linuxbook/hang2
40000000-40012000 r-xp 00000000 08:13 1144740  /lib/ld-2.2.5.so
40012000-40013000 rw-p 00011000 08:13 1144740  /lib/ld-2.2.5.so
40013000-40014000 rw-p 00000000 00:00 0
40014000-400ad000 r-xp 00000000 08:13 1847971  /usr/lib/libstdc++.so.5.0.0
400ad000-400c2000 rw-p 00098000 08:13 1847971  /usr/lib/libstdc++.so.5.0.0
400c2000-400c7000 rw-p 00000000 00:00 0
400d7000-400f9000 r-xp 00000000 08:13 1144751  /lib/libm.so.6
400f9000-400fa000 rw-p 00021000 08:13 1144751  /lib/libm.so.6
400fa000-40101000 r-xp 00000000 08:13 1144783  /lib/libgcc_s.so.1
40101000-40102000 rw-p 00007000 08:13 1144783  /lib/libgcc_s.so.1
40102000-40216000 r-xp 00000000 08:13 1144746  /lib/libc.so.6
40216000-4021c000 rw-p 00113000 08:13 1144746  /lib/libc.so.6
4021c000-40221000 rw-p 00000000 00:00 0
bfffe000-c0000000 rwxp fffff000 00:00 0

 

For more information on a process’ address space and the various mappings, refer to the /proc/<pid>/maps section in Chapter 3.
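
To make the layout of a maps line concrete, here is a minimal sketch in C that parses one line and tests whether an address falls inside that segment. The names map_entry, parse_maps_line, and map_contains are invented for this illustration and are not part of GDB or any library. The sketch assumes the backing path contains no spaces.

```c
#include <stdio.h>
#include <string.h>

/* One parsed line of /proc/<pid>/maps (hypothetical helper type). */
struct map_entry {
    unsigned long start;   /* first address in the segment */
    unsigned long end;     /* first address past the segment */
    char perms[5];         /* for example "r-xp" */
    char path[256];        /* backing file, or "" for anonymous memory */
};

/* Parse one maps line; returns 1 on success, 0 if the line is malformed. */
int parse_maps_line(const char *line, struct map_entry *out)
{
    int n;

    out->path[0] = '\0';
    /* Offset, device, and inode are skipped. The path is optional,
       so 3 or 4 assigned fields are both acceptable. */
    n = sscanf(line, "%lx-%lx %4s %*s %*s %*s %255s",
               &out->start, &out->end, out->perms, out->path);
    return n >= 3;
}

/* Returns 1 if addr lies inside the segment described by e. */
int map_contains(const struct map_entry *e, unsigned long addr)
{
    return addr >= e->start && addr < e->end;
}
```

Calling parse_maps_line() on each line of the listing and then map_contains() with an address of interest answers the question "which segment, if any, does this address belong to?" An address that matches no segment would cause a segmentation violation if accessed.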


6.5.2. Stack

The stack is important because it contains information about “where” in the code a process is running. The stack contains one stack “frame” for each unfinished function that called another function. This leads to a hierarchical chain, or stack, of function callers and callees. The functions on the stack have not finished; in other words, each function will continue once the function it called returns. A “stack trace” or “back trace” is the list of functions on the stack. In GDB, the backtrace or bt command dumps the stack trace for the process currently being debugged:


(gdb) backtrace
#0 0x400d6f0b in pause () from /lib/i686/libc.so.6
#1 0x080483d0 in function4 (a=97 'a') at gdb_stack.c:9
#2 0x080483e9 in function3 (string=0xbffff340 "This is a local string") at gdb_stack.c:16
#3 0x0804843a in function2 (param=3) at gdb_stack.c:23
#4 0x08048456 in function1 (param=3) at gdb_stack.c:31
#5 0x0804847d in main () at gdb_stack.c:38

 

From this stack trace (a.k.a. back trace), we know that the main() function called function1(), that function1() called function2(), and so forth. The most recently called function, at the top of the stack, is pause(). If pause() finishes (that is, returns), function4() will continue. If function4() finishes, function3() will continue, and so on.

Note that in this output the arguments, filename, and line number for all functions except pause() are shown. This is because gdb_stack.c (a little program specifically for this example) was compiled with -g to include debug symbols, but the source used to create libc.so.6, where pause is contained, was not. If the program was built without -g, we would only know the address of the program counter for each stack frame and not the line of code.


Note: Some distributions strip their shared libraries, removing the main symbol table and other information. However, there is usually a non-stripped shared library (such as libc6-dbg.so) that contains the main symbol table and additional information useful for debugging.


 

The function names themselves aren’t stored in each stack frame. Function names are too long, and they aren’t much use to the computer. Instead, each stack frame contains the saved instruction pointer, or program counter. The saved program counter can be translated into a function name by finding which library or executable is loaded in the region of memory that contains the address, and then finding which function in that file contains it. The full set of steps to translate a program counter address into a line of code can be found in Chapter 4, which contains detailed information about compiling programs. More information on stack traces can be found in Chapter 5.
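
The address-to-name translation can be sketched as a lookup in a table of function start addresses. In the C sketch below, the start addresses in symtab are hypothetical values chosen only so that the program counters from the backtrace fall inside the right functions; the real values would come from the executable’s symbol table. The function name pc_to_function is also invented for this illustration.

```c
#include <stddef.h>

/* A function's start address and name (hypothetical helper type). */
struct symbol {
    unsigned long addr;
    const char *name;
};

/* Hypothetical start addresses, sorted in ascending order. */
static const struct symbol symtab[] = {
    { 0x080483b0UL, "function4" },
    { 0x080483d8UL, "function3" },
    { 0x080483f0UL, "function2" },
    { 0x08048440UL, "function1" },
    { 0x08048460UL, "main" },
};

/*
 * Return the name of the last function that starts at or before pc,
 * or NULL if pc lies below the first symbol. A real implementation
 * would also check the function's size so that addresses past the
 * end of the last function are rejected.
 */
const char *pc_to_function(const struct symbol *tab, size_t n,
                           unsigned long pc)
{
    const char *best = NULL;
    size_t i;

    for (i = 0; i < n && tab[i].addr <= pc; i++)
        best = tab[i].name;
    return best;
}
```

With this table, the saved program counter 0x080483e9 from frame #2 resolves to function3, which matches the backtrace.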


Let’s go back to the stack trace output to explain the format. The numbers on the left indicate the frame number for each stack frame. These numbers can be used with various GDB frame-related commands to reference a specific frame. The next column in the stack trace output is the hexadecimal address of the program counter stored for each stack frame. On x86-based hardware, shared libraries usually get loaded around address 0x40000000, and the executable gets mapped in at 0x08048000. See the /proc/<pid>/maps section in the /proc filesystem chapter (Chapter 3) for more information on address space mappings. For this discussion, it is enough to know that the program counter address of 0x400d6f0b for the function pause makes sense because pause is found in a shared library and the address is near the starting address for shared libraries, 0x40000000. The program counter addresses starting with 0x08048 for the other functions also make sense because those functions are part of the executable created from the gdb_stack.c source code.
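
As a rough illustration of this kind of sanity check, the C sketch below classifies an address using the 32-bit x86 layout described in this section. The function name classify_address is invented here, and the boundaries are assumptions: 0x08048000 and 0x40000000 come from the text, 0xc0000000 is the top of the stack segment in the maps listing, and 0xbf000000 is an arbitrary guess at how low the stack might reach. The authoritative answer always comes from /proc/<pid>/maps.

```c
/*
 * Heuristic only: guess what kind of memory an address belongs to
 * on the classic 32-bit x86 Linux layout. The ranges are assumptions
 * for illustration, not values read from a live process.
 */
const char *classify_address(unsigned long addr)
{
    if (addr >= 0x08048000UL && addr < 0x40000000UL)
        return "executable or heap";
    if (addr >= 0x40000000UL && addr < 0xbf000000UL)
        return "shared library or other mapping";
    if (addr >= 0xbf000000UL && addr < 0xc0000000UL)
        return "stack";
    return "unknown";
}
```

For the backtrace in this section, 0x400d6f0b would be classified as a shared library address, 0x080483d0 as part of the executable, and the string address 0xbffff340 as stack memory.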


Use the bt full command to see more information in the stack trace, including a dump of the local variables for each frame:

(gdb) bt full
#0  0x400d6f0b in pause () from /lib/i686/libc.so.6
No symbol table info available.
#1  0x080483d0 in function4 (a=97 'a') at gdb_stack.c:9
       b = 1
#2  0x080483e9 in function3 (string=0xbffff340 "This is a local string")
   at gdb_stack.c:16
       a = 97 'a'
#3  0x0804843a in function2 (param=3) at gdb_stack.c:23
       string = "This is a local string"
#4  0x08048456 in function1 (param=3) at gdb_stack.c:31
       localVar = 99
#5  0x0804847d in main () at gdb_stack.c:38
       stackVar = 3
(gdb)

 

This will include the local variables for each stack frame that has debug information. Usually, though, you’ll want to see specific local variables and just print them as needed. See the following section on printing values and variables.


6.5.2.1. Navigating Stack Frames

GDB can only see local variables that belong to the current frame (that is, in the “scope” of the current function). If you want to view a local variable that is part of another function in the stack, you must tell GDB to switch its focus to that frame. You may also want to change the current stack frame (that is, function) to perform other operations in that scope. For example, you can tell GDB to “finish” a function, and GDB will run the process until the current function (at the current stack frame) finishes and returns control to the function that called it.


Unless you have already navigated to another frame, you start in frame #0, which is always the “top” frame on the stack. The quickest way to switch stack frames is to use the frame command with a specific frame number.

(gdb) frame 3
#3  0x0804843a in function2 (param=3) at gdb_stack.c:23
23         function3( string );
(gdb)

 

You can also use the up and down commands to walk up and down frames in the stack:


(gdb) up
#1 0x080483d0 in function4 (a=97 'a') at gdb_stack.c:9
9     pause();
(gdb) down
#0 0x400d6f0b in pause () from /lib/i686/libc.so.6
(gdb) down
Bottom (i.e., innermost) frame selected; you cannot go down.
(gdb) up
#1 0x080483d0 in function4 (a=97 'a') at gdb_stack.c:9
9     pause();
(gdb) up
#2 0x080483e9 in function3 (string=0xbffff340 "This is a local string")
  at gdb_stack.c:16
16     function4( a );
(gdb) up
#3 0x0804843a in function2 (param=3) at gdb_stack.c:23
23     function3( string );
(gdb)

 

GDB won’t let you go past the beginning or the end of the stack, so you can use up and down without concern.


Note: Stacks grow downward toward smaller addresses on x86-based hardware, so the “bottom” of the stack will have the highest stack frame address. The top of the stack is the stack frame in GDB that has a frame number of 0 (zero) and will have the lowest numbered address in memory. Please also note that there is a diagram of stack traces in Chapter 5.


6.5.2.2. Obtaining and Understanding Frame Information

Sometimes you’ll need more information about a stack frame. For example, a stack frame contains information about function arguments, local variables, and some interesting registers. This can be particularly useful if you don’t have the source code for the application. To get more information on a particular stack frame, use the info frame command:

(gdb) info frame 2
Stack frame at 0xbffff330:
 eip = 0x80483e9 in function3 (gdb_stack.c:16); saved eip 0x804843a
 called by frame at 0xbffff370, caller of frame at 0xbffff310
 source language c.
 Arglist at 0xbffff328, args: string=0xbffff340 "This is a local string"
 Locals at 0xbffff328, Previous frame's sp is 0xbffff330
 Saved registers:
 ebp at 0xbffff328, eip at 0xbffff32c
(gdb)

 

There’s a lot of information here, so let’s break it down a little.


Stack frame at 0xbffff330:

 

This is simply the address of the stack frame.


eip = 0x80483e9 in function3 (gdb_stack.c:16); saved eip 0x804843a

 

The eip (Extended Instruction Pointer) value is the address of the next instruction to be executed in this frame. We can also see that this frame is associated with function3(). Because this source code was compiled with debug symbols, the source file name and line number are also displayed. The saved eip is the address of the next instruction to be executed in the calling frame (the frame GDB lists with the next higher number). So, for example, if we look at the information for that frame, its eip will be this frame’s saved eip.

called by frame at 0xbffff370, caller of frame at 0xbffff310

 

called by frame indicates the address of the frame that called this frame. caller of frame indicates which frame the current frame calls. So the called by and caller of basically display the addresses of the two frames that surround the current frame.


source language c.

 

This line tells us the language in which the program was written.


Arglist at 0xbffff328, args: string=0xbffff340 "This is a local string"

 

Arglist indicates the base address GDB uses to locate the function’s arguments. args: displays the arguments passed to this frame. Because this code was compiled with debug symbols, we can see the symbolic name of the argument, string. The value shown, 0xbffff340, is the address of the string that the argument points to. Because that string is itself a local variable of the calling function, the address lies within the stack frame that called the current frame.

Locals at 0xbffff328, Previous frame's sp is 0xbffff330

 

Locals displays the address at which the local variables start. Previous frame’s sp displays the stack pointer of the previous frame.

Saved registers:

ebp at 0xbffff328, eip at 0xbffff32c

 

This line displays the addresses on the stack at which the saved ebp and eip values for this frame are stored. The eip register is the instruction pointer, and the ebp register is the stack base pointer. For more information on these registers and the stack layout, refer to Chapter 5.
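
The saved ebp and eip slots are what allow a debugger to walk from one frame to its caller. The C sketch below simulates that walk over a tiny table of stack words. The saved eip values and slot addresses are taken from the output in this section. The caller’s ebp of 0xbffff368 is an assumption, derived by subtracting 8 from the “called by frame at 0xbffff370” address in the same way that this frame’s ebp slot (0xbffff328) sits 8 bytes below its frame address (0xbffff330). The terminating zero is also invented. The names word, read_word, walk_frames, and sample are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One 32-bit word of simulated stack memory. */
struct word {
    uint32_t addr;
    uint32_t value;
};

/* Simulated stack contents; see the lead-in for which values are assumed. */
static const struct word sample[] = {
    { 0xbffff328u, 0xbffff368u },  /* saved ebp of function3's frame (assumed) */
    { 0xbffff32cu, 0x0804843au },  /* saved eip: return address in function2 */
    { 0xbffff368u, 0x00000000u },  /* saved ebp of function2's frame (invented end marker) */
    { 0xbffff36cu, 0x08048456u },  /* saved eip: return address in function1 */
};

/* Read a word from the simulated memory; returns 0 if the address is absent. */
static uint32_t read_word(const struct word *mem, size_t n, uint32_t addr)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (mem[i].addr == addr)
            return mem[i].value;
    return 0;
}

/*
 * Follow the saved-ebp chain starting at ebp, storing each saved eip
 * in pcs[]. Stops at a zero ebp or after max frames. Returns the
 * number of frames visited.
 */
size_t walk_frames(const struct word *mem, size_t n, uint32_t ebp,
                   uint32_t *pcs, size_t max)
{
    size_t count = 0;

    while (ebp != 0 && count < max) {
        pcs[count++] = read_word(mem, n, ebp + 4);  /* saved eip */
        ebp = read_word(mem, n, ebp);               /* caller's ebp */
    }
    return count;
}
```

Starting from ebp 0xbffff328, the walk yields the return addresses 0x0804843a and 0x08048456, the same program counters that the backtrace shows for frames #3 and #4.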


6.5.3. Examining Memory and Variables

Besides looking at the stack, looking at the contents of memory and variables is probably the next most useful feature of a debugger. In fact, you will spend most of your time in a debugger looking at variables and/or the contents of memory trying to understand what is going wrong with a process.


6.5.3.1. Variables and Scope and Type

Variables in a C/C++ program have different scope depending on how they were declared. A global variable is a variable that is defined to be externally visible all of the time. A static variable can be declared in the scope of a file or in a function. Static variables are not visible externally and are treated in a special way. An automatic variable is one that is declared inside a function and is only available on the stack while the corresponding function is running and has not finished. Here is a quick overview of how to declare the variables with the three scopes (this is important to understand how to handle each type in GDB).


Global variable:

int foo = 6 ;

Note: You must declare the variable at the highest scope and outside of any function.

Static variable:

static int foo = 6 ;

Automatic variable:

int function1()
{
  int foo = 6 ;

  return foo ;
}

 

Global symbols are always available to view in a debugger, although you might not always know the type. Static variables are also always available, but a stripped executable or library may not include the names of the static functions. Function local variables (also known as automatic variables) are only available for printing if the source code is compiled with -g. Building in debug mode provides two things necessary for printing automatic variables. The first is the type information. The second is the debug information for automatic variables, which includes linking them to the type information.


Consider a global variable defined as follows:


const char *constString = "This is a constant string!";

 

Printing this from inside GDB for a program that was not compiled with -g will produce the following:


(gdb) print constString

$2 = 134513832

 

GDB can find the global variable, but it does not know its type. As long as this is a base type (that is, not a structure, class, or union), we can still print this properly using the print formatting capabilities:


(gdb) printf "%s\n", constString

This is a constant string!

(gdb)

 

We were able to print constString as a string because it is a base type and is a global symbol. A local variable is not stored in the symbol table, so GDB could not reference it at all unless the program was built with debug information.

Next, let’s take a look at how to print a static variable. This is similar to a global variable except that there may be more than one static variable with the same name.


Consider the static variable declared at the file level (that is, declared static but outside the scope of a function) as:

static int staticInt = 5 ;

 

Next, let’s find out how many static variables with this name there are:

(gdb) info variable staticInt
All variables matching regular expression "staticInt":

Non-debugging symbols:
0x08049544 staticInt

 

Because only one is listed, there is only one. Next, let’s print its value:


(gdb) print /x staticInt

$3 = 0x5

(gdb)

 

Automatic variables are not stored in the process symbol table, meaning that unless the program is compiled with -g, automatic variables have no name in the compiled code. Consider the following simple function that defines an integer b:

int foo( int a )
{
  int b = 0 ;

...
}

 

If we compile this without debug (that is, without -g), GDB will have no information about this variable name or type.


g++ foo.C -o foo

 

And now in GDB (skipping the first part of the session for clarity):

(gdb) break foo
Breakpoint 1 at 0x8048422
(gdb) run
Starting program: /home/wilding/src/Linuxbook/foo

Breakpoint 1, 0x08048422 in foo(int) ()
(gdb) print b
No symbol "b" in current context.

 

Notice how GDB cannot find any information about the variable at all. This is one of the reasons it is much easier to use GDB when the program is compiled with -g. If the program was built with -g, GDB would be able to find the variable and print its value:


(gdb) break foo
Breakpoint 1 at 0x8048422: file foo.C, line 30.
(gdb) run
Starting program: /home/wilding/src/Linuxbook/foo

Breakpoint 1, foo(int) (a=6) at foo.C:30
30     int b = 0 ;
(gdb) print b
$1 = 1075948688

 

Note that the value printed for b is not 0. The breakpoint stopped at line 30 before the assignment had executed, so b still holds whatever happened to be on the stack.

Examining memory and values of variables is another very important aspect of debugging. Most programs allocate memory using malloc or new (the latter for C++) to store variables. Variables that are stored in a heap (such as the one used by malloc or new) do not have an entry in the symbol table, and the compiler does not create any link or any type information for such memory. Such variables do not really have a “scope” and need to be handled specially, as outlined in the section, “Viewing Data in Memory.”

6.5.3.2. Print Formatting

Print formatting allows users to change how variables and memory are displayed. For example, sometimes it is more useful to print an integer in hexadecimal format, and other times it helps to see it in decimal format. The most basic way (that is, without formatting) to see the value of a variable is with the print <variable_name> command.


(gdb) print stackVar

$1 = 10

 

You can specify what format you want print to display in. For example, to see it in hex, use the /x argument:


(gdb) print /x stackVar

$2 = 0xa

 

Notice how the value is always preceded by a dollar sign with a number, followed by an equal sign. GDB automatically stores each printed value under that name in what it calls the value history. (Convenience variables, by contrast, are variables you name yourself, such as $count.) You can reuse a history entry such as $1 in later expressions in the same debugging session:

(gdb) print $1 + 5

$3 = 15

 

A more powerful alternative to the print command is the printf command, which works much the same as the standard C library function:

(gdb) printf "The value of stackVar in hex is 0x%x\n", stackVar

The value of stackVar in hex is 0xa

 

Printing an array is just as easy. Consider the following array:

int list[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 } ;

 

You can print this array in its entirety or just one element:

(gdb) print list
$2 = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
(gdb) print list[3]
$1 = 3
(gdb)

 

Notice, though, that the array indexing is C-style with 0 (zero) being the first index value. Thus, the element whose index is 3 is actually the fourth element of the array.


In some cases, you may want to simulate an array with the @ character. Consider the following integer:


int globInt = 5 ;

 

We can treat this integer as an array that is three elements in length:


(gdb) print globInt@3

$16 = {5, 5, 134514112}

 

This is usually useful when dealing with memory that has been allocated in a heap and does not have any specific type according to GDB.
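
For heap memory, the usual form is to dereference the pointer on the left of the @ operator, for example print *ptr@3, where ptr is whatever pointer variable holds the allocation. The C sketch below shows roughly what that view amounts to: walking n consecutive ints from a starting address and formatting them the way GDB displays an array. The function name format_ints is invented for this illustration.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Format n consecutive ints starting at p as "{a, b, c}" into out.
 * Returns 0 on success, or -1 if the buffer is too small.
 */
int format_ints(const int *p, size_t n, char *out, size_t size)
{
    size_t used = 0;
    size_t i;
    int w;

    w = snprintf(out, size, "{");
    if (w < 0 || (size_t)w >= size)
        return -1;
    used = (size_t)w;

    for (i = 0; i < n; i++) {
        w = snprintf(out + used, size - used, "%s%d", i ? ", " : "", p[i]);
        if (w < 0 || (size_t)w >= size - used)
            return -1;
        used += (size_t)w;
    }

    w = snprintf(out + used, size - used, "}");
    if (w < 0 || (size_t)w >= size - used)
        return -1;
    return 0;
}
```

As in GDB, nothing checks that the memory really holds n valid elements. The caller is responsible for choosing a length that stays inside the allocation.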


6.5.3.3. Determining the Type of Variable

Occasionally, you’ll want to figure out what kind of variable you’re about to look at. This is mostly useful when looking at global and static variables. When needed, use the whatis command:


(gdb) whatis a

type = int

6.5.3.4. Viewing Data in Memory

As mentioned earlier, if a program is not compiled with -g, it won’t have any debug information such as variable type information. In this case, you’ll probably need to look at memory directly using the examine (x) command. The command takes the form x /FMT ADDRESS. FMT is a repeat count followed by a format letter and a size letter. ADDRESS specifies the address at which to start displaying data. The following example dumps eight 4-byte hex values starting at 0x08048000:

(gdb) x /8xw 0x08048000

0x8048000:   0x464c457f  0x00010101  0x00000000  0x00000000

0x8048010:   0x00030002  0x00000001  0x08048320  0x00000034

 

The GDB help output for constructing an x command is very useful, but for convenience, the most commonly used format codes are shown in the following table.


Table 6.1. GDB Format Codes.

Format Description            Format Code
1-byte ASCII character        cb
2-byte decimal integer        dh
4-byte decimal integer        dw
1-byte hexadecimal number     xb
2-byte hexadecimal number     xh
4-byte hexadecimal number     xw
8-byte hexadecimal number     xg
String                        s
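
To show what the xw format does with raw memory, here is a small C sketch that formats up to four 32-bit words as one line in the style of the x /8xw output shown earlier. The function name format_word_line is invented for this illustration, and the tab-separated layout is only an approximation of GDB’s output.

```c
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Format up to four 32-bit words as "0xADDR:<tab>0x........" into out.
 * Returns the number of characters written, or -1 if the buffer is
 * too small.
 */
int format_word_line(char *out, size_t size, unsigned long addr,
                     const uint32_t *words, size_t count)
{
    size_t i;
    int used = snprintf(out, size, "0x%lx:", addr);

    if (used < 0 || (size_t)used >= size)
        return -1;

    for (i = 0; i < count && i < 4; i++) {
        size_t left = size - (size_t)used;
        int w = snprintf(out + used, left, "\t0x%08x",
                         (unsigned int)words[i]);
        if (w < 0 || (size_t)w >= left)
            return -1;
        used += w;
    }
    return used;
}
```

Passing the first four words from the earlier dump (0x464c457f, 0x00010101, 0, 0) with address 0x8048000 reproduces its first line. The first word is the ELF magic number: the bytes 0x7f, 'E', 'L', 'F' read as a little-endian 32-bit value.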

6.5.3.5. Formatting Values in Memory

If you have a variable in memory and you know its address, you can use casting to print it with the correct formatting. Consider the string constString defined in the earlier example. Here are four different methods to print this variable correctly as a string.

(gdb) print constString
$5 = 0x80485c0 "This is a constant string!"
(gdb) x /s 0x080485c0
0x80485c0 <_IO_stdin_used+28>:   "This is a constant string!"
(gdb) printf "%s\n",0x080485c0
This is a constant string!
(gdb) print (char *) 0x080485c0
$6 = 0x80485c0 "This is a constant string!"

 

The first requires the shared library or executable that contains this variable to be built with debug information (compiled with -g). Without this debug information, the debugger will not have type information for the symbol constString. The second method uses the GDB examine command (x) with a string format (/s). The third method uses the GDB printf command with a string token (%s). The last method uses a casting feature in GDB to print an address as a specific type. The last three methods do not require any type information (that is, they will work on a program that is not compiled with -g).
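
The idea behind the last three methods is the same one a C program uses when it reinterprets a raw address: the address carries no type, so the reader supplies one. A minimal sketch, with the hypothetical helper names address_as_string and address_as_int:

```c
#include <stdint.h>
#include <string.h>

/* Treat a raw address as a C string, like print (char *) ADDR in GDB. */
const char *address_as_string(uintptr_t address)
{
    return (const char *)address;
}

/* Read an int stored at a raw address, like print *(int *) ADDR in GDB. */
int address_as_int(uintptr_t address)
{
    int value;

    /* memcpy avoids alignment problems if the address is not int-aligned. */
    memcpy(&value, (const void *)address, sizeof value);
    return value;
}
```

As with the GDB casts, nothing verifies that the address really holds data of the requested type. Supplying the wrong type simply produces a different interpretation of the same bytes.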


The last method is interesting because it uses a C-style cast to tell GDB how to print it. This will work with any type, which makes this very useful when debugging values that are in memory but do not have any direct type information. For example, an application may use malloc or other memory allocation function to get memory to store a variable with a specific type. The compiler will include type information for the variable but will not link it with the memory that was allocated via malloc.


The cast formatting method is probably the most powerful method of formatting complex types in GDB. We could even use the address of the constant string constString and cast it to a C++ class type in order to print that region of memory as a C++ object.


class myClass
{
  public:

  int myVar ;

  myClass() {
   myVar = 5 ;
  }

};

 

See the section later in this chapter titled “Finding the Address of Variables and Functions” for information on how to find the address of variables like constString. Now we can print the region of memory for the constant string as this class (note that 0x080485c0 is the address of constString):


(gdb) print (myClass) *0x080485c0

$26 = {myVar = 1936287828}

 

Of course, this memory does not really hold a myClass object, but the example shows just how flexible cast formatting is. To prove the point further, the value of the member variable myVar is 1936287828 in decimal and 0x73696854 in hex which, when translated to a string, is sihT, or This in reverse (see the ascii man page for the ASCII table). Why is it reversed? The answer is that 32-bit x86 platforms are “little endian,” as explained later in this chapter.
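
The arithmetic can be checked with a few lines of C. The sketch below combines the four bytes of "This" in both byte orders. The function names bytes_to_le32 and bytes_to_be32 are invented for this illustration.

```c
#include <stdint.h>

/* Combine four bytes as a little-endian 32-bit value (first byte is least significant). */
uint32_t bytes_to_le32(const char *s)
{
    const unsigned char *b = (const unsigned char *)s;

    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Combine four bytes as a big-endian 32-bit value (first byte is most significant). */
uint32_t bytes_to_be32(const char *s)
{
    const unsigned char *b = (const unsigned char *)s;

    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}
```

With 'T' = 0x54, 'h' = 0x68, 'i' = 0x69, and 's' = 0x73, bytes_to_le32("This") gives 0x73696854, which is 1936287828 in decimal and matches the value GDB printed for myVar. On a big-endian machine the same bytes would read as 0x54686973.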


6.5.3.6. Changing Variables

GDB also allows you to make any changes you wish to variables and registers. To set the value of a variable, use the set variable command, or simply set:


(gdb) set variable a=5

 

or

(gdb) set a=5

(gdb) print a

$1 = 5

 

You can set the value of a register by referencing it the same way you reference it when printing its value:


(gdb) set $eax=1

(gdb) print $eax

$1 = 1

Warning: Changing the value of a register without understanding what it is for can cause unpredictable behavior.

6.5.4. Register Dump

Examining the contents of the registers in a live debugging session may be necessary to diagnose complex problems, especially when you don’t have the source code. Looking at a raw register dump can be like looking at a wall of hieroglyphics if you don’t have experience with assembly language on the platform you are using. The contents of the registers make sense only when you examine and understand the assembly instructions (the human-readable form of machine instructions) that use them. Assembly instructions directly manipulate the memory and registers of a computer. When debugging a program that has been compiled with debug symbols enabled (-g), looking at register contents is usually not necessary. However, when debugging a program that has no debug symbols, you are forced to work at the assembly level.

The command used to see a register dump in GDB is “info registers,” as shown here:

(gdb) info registers

eax      0x6   6

ecx      0x1   1

edx      0x4015c490     1075168400

ebx      0x4015afd8     1075163096

esp      0xbffff3a0     0xbffff3a0

ebp      0xbffff3a8     0xbffff3a8

esi      0x40018420     1073841184

edi      0xbffff3f4     -1073744908

eip      0x8048340     0x8048340

eflags     0x200386  2098054

cs       0x23   35

ss       0x2b   43

ds       0x2b   43

es       0x2b   43

fs       0x0   0

gs       0x0   0

(gdb)

 

We know from the procedure calling conventions on x86 (see the “Procedure Calling Conventions” section in Chapter 5 for more information) that eax is used to store the return value from a function call. We can’t be sure, but one possibility here is that a function has just returned, or is about to return, the value 6. Without seeing the previously executed assembly instructions, we really don’t know what brought the registers to the state shown above. Still, there is some interesting information that we can pick out of this dump.

The eip register is the instruction pointer (a.k.a. program counter) and always contains the address of the next instruction to be executed. Addresses close to the eip value above, 0x8048340, will become familiar the more debugging and problem determination you do on Linux for 32-bit x86 hardware. Executables are traditionally mapped into a process’s address space starting at 0x08048000 on this platform (position-independent executables are an exception), so instruction addresses in this range are very common. Refer to the /proc/<pid>/maps section in Chapter 3 for more information on a process’s address space.

One final observation we can make from the register dump has to do with the values stored in the esp and ebp registers. Addresses near 0xbffff3a0 and 0xbffff3a8, the values of esp and ebp respectively, will also become familiar as you become more accustomed to the address space layout. The esp and ebp registers are used to manage the stack on 32-bit x86 hardware, so they hold addresses inside the stack segment, which is usually located just below 0xbfffffff.
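One detail of the 32-bit dump above that is easy to trip over: edi is shown as 0xbffff3f4 in the first column and as -1073744908 in the second. Both columns describe the same 32 bits; the second simply reads them as a signed two’s-complement integer. A minimal Python sketch of that reinterpretation follows (the function name as_signed is illustrative, not a GDB facility):

```python
def as_signed(value: int, bits: int = 32) -> int:
    """Reinterpret an unsigned register value as a two's-complement integer."""
    value &= (1 << bits) - 1          # keep only the low `bits` bits
    if value & (1 << (bits - 1)):     # top bit set means the value is negative
        return value - (1 << bits)
    return value


if __name__ == "__main__":
    print(as_signed(0xBFFFF3F4))  # edi from the dump above
    print(as_signed(0x4015C490))  # edx: top bit clear, so the value is unchanged
```

Values such as edx, whose top bit is clear, look the same in both readings, which is why only some registers in the dump appear negative.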

Note: The meaning of each register is well beyond the scope of this chapter and will not be covered here (http://linuxassembly.org/ may be a good reference for the interested reader).

 

If you do end up looking at a register dump, it will probably be because of a trap in a process that was not built with -g. In this case, you’ll probably look at the instruction that trapped and try to understand why an address stored in a register used by that instruction is invalid. If your program is getting a segmentation violation (SIGSEGV), it is very likely that the trapped instruction is dereferencing a bad pointer. For example, an address of 0x6 or 0x0 is outside the range of any memory segment and will result in a segmentation violation.
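The check described above, whether an address held in a register falls inside any memory segment, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the map below is a made-up sample in /proc/<pid>/maps format (paths, inode numbers, and ranges are hypothetical), and the helper names parse_maps and find_segment are not from the original text.

```python
# Hypothetical memory map in /proc/<pid>/maps format; every value is invented.
SAMPLE_MAPS = """\
08048000-08049000 r-xp 00000000 03:02 12345 /tmp/example
08049000-0804a000 rw-p 00000000 03:02 12345 /tmp/example
40000000-40015000 r-xp 00000000 03:02 23456 /lib/ld.so
bfffe000-c0000000 rw-p 00000000 00:00 0
"""


def parse_maps(text):
    """Return a list of (start, end, perms) tuples, one per map line."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        segments.append((start, end, fields[1]))
    return segments


def find_segment(segments, address):
    """Return the segment containing address, or None if it is unmapped."""
    for start, end, perms in segments:
        if start <= address < end:   # the end address is exclusive
            return (start, end, perms)
    return None


if __name__ == "__main__":
    segments = parse_maps(SAMPLE_MAPS)
    for address in (0x0, 0x6, 0x8048340, 0xBFFFF3A0):
        print(hex(address), find_segment(segments, address))
```

With this sample map, 0x0 and 0x6 match no segment, while the eip and esp values from the earlier dump land in the executable and stack ranges. For a live process, the same lookup could be run against the real maps file.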

64-bit computing is becoming more and more mainstream, so it is worth covering some of the differences between 32-bit and 64-bit as far as registers are concerned. The following dump was taken on the x86-64 architecture at exactly the same point in a test program as the preceding dump:


(gdb) info registers

rax      0x6   6

rbx      0x7fbfffea48    548682066504

rcx      0x40000300     1073742592

rdx      0x7fbfffea58    548682066520

rsi      0x4   4

rdi      0x2   2

rbp      0x7fbfffe9e0    0x7fbfffe9e0

rsp      0x7fbfffe9d0    0x7fbfffe9d0

r8       0x40000488     1073742984

r9       0x2a955604d0    182894068944

r10      0x0   0

r11      0x2a956a2d40    182895390016

r12      0x40000300     1073742592

r13      0x1   1

r14      0x2a95879308    182897316616

r15      0x4000041a     1073742874

rip      0x4000043b     0x4000043b <main+33>

eflags     0x306  774

ds       0x33   51

es       0x2b   43

fs       0x0   0

gs       0x0   0

fctrl    0x0    0

fstat    0x0    0

ftag     0x0    0

fiseg    0x0    0

fioff    0x0    0

foseg    0x0    0

fooff    0x0    0

fop     0x0    0

xmm0     {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

xmm1     {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

xmm2     {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

xmm3     {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

xmm4     {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

xmm5     {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

xmm6     {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

xmm7     {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

xmm8     {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

xmm9     {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

xmm10    {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

xmm11    {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

xmm12    {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

xmm13    {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

xmm14    {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

xmm15    {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}

mxcsr    0x1f80 8064

(gdb)

 

Note the number of registers and their different names. The available registers and their names differ from one architecture to another. Notice also that many of the values contained in the registers are much larger than the maximum 32-bit value of 0xffffffff.
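To make the last observation concrete, here is a small Python sketch that scans “info registers” output and lists the registers whose values do not fit in 32 bits. It assumes the simple “name hex-value natural-value” line layout seen in the dumps above; the function name wider_than_32_bits is illustrative, and the sample contains only a few lines copied from the x86-64 dump.

```python
MAX_32_BIT = 0xFFFFFFFF

# A few lines copied from the x86-64 dump above.
SAMPLE_DUMP = """\
rax      0x6   6
rbx      0x7fbfffea48    548682066504
rcx      0x40000300     1073742592
rsp      0x7fbfffe9d0    0x7fbfffe9d0
r9       0x2a955604d0    182894068944
xmm0     {f = {0x0, 0x0, 0x0, 0x0}}     {f = {0, 0, 0, 0}}
"""


def wider_than_32_bits(dump):
    """Return the names of registers whose value exceeds 0xffffffff."""
    names = []
    for line in dump.splitlines():
        fields = line.split()
        # Skip blank lines, prompts, and vector registers such as xmm0.
        if len(fields) < 2 or not fields[1].startswith("0x"):
            continue
        if int(fields[1], 16) > MAX_32_BIT:
            names.append(fields[0])
    return names


if __name__ == "__main__":
    print(wider_than_32_bits(SAMPLE_DUMP))
```

In this sample, the stack pointer and the registers holding stack or shared library addresses are the ones that need more than 32 bits, while small integers such as rax do not.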

To see the complete listing of registers, including all floating point and extended registers, use the info all-registers GDB command.

Last, you can display the value of an individual register by referring to it by name with a $ prepended:

(gdb) print $eax

$1 = 0

 

Most of the time, you’ll just need to know the value of a particular register as it is used by the assembly instructions you are examining. For more information on the register conventions for each hardware platform, refer to the corresponding vendor documentation. Each hardware platform has its own registers and assembly language.

Note: So far in this chapter we’ve covered a lot of basics such as attaching to a process, looking at data, displaying register values, and so on. As we get further into this chapter, the focus will shift more and more from usage to examples. The next section is where the transition starts.
