6.6. Execution

The GDB execution commands give you full control over a process. Of course, these commands do not work when using a core file because there is no live process to control. This section of the chapter will introduce some of the most commonly used commands for controlling a live program. It will also present some of the more advanced and lesser known tips and tricks that can make your debugging session more efficient.

6.6.1. The Basic Commands

The following table summarizes the most common and most useful execution control commands:

Note: <N> represents a number. The number instructs GDB to run the command a certain number of times. For example, next 5 will perform the next command 5 times.

 

The commands in Table 6.2 are pretty self-explanatory, but there are some settings you can use in GDB that affect the way some of these commands work. The next sections describe some of these in more detail.

Table 6.2. Basic GDB Execution Commands.

| Action | Command | Notes |
| --- | --- | --- |
| Execute the next source code line and do not go into functions | next <N> | Requires debug symbols and source code. The optional argument sets the number of lines of source code to step through. Functions will be skipped over (but called under the covers). |
| Execute the next source code line and do go into functions | step <N> | Requires debug symbols and source code. The optional argument sets the number of lines of source code to step through. |
| Execute the next assembly instruction and do not go into functions | nexti <N> | The optional argument sets the number of instructions to step through. Functions will be skipped over (but called under the covers). |
| Execute the next assembly instruction and do go into functions | stepi <N> | The optional argument sets the number of instructions to step through. |
| Continue full execution | continue | |
| Continue execution until the current stack frame returns | finish | |
| Continue execution at a specific address | jump <address> | Use with caution. Jumping directly to an address can cause unexpected behavior given that the registers may not have been set properly to run the target instruction. |
| Continue execution until a source line or location greater than the current location, or the location specified, is reached | until | |
| Call a function in the program | call <function> | |
| Manually stop execution of the program | CTRL-C | |

6.6.1.1. Notes on stepi

The stepi command allows you to execute a single assembly instruction. You can also execute any number of instructions by specifying that number as an argument to stepi. Stepping through each and every assembly instruction can often reveal very low-level actions that normally take place unnoticed in a program. Using the dyn program as an example (what dyn itself does is not important for this demonstration), let’s use the stepi command to step into the call to printf:

(gdb) break main

Breakpoint 1 at 0x804841c: file dyn_main.c, line 6.

(gdb) run

Starting program: /home/dbehman/book/code/dyn

 

Breakpoint 1, main () at dyn_main.c:6

6          void *dlhandle = NULL;

(gdb) next

9          printf( "Dynamically opening libdyn.so ...\n" );

(gdb) stepi

0x08048426     9        printf( "Dynamically opening libdyn.so ...\n" );

(gdb) stepi

0x0804842b     9        printf( "Dynamically opening libdyn.so ...\n" );

(gdb) stepi

0x08048330 in ?? ()

(gdb) stepi

0x08048336 in ?? ()

(gdb) stepi

0x0804833b in ?? ()

(gdb) stepi

0x08048300 in ?? ()

(gdb) stepi

0x08048306 in ?? ()

(gdb) stepi

0x4000d280 in _dl_runtime_resolve () from /lib/ld-linux.so.2

(gdb) stepi

0x4000d281 in _dl_runtime_resolve () from /lib/ld-linux.so.2

(gdb) stepi

0x4000d282 in _dl_runtime_resolve () from /lib/ld-linux.so.2

(gdb) stepi

0x4000d283 in _dl_runtime_resolve () from /lib/ld-linux.so.2

(gdb) stepi

0x4000d287 in _dl_runtime_resolve () from /lib/ld-linux.so.2

(gdb) stepi

0x4000d28b in _dl_runtime_resolve () from /lib/ld-linux.so.2

(gdb) stepi

0x4000cf90 in fixup () from /lib/ld-linux.so.2

(gdb) stepi

0x4000cf91 in fixup () from /lib/ld-linux.so.2

(gdb) stepi

0x4000cf93 in fixup () from /lib/ld-linux.so.2

(gdb) stepi

0x4000cf94 in fixup () from /lib/ld-linux.so.2

(gdb)

 

As you can see, several other things happen before we even enter the printf function. The functions we see here, _dl_runtime_resolve and fixup, are found in /lib/ld-2.3.2.so, the dynamic linker, which appears in the GDB output under the name /lib/ld-linux.so.2. The “??” entries are GDB’s way of saying that it could not find a function name for the address being executed. This often happens in special cases, such as when the dynamic linker is involved.

6.6.2. Settings for Execution Control Commands

6.6.2.1. Step-mode

The step-mode setting controls how the step command behaves when it reaches a function that has no debug symbols. Consider the example program dyn. It is compiled with debug symbols; however, it calls the standard C library function printf(), which by default has no debug symbols. Using step in the default manner, the following will occur:

(gdb) break main

Breakpoint 1 at 0x804841c: file dyn_main.c, line 6.

(gdb) run

Starting program: /home/dbehman/book/code/dyn

 

Breakpoint 1, main () at dyn_main.c:6

6          void *dlhandle = NULL;

(gdb) next

9          printf( "Dynamically opening libdyn.so ...\n" );

(gdb) step

Dynamically opening libdyn.so ...

11         dlhandle = dlopen( "./libdyn.so", RTLD_NOW );

(gdb)

 

As you can see, the step command did not go into the printf function because it does not have debug symbols. To override this behavior, we can use the “step-mode” setting in GDB:

(gdb) break main

Breakpoint 1 at 0x804841c: file dyn_main.c, line 6.

(gdb) run

Starting program: /home/dbehman/book/code/dyn

 

Breakpoint 1, main () at dyn_main.c:6

6          void *dlhandle = NULL;

(gdb) show step-mode

Mode of the step operation is off.

(gdb) set step-mode on

(gdb) show step-mode

Mode of the step operation is on.

(gdb) next

9          printf( "Dynamically opening libdyn.so ...\n" );

(gdb) step

0x4007eab0 in printf () from /lib/i686/libc.so.6

(gdb)

 

As shown in this example, the step command did enter the printf function even though printf was not built with -g (that is, it has no debug information). This step-mode can be useful if you want to step through the assembly language of a called function regardless of whether it was built with debug information (that is, with -g).

6.6.2.2. Following fork Calls

If the controlled program uses fork or vfork system calls, you can tell GDB that you want to handle the fork calls in a few different ways. By default, GDB will follow the parent process (the one you are controlling now) and will let the child process (the newly created process) run free. The term follow here means that GDB will continue to control a process, and the term run free means that GDB will detach from the process, letting it run unimpeded and unaffected by GDB. If you want GDB to follow the child process, use the command set follow-fork-mode child before the fork occurs. An alternative is to have GDB ask you what you want to do when a fork is encountered. This can be done with the command set follow-fork-mode ask. You can view the current setting with the command show follow-fork-mode.

Let’s assume that you are controlling a process that forks two children, and you want to eventually be in control of a “grandchild” process as shown in Figure 6.2.

Figure 6.2. Children and grandchildren of a process.

 

 

You cannot get control of process 4 (as in the diagram) using the default GDB mode for following forks. In the default mode, GDB would simply keep control of process 1, and the other processes would run freely. You can’t use set follow-fork-mode child because GDB would follow the first forked process (process 2), and the parent process (process 1) would run freely. When process 1 forked off its second child process, it would no longer be under the control of GDB. The only way to follow the forks properly to get process 4 under the control of GDB is to use set follow-fork-mode ask. In this mode, you would tell GDB to follow the parent after the first fork call and to follow the child process for the next two forks.
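
Figure 6.2 is not reproduced here, so the following is only a minimal sketch that assumes the layout implied by the text: process 1 forks process 2 and then process 3, and process 3 forks process 4. The function name spawn_process_tree is hypothetical; it simply gives you a small process tree on which to practice the follow-fork settings.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical reconstruction of the process tree discussed above:
 * process 1 forks process 2, then process 3; process 3 forks process 4.
 * Returns the number of direct children that process 1 reaped
 * (2 on success), or -1 if a fork in process 1 failed.
 */
int spawn_process_tree(void)
{
    pid_t first = fork();              /* fork #1: follow the parent here */
    if (first < 0)
        return -1;
    if (first == 0)
        _exit(0);                      /* process 2: nothing to do */

    pid_t second = fork();             /* fork #2: follow the child here */
    if (second < 0) {
        waitpid(first, NULL, 0);
        return -1;
    }
    if (second == 0) {
        pid_t grandchild = fork();     /* fork #3: follow the child again */
        if (grandchild < 0)
            _exit(1);
        if (grandchild == 0)
            _exit(0);                  /* process 4: the grandchild */
        waitpid(grandchild, NULL, 0);  /* process 3 reaps process 4 */
        _exit(0);
    }

    /* Process 1 waits for both of its direct children. */
    int reaped = 0;
    if (waitpid(first, NULL, 0) == first)
        reaped++;
    if (waitpid(second, NULL, 0) == second)
        reaped++;
    return reaped;
}
```

Calling spawn_process_tree() from main and running the program under GDB with set follow-fork-mode ask lets you answer parent at the first fork and child at the next two. If your GDB does not accept ask, changing the mode between set follow-fork-mode parent and set follow-fork-mode child before each fork should give a similar result.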

6.6.2.3. Handling Signals

Linux applications often receive various types of signals. Signals are a software convention similar to a very trivial message (the message being the signal number itself). Each signal has a different purpose, and GDB can be set to handle each in a different way. Consider the following simple program that sends itself a SIGALRM:

penguin> cat alarm.C

#include <stdio.h>

#include <unistd.h>

#include <sys/utsname.h>

 

int main()

{

  alarm(3) ;

 

  sleep(5) ;

 

  return 0 ;

}

 

This simple program calls the alarm system call with an argument of 3 (for three seconds). This will cause a SIGALRM to be sent to the program in three seconds. The second call is to sleep with an argument of 5 and causes the program to sleep for five seconds, two seconds longer than it will take to receive the SIGALRM. Running this program in GDB has the following effect:

(gdb) run

Starting program: /home/wilding/src/Linuxbook/alarm

Program terminated with signal SIGALRM, Alarm clock.

The program no longer exists.

 

The program was killed off because it did not have a handler installed for SIGALRM. Signal handling is beyond the scope of this chapter, but we can change how signals such as SIGALRM are handled through GDB. First, let’s take a look at how SIGALRM is handled in GDB by default:
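
For comparison, here is a minimal sketch of what installing a handler could look like. It is not part of the original alarm program; the names on_alarm, install_alarm_handler, and alarms_seen are made up for illustration, and the sketch assumes a POSIX system that provides sigaction.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

/* Counter updated by the handler; sig_atomic_t is safe to touch there. */
static volatile sig_atomic_t alarm_count = 0;

/* Hypothetical handler: it only records that a SIGALRM arrived. */
static void on_alarm(int signo)
{
    (void)signo;
    alarm_count++;
}

/* Installs on_alarm for SIGALRM. Returns 0 on success, -1 on failure. */
int install_alarm_handler(void)
{
    struct sigaction action;

    memset(&action, 0, sizeof action);
    action.sa_handler = on_alarm;
    sigemptyset(&action.sa_mask);
    return sigaction(SIGALRM, &action, NULL);
}

/* Number of SIGALRM signals handled so far. */
int alarms_seen(void)
{
    return alarm_count;
}
```

If install_alarm_handler() were called at the start of main in the alarm program, the SIGALRM would interrupt the sleep instead of terminating the process, so the program would be expected to exit normally.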

(gdb) info signals SIGALRM

Signal    Stop   Print  Pass to program Description

SIGALRM    No    No   Yes       Alarm clock

 

The first column is the signal (in this case SIGALRM). The second column indicates whether GDB will stop the program and hand control over to the user of GDB when the signal is encountered. The third column indicates whether GDB will print a message to the screen when a signal is encountered. The fourth column indicates whether GDB will pass the signal to the controlled program (in the preceding example, the program is the alarm program that was run under the control of GDB). The last column is the description of the signal.

According to the output, GDB will not stop when the controlled program encounters a SIGALRM, it will not print a message, and it will pass the signal to the program. We can tell GDB to not pass the signal to the process by using handle SIGALRM nopass:

(gdb) handle SIGALRM nopass

Signal    Stop   Print  Pass to program Description

SIGALRM    No    No   No       Alarm clock

(gdb) run

Starting program: /home/wilding/src/Linuxbook/alarm

 

Program exited normally.

(gdb)

 

In the preceding example, the program slept for the full five seconds and did not receive the SIGALRM. Next, let’s tell GDB to stop when the controlled process receives SIGALRM:

(gdb) handle SIGALRM stop

Signal    Stop   Print  Pass to program Description

SIGALRM    Yes    Yes   Yes       Alarm clock

(gdb) run

Starting program: /home/wilding/src/Linuxbook/alarm

 

Program received signal SIGALRM, Alarm clock.

0x4019fd01 in nanosleep () from /lib/libc.so.6

(gdb) bt

#0 0x4019fd01 in nanosleep () from /lib/libc.so.6

#1 0x4019fbd9 in sleep () from /lib/libc.so.6

#2 0x080483e3 in main ()

#3 0x4011a4a2 in __libc_start_main () from /lib/libc.so.6

 

In this case, GDB stopped when the controlled program received a signal and handed control back to the GDB user (that is, gave us a prompt). The command bt was run in the example here to display the stack trace and to show that the signal was received while in sleep (specifically in nanosleep) as expected.

If the process receives a lot of signals and you just want to keep track of when it receives one (without taking any action), you can tell GDB to print a message every time the controlled process receives the signal, without stopping. The handle command accepts the keywords nostop and print for this purpose (for example, handle SIGALRM nostop print):

(gdb) run

The program being debugged has been started already.

Start it from the beginning? (y or n) y

Starting program: /home/wilding/src/Linuxbook/alarm

 

Program received signal SIGALRM, Alarm clock.

 

The program will continue to handle the signal as it was designed to, and GDB will simply report a message each time a SIGALRM is received. This mode can be useful for seeing how often a process times out, or in combination with a breakpoint set in a function that runs after the process receives a signal.

To see the full list of signals and how GDB is configured to handle them, use the info signals command:

(gdb) info signals

Signal    Stop   Print   Pass to program Description

 

SIGHUP           Yes     Yes   Yes       Hangup

SIGINT           Yes     Yes   No        Interrupt

SIGQUIT          Yes     Yes   Yes       Quit

SIGILL           Yes     Yes   Yes       Illegal instruction

SIGTRAP          Yes     Yes   No        Trace/breakpoint trap

SIGABRT          Yes     Yes   Yes       Aborted

SIGEMT           Yes     Yes   Yes       Emulation trap

SIGFPE           Yes     Yes   Yes       Arithmetic exception

SIGKILL          Yes     Yes   Yes       Killed

SIGBUS           Yes     Yes   Yes       Bus error

SIGSEGV          Yes     Yes   Yes       Segmentation fault

SIGSYS           Yes     Yes   Yes       Bad system call

SIGPIPE          Yes     Yes   Yes       Broken pipe

SIGALRM          No      No    Yes       Alarm clock

   ...

Note: “...” in this output is not part of the output but is used to show that the output is longer than what is printed here.

 

You can tell GDB to handle each signal differently to match the desired functionality for the problem you are investigating.

6.6.3. Breakpoints

Breakpoints are a way to stop the execution of a program in a particular function, at a particular point in the code, or on a particular condition. We’ve been using breakpoints throughout this chapter, which goes to show how common they are. To see the current list of breakpoints, use the info breakpoints command (info break for short).

(gdb) break main

Breakpoint 1 at 0x80483c2

(gdb) info break

Num Type      Disp Enb Address  What

1  breakpoint    keep y  0x080483c2 <main+6>

 

The breakpoint in the preceding example is set to stop the controlled program in the function main. This is the most common usage of breakpoints—that is, to stop in a particular function. The incantation for this is:

break <function name>

 

It can also be useful to set a breakpoint in a function only when one of the function parameters is a specific value. Say you had a function in your application that got called hundreds of times, but you’re only interested in examining this function when one of the parameters is a specific value. Say the function is called common_func and takes one integer parameter called num. To set up a conditional breakpoint on this function when num equals 345 for example, you would first set the breakpoint:

(gdb) break common_func

Breakpoint 2 at 0x8048312: file common.c, line 3.

 

Now that the breakpoint is set, use the condition command to define a condition for this newly set breakpoint. We reference the breakpoint by number, in this case, 2.

(gdb) condition 2 num == 345

 

Notice the double equal signs—this is the same notation you would use for an expression in C programming.

Verify the correct setting of the breakpoint with the info breakpoint command:

(gdb) info breakpoint

Num Type           Disp Enb Address    What

1   breakpoint     keep y   0x0804832a in main at common.c:10

2   breakpoint     keep y   0x08048312 in common_func at common.c:3

        stop only if num == 345

 

When continuing program execution, breakpoint number 2 will only be triggered when the value of num is 345:

(gdb) cont

Continuing.

 

Breakpoint 2, common_func (num=345) at common.c:3

3          int foo = num;
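
The source of common.c is not shown, so the following is a hypothetical stand-in that you could use to try the conditional breakpoint yourself. Only the name common_func, its parameter num, and the statement int foo = num come from the GDB output above; count_matching_calls is an invented driver, and the line numbers will not match the session shown.

```c
/* Hypothetical stand-in for the function in common.c. */
int common_func(int num)
{
    int foo = num;      /* the statement where the breakpoint above stops */
    return foo;
}

/*
 * Invented driver: calls common_func with 0 .. count-1 and returns how
 * many of those calls were made with the value `wanted`.
 */
int count_matching_calls(int count, int wanted)
{
    int hits = 0;

    for (int i = 0; i < count; i++) {
        if (common_func(i) == wanted)
            hits++;
    }
    return hits;
}
```

With count set to 1000 and a condition of num == 345, an unconditional breakpoint on common_func would stop a thousand times, while the conditional one should stop exactly once. Building with -g and without optimization keeps the call from being inlined away.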

 

If the program was compiled with -g, you can set a breakpoint on a particular line of code as in the next example:

(gdb) break hang2.C:20

Breakpoint 1 at 0x8048467: file hang2.C, line 20.

(gdb) run user

Starting program: /home/wilding/src/Linuxbook/hang2 user

 

Breakpoint 1, main (argc=2, argv=0xbfffefc4) at hang2.C:20

20     if ( !strcmp( argv[1], "user" ) )

(gdb)

 

If you don’t have the source code, you can still set a breakpoint at a specific address as follows:

(gdb) break * 0x804843c

Breakpoint 2 at 0x804843c

(gdb) cont

Continuing.

 

Breakpoint 2, 0x0804843c in main ()

(gdb)

 

Notice the * in front of the address; this is used when the argument to the break command is an address.

To delete a breakpoint, use the delete command with the breakpoint number as the argument.

6.6.4. Watchpoints

Watchpoints, as the name implies, are for watching data in your program, especially for alerting you when data at a specific address changes value. If you have a variable in your program that is getting changed for some bizarre and unknown reason, this could be a symptom of memory corruption. Memory corruption problems can be extremely difficult to track down because the corruption can happen long before the symptom appears (for example, a trap or unexpected behavior). With a watchpoint, you can tell GDB to watch the specified variable and let you know immediately when it changes. For memory corruption, GDB will tell you exactly where and when the corruption occurs, which makes the problem much easier to fix.

There are two kinds of watchpoints: hardware and software. The x86 hardware, for example, provides built-in support specifically for watchpoints, and GDB will make use of this support. If the support does not exist or if the conditions for the use of the hardware are not met, GDB will fall back to software watchpoints. A software watchpoint is much slower than a hardware watchpoint because GDB must stop the program after each assembly instruction and examine every watchpoint for changes. Hardware watchpoints, by contrast, let the program run at normal speed and notify GDB as soon as a change occurs.

To demonstrate watchpoints, let’s use a simple program that simulates an employee record system. The source code is:

#include <stdio.h>

#include <string.h>

struct employee

{

   char name[8];

   int  serial_num;

};

 

void print_employee_rec( struct employee rec )

{

   printf( "Name: %s\n", rec.name );

   printf( "Number: %d\n", rec.serial_num );

 

   return;

}

 

void update_employee_name( struct employee *rec, char *name )

{

   strcpy( rec->name, name );

 

   return;

}

 

void add_employee( struct employee *rec, char *name, int num )

{

   strcpy( rec->name, name );

   rec->serial_num = num;

 

   return;

}

 

int main( void )

{

   struct employee rec;

 

   add_employee( &rec, "Fred", 25 );

 

   print_employee_rec( rec );

 

   printf( "\nUpdating employee's name ...\n\n" );

 

   update_employee_name( &rec, "Fred Smith" );

 

   print_employee_rec( rec );

 

   return 0;

}

 

The basic flow of the program is to create an employee record with the name “Fred” and serial number 25. Next, the program updates the employee’s name to “Fred Smith” but does not touch the serial number. Running the program produces this output:

penguin> employee

Name: Fred

Number: 25

 

Updating employee's name ...

 

Name: Fred Smith

Number: 26740

 

If the program isn’t supposed to update the serial number when the name is changed, then why did it change to 26740? This kind of error is indicative of memory corruption. If you’ve examined the source code, you might already know what the problem is, but let’s use GDB and watchpoints to tell us what the problem is. We know that something bad happens after printing out the employee record the first time, so let’s set a watchpoint on the serial_num member of the structure at that point:

(gdb) break main

Breakpoint 1 at 0x80483e6: file employee.c, line 36.

(gdb) run

Starting program: /home/dbehman/book/code/employee

 

Breakpoint 1, main () at employee.c:36

36       add_employee( &rec, "Fred", 25 );

(gdb) next

38       print_employee_rec( rec );

(gdb) next

Name: Fred

Number: 25

40       printf( "\nUpdating employee's name ...\n\n" );

(gdb) watch rec.serial_num

Hardware watchpoint 2: rec.serial_num

 

It is important to note that GDB was able to successfully engage the assistance of the hardware for this watchpoint. GDB indicates this with the message, “Hardware watchpoint 2...”. If the keyword “Hardware” does not appear, then GDB was unable to use the hardware and defaulted to using a software watchpoint (which is much, much slower). Let’s now continue our program execution and see what happens:

(gdb) cont

Continuing.

 

Updating employee's name ...

 

Hardware watchpoint 2: rec.serial_num

 

Old value = 25

New value = 116

0x400a3af9 in strcpy () from /lib/i686/libc.so.6

(gdb) backtrace

#0  0x400a3af9 in strcpy () from /lib/i686/libc.so.6

#1  0x080483af in update_employee_name (rec=0xbffff390, name=0x80485c0 "Fred Smith")

    at employee.c:19

#2  0x08048431 in main () at employee.c:42

 

Bingo! We can see that this program has the infamous buffer overrun bug. The strcpy function does not do any bounds checking or limiting and happily writes past our allotted buffer of eight bytes, which corrupts the next piece of memory occupied by the serial_num structure member.
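
The two corrupted values, 116 at the watchpoint and 26740 in the final output, can be reproduced with a little byte arithmetic. The sketch below replays the copy on a byte array so that nothing is actually overrun. The function name serial_after_copy is hypothetical, and the result assumes a 4-byte int, a little-endian machine such as x86, and serial_num placed directly after name.

```c
#include <stddef.h>
#include <string.h>

/* Same layout as the employee structure in the listing above. */
struct employee {
    char name[8];
    int  serial_num;
};

/*
 * Returns the serial number after the first `bytes_copied` bytes of `name`
 * (terminator included) have been written over a record whose serial
 * number was `serial`. The record is copied into a byte array first, so
 * the overflow is replayed without undefined behavior.
 * Assumptions: 4-byte int, little-endian, serial_num directly after name.
 */
int serial_after_copy(int serial, const char *name, size_t bytes_copied)
{
    unsigned char raw[sizeof(struct employee)];
    struct employee rec;
    size_t available = strlen(name) + 1;    /* characters plus '\0' */

    memset(&rec, 0, sizeof rec);
    rec.serial_num = serial;
    memcpy(raw, &rec, sizeof rec);

    if (bytes_copied > available)
        bytes_copied = available;
    if (bytes_copied > sizeof raw)
        bytes_copied = sizeof raw;
    memcpy(raw, name, bytes_copied);        /* bytes 8 and up land on serial_num */

    memcpy(&rec, raw, sizeof rec);
    return rec.serial_num;
}
```

Under those assumptions, copying nine bytes of "Fred Smith" puts the letter t (116) into the low byte of serial_num, which matches the value reported by the watchpoint. Copying all eleven bytes adds h (104) in the next byte and the terminator after it, giving 116 + 104 * 256 = 26740.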

If you have a reproducible problem and you can find the address that gets corrupted, a watchpoint can reduce the investigating time from days (of setting breakpoints or using print statements) to minutes.

Well, this is great, but what if you don’t have the source code, and/or the program was not built with -g? You can still set hardware watchpoints, but you need to set them directly on an address as in the following example.

The program is simple and changes the value of a global symbol. We’re using a global symbol because we can easily find the address of that regardless of whether or not the program is built with -g.

int a = 5 ;

 

int main()

{

 

  a = 6 ;

 

  return 0 ;

}

 

Now, inside of GDB we can find the address of the variable a and set a watchpoint on that address:

(gdb) print &a

$1 = (<data variable, no debug info> *) 0x80493e0

 (gdb) watch (int) *0x80493e0

Hardware watchpoint 1: (int) *134517728

 

The notation here told GDB to watch the contents of the address 0x80493e0 for any changes and to treat those contents as an integer. Be sure to dereference the address with a “*”, or GDB will not set the watchpoint correctly. We can now run the program and see the hardware watchpoint in action:

 (gdb) run

Starting program: /home/wilding/src/Linuxbook/watch

Hardware watchpoint 1: (int) *134517728

Hardware watchpoint 1: (int) *134517728

Hardware watchpoint 1: (int) *134517728

Hardware watchpoint 1: (int) *134517728

Hardware watchpoint 1: (int) *134517728

Hardware watchpoint 1: (int) *134517728

 

Old value = 5

New value = 6

0x08048376 in main ()

(gdb)

 

The watchpoint was triggered several times, but in each case the value stored at the address had not changed, so GDB did not stop the process. Only on the last occurrence did GDB stop the process, because the value changed from 5 to 6.

There are three different types of watchpoints:

 

watch - Cause a break in execution for any write to an address

 

rwatch - Cause a break in execution for any read to an address

 

awatch - Cause a break in execution for a read or write to an address

 

Besides the different memory access attributes, the three types of watchpoints can be used in the same way. There are some situations where a read watchpoint can be useful. One example is called a “late read.” A late read is a situation where a code path reads memory after it has been freed. If you know which block of memory is referenced after it has been freed, a read watchpoint can catch the culprit code path that references the memory.
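
A real late read touches memory that has been returned with free, which is undefined behavior and hard to show safely. The sketch below is therefore a toy: it uses a small static pool in place of malloc and free, so that the read after release is well-defined C. All names here (pool, pool_alloc, pool_release, late_read) are invented for illustration.

```c
#include <stddef.h>

#define POOL_SLOTS 4

/* One slot of a toy allocator that stands in for malloc/free. */
struct slot {
    int in_use;     /* 1 while allocated, 0 after release */
    int value;
};

static struct slot pool[POOL_SLOTS];

/* Hands out the first unused slot, or NULL if the pool is full. */
struct slot *pool_alloc(int value)
{
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].value = value;
            return &pool[i];
        }
    }
    return NULL;
}

/* Marks the slot as released; its memory stays where it is. */
void pool_release(struct slot *s)
{
    s->in_use = 0;
}

/* The bug being modeled: reading a slot that has already been released. */
int late_read(const struct slot *s)
{
    return s->value;
}
```

With a program built from this sketch, setting rwatch pool[0].value after pool_release has returned should make GDB stop inside late_read, provided the hardware supports read watchpoints. The same idea applies to a real heap block once you know its address.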

Note: To delete a watchpoint, use the delete command with the watchpoint number as the argument.

6.6.5. Display Expression on Stop

Throughout a debugging session, you will often find yourself checking the value of certain variables again and again. GDB provides a handy feature for this called displays. A display is an expression that GDB prints after each execution stop. To set a display, use the display command. Here is an example:

(gdb) display a

(gdb) break main

Breakpoint 1 at 0x8048362

(gdb) run

Starting program: /home/wilding/src/Linuxbook/watch

 

Breakpoint 1, 0x08048362 in main ()

1: {<data variable, no debug info>} 134517792 = 5

 

The last line is the display line, and the display item has a number of 1. To delete this display, we use the delete display command:

 (gdb) display

1: {<data variable, no debug info>} 134517792 = 5

(gdb) delete display 1

 

To enable or disable a preset display, use the enable display and disable display commands.

Note: GDB’s GUI brother, DDD (Data Display Debugger), is perfectly suited for using the concepts of displays. Please refer to the section on DDD for more information on displays.

6.6.6. Working with Shared Libraries

GDB has a command that shows the shared libraries that a program links in and where those libraries have been mapped into the process’ address space. If you get an instruction address, you can use this information to find out which library the instruction is in (and eventually the line of code if you wish). It is also useful for confirming that the program is loading the correct libraries.

Use the info sharedlibrary command to see this information:

(gdb) info sharedlibrary

From    To     Syms Read  Shared Object Library

0x40040b40 0x4013b7b4 Yes     /lib/i686/libc.so.6

0x40000c00 0x400139ef Yes     /lib/ld-linux.so.2

(gdb)

 

Shared libraries are like common program extensions. They contain executable code and variables just like an executable, though the libraries can be shared by multiple executables at the same time.
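
Finding the library that contains an instruction address is a simple range lookup. The sketch below hard-codes the two ranges from the info sharedlibrary output shown earlier in this section; the type library_range and the function library_for_address are hypothetical names, and treating the To column as an exclusive upper bound is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of the info sharedlibrary output. */
struct library_range {
    uint32_t from;
    uint32_t to;
    const char *path;
};

/* Ranges copied from the info sharedlibrary output shown earlier. */
static const struct library_range loaded_libraries[] = {
    { 0x40040b40u, 0x4013b7b4u, "/lib/i686/libc.so.6" },
    { 0x40000c00u, 0x400139efu, "/lib/ld-linux.so.2" },
};

/*
 * Returns the path of the library whose range contains `address`,
 * or NULL if the address is outside every listed library.
 * Assumption: `to` is treated as an exclusive upper bound.
 */
const char *library_for_address(uint32_t address)
{
    size_t count = sizeof loaded_libraries / sizeof loaded_libraries[0];

    for (size_t i = 0; i < count; i++) {
        if (address >= loaded_libraries[i].from &&
            address < loaded_libraries[i].to)
            return loaded_libraries[i].path;
    }
    return NULL;
}
```

For example, the address 0x4000d280 seen in the stepi session falls inside the /lib/ld-linux.so.2 range, which agrees with GDB reporting _dl_runtime_resolve as coming from that library. An address such as 0x0804841c belongs to the executable itself, so the lookup finds no library.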

6.6.6.1. Debugging Functions in Shared Libraries

GDB normally does a great job of handling shared libraries that an executable links in. For example, GDB will happily set a breakpoint in a function that exists in a shared library, just as in an executable. There are times, however, when shared libraries get dynamically loaded in an application, which makes it almost impossible for GDB to know what functions could be run before the library is loaded. To illustrate this problem, consider the following two source files:

dyn.c:

#include <stdio.h>

void func2( void )
{
  printf( "This function is not referenced in dyn_main.c\n" );
  return;
}

void func1( void )
{
  printf( "This function is in libdyn.so\n" );
  func2();

  return;
}

 

 

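dyn_main.c:

#include <stdio.h>
#include <dlfcn.h>

int main( void )
{
  void *dlhandle = NULL;
  void (*func1_ref)( void );

  printf( "Dynamically opening libdyn.so ...\n" );

  dlhandle = dlopen( "./libdyn.so", RTLD_NOW );
  if ( dlhandle == NULL )
  {
    /* dlerror() describes why the library could not be loaded */
    fprintf( stderr, "dlopen failed: %s\n", dlerror() );
    goto exit;
  }

  func1_ref = dlsym( dlhandle, "func1" );
  if ( func1_ref == NULL )
  {
    /* Do not call through a NULL pointer if the symbol is missing */
    fprintf( stderr, "dlsym failed: %s\n", dlerror() );
    goto exit;
  }

  func1_ref();

exit:

  return 0;
}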

 

Now compile these modules with the following commands:

penguin> gcc -shared -fPIC -o libdyn.so dyn.c -g
penguin> gcc -o dyn dyn_main.c -g -ldl

 

Now, in a debugging session, let’s say we only want to set a breakpoint in func2(). Attempting this after starting GDB with gdb dyn produces this error:

(gdb) break func2
Function "func2" not defined.

 

The following session lists the shared libraries associated with this executable, first before the program has started and then again after stopping at a breakpoint in main:

(gdb) info sharedlibrary
No shared libraries loaded at this time.
(gdb) break main
Note: breakpoint 1 also set at pc 0x804841c.
Breakpoint 2 at 0x804841c: file dyn_main.c, line 6.
(gdb) run
Starting program: /home/dbehman/book/code/dyn

Breakpoint 1, main () at dyn_main.c:6
6          void *dlhandle = NULL;
(gdb) info sharedlibrary
From        To           Syms Read  Shared Object Library
0x4002beb0  0x4002cde4   Yes        /lib/libdl.so.2
0x40043b40  0x4013e7b4   Yes        /lib/i686/libc.so.6
0x40000c00  0x400139ef   Yes        /lib/ld-linux.so.2
(gdb) break func2
Function "func2" not defined.
(gdb)

 

In the first part of the output, we see that no shared libraries are loaded. This is because the program has not actually started. To get the program running, we set a breakpoint in the main function and run the program using the GDB command, run. When the program is running, we can then see the information about the shared libraries, and as you can see, libdyn.so is not listed. This is why the break func2 attempt failed once again.
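
The same before-and-after difference can be observed from inside the process. The sketch below is an illustration only, assuming a Linux system with /proc/self/maps; the helper name library_is_mapped is hypothetical and does not appear in the original program. It reports whether any mapping in the current process has a path containing a given string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if any mapping listed in /proc/self/maps
 * (Linux-specific) has a path containing `name`, 0 if none does, and -1
 * if the maps file cannot be opened.
 */
int library_is_mapped(const char *name)
{
    char line[4352];
    int mapped = 0;
    FILE *maps;

    assert(name != NULL && name[0] != '\0');

    maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    while (fgets(line, sizeof line, maps) != NULL) {
        /* A file-backed mapping's path starts at the first '/' on the line. */
        const char *path = strchr(line, '/');

        if (path != NULL && strstr(path, name) != NULL) {
            mapped = 1;
            break;
        }
    }

    fclose(maps);
    return mapped;
}
```

If dyn_main.c called library_is_mapped( "libdyn.so" ) just before dlopen(), it would be expected to return 0, and after a successful dlopen() it would be expected to return 1. That mirrors why GDB cannot resolve func2 until the library has actually been loaded.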

From the preceding source code, we know that libdyn.so will be dynamically loaded as the program runs (using dlopen). This is important because to set a breakpoint in a library that has not been loaded, we need to tell GDB to stop execution when the controlled program loads a new shared library. We can tell GDB to do this with the command set stop-on-solib-events 1. The current state of this flag can be shown with the show stop-on-solib-events command:

(gdb) show stop-on-solib-events
Stopping for shared library events is 0.
(gdb) set stop-on-solib-events 1
(gdb) show stop-on-solib-events
Stopping for shared library events is 1.
(gdb)

 

Now let’s tell GDB to let the program continue:

(gdb) cont
Continuing.
Dynamically opening libdyn.so ...
Stopped due to shared library event
(gdb) backtrace
#0  0x4000dd60 in _dl_debug_state_internal () from /lib/ld-linux.so.2
#1  0x4000d7fa in _dl_init_internal () from /lib/ld-linux.so.2
#2  0x4013a558 in dl_open_worker () from /lib/i686/libc.so.6
#3  0x4000d5b6 in _dl_catch_error_internal () from /lib/ld-linux.so.2
#4  0x4013a8ff in _dl_open () from /lib/i686/libc.so.6
#5  0x4002bfdb in dlopen_doit () from /lib/libdl.so.2
#6  0x4000d5b6 in _dl_catch_error_internal () from /lib/ld-linux.so.2
#7  0x4002c48a in _dlerror_run () from /lib/libdl.so.2
#8  0x4002c022 in dlopen@@GLIBC_2.1 () from /lib/libdl.so.2
#9  0x08048442 in main () at dyn_main.c:11
(gdb) break func2
Breakpoint 3 at 0x4001b73e: file dyn.c, line 5.
(gdb)

 

As the stack trace shows, GDB stopped deep inside the dlopen() library call, in the dynamic loader. By that point the symbol table for libdyn.so had been loaded, so we were able to set a breakpoint in the desired function. We could now continue by issuing the cont command, but we would encounter more stops due to shared library events. Because we’ve accomplished our goal of being able to set a breakpoint in func2(), let’s turn off the stop-on-solib-events flag and then continue:

(gdb) set stop-on-solib-events 0
(gdb) cont
Continuing.
This function is in libdyn.so

Breakpoint 3, func2 () at dyn.c:5
5         printf( "This function is not referenced in dyn_main.c\n" );
(gdb)

 

Mission accomplished!

 

 
