Valgrind Usage Guide and Error Analysis

 

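Valgrind is GPL-licensed software for memory debugging and code profiling of Linux programs (for x86, amd64 and ppc32). You run your program inside its environment to monitor memory usage, such as malloc and free in C or new and delete in C++. With the Valgrind tool suite you can automatically detect many memory-management and threading bugs, spend less time hunting for them, and make your programs more robust.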

Main features of Valgrind
The Valgrind suite contains several tools, such as Memcheck, Cachegrind, Helgrind, Callgrind and Massif. The role of each tool is described below:

Memcheck mainly detects the following program errors:

Use of uninitialised memory
Reading/writing memory after it has been free'd
Reading/writing off the end of malloc'd blocks
Reading/writing inappropriate areas on the stack
Memory leaks, where pointers to malloc'd blocks are lost forever
Mismatched use of malloc/new/new [] vs free/delete/delete []
Overlapping src and dst pointers in memcpy() and related functions
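Callgrind
Callgrind collects data while the program runs, such as function call relationships, and can optionally perform cache simulation. When the run ends it writes the profile data to a file; callgrind_annotate converts the contents of that file into a readable form.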

Cachegrind
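It simulates the CPU's first-level caches (I1, D1) and the L2 second-level cache, and can pinpoint cache misses and hits in the program. If needed, it can also report the number of cache misses, the number of memory references, and the number of instructions executed for each line of code, each function, each module, and the whole program. This is a great help when optimising a program.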

Helgrind
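It is mainly used to check for race conditions in multithreaded programs. Helgrind looks for memory regions that are accessed by more than one thread without consistent locking. Such regions are often places where threads have lost synchronisation, and they lead to bugs that are hard to track down. Helgrind implements the race-detection algorithm known as "Eraser", with further improvements that reduce the number of spurious reports.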

Massif
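A heap profiler. It measures how much heap memory the program uses and reports the sizes of heap blocks, heap administration blocks, and the stack. Massif can help us reduce memory use; on modern systems with virtual memory it can also speed up our programs and lower the chance that they spend time in swap space.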

Installing Valgrind
1. Download the latest release, valgrind-3.2.3.tar.bz2, from www.valgrind.org
2. Unpack the archive: tar -jxvf valgrind-3.2.3.tar.bz2
3. Unpacking creates the directory valgrind-3.2.3
4. cd valgrind-3.2.3
5. ./configure
6. make; make install

Note: do not move Valgrind to a directory other than the one given by --prefix. Doing so causes puzzling errors, most of them when Valgrind handles fork/exec calls.

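1. Checking for memory errors:
Suppose we have a program, sec_infod, compiled with gcc -g (the binary is called a.out in the commands below). It is normally run as:
#./a.out
To run it under Valgrind's memory checker, invoke it like this:
#valgrind --leak-check=full --show-reachable=yes --trace-children=yes ./a.out
(Appending 2>logfile is a good idea: Valgrind prints a lot of messages to stderr while the program runs.)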

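Here --leak-check=full requests a full memory-leak check with details of each leak, --show-reachable=yes also reports blocks that are still reachable when the program exits, and --trace-children=yes follows child processes.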

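If your program exits normally, Valgrind prints the memory-leak information when it exits. If your program is a daemon, that is not a problem either: just kill the memcheck process from another terminal (Valgrind uses the memcheck tool by default, i.e. the default option is --tool=memcheck):
#killall memcheck
This kills our program (./a.out).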

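2. Checking code coverage and performance bottlenecks:
Run the program under Valgrind's callgrind tool:
#valgrind --tool=callgrind ./sec_infod

This creates a file named callgrind.out.pid in the current directory (in this run, callgrind.out.19689). To stop the program:
#killall callgrind
Then look at the results:
#callgrind_annotate --auto=yes callgrind.out.19689   >log
#vim log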
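
To have something concrete to profile, it helps to start with a program that contains one obviously expensive function. The sketch below is hypothetical (the function names prefix_total_slow and prefix_total_fast are not from the original program); both functions compute the same sum of prefix sums, one in quadratic time and one in linear time.

```c
#include <assert.h>
#include <stddef.h>

/* Deliberately expensive: recomputes every prefix sum from scratch, O(n^2). */
unsigned long prefix_total_slow(const unsigned *v, size_t n)
{
    assert(v != NULL || n == 0);
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j <= i; j++) {
            total += v[j];
        }
    }
    return total;
}

/* Same result in O(n): carries a running prefix sum forward. */
unsigned long prefix_total_fast(const unsigned *v, size_t n)
{
    assert(v != NULL || n == 0);
    unsigned long total = 0;
    unsigned long prefix = 0;
    for (size_t i = 0; i < n; i++) {
        prefix += v[i];
        total += prefix;
    }
    return total;
}
```

If both functions are called on a large array from main and the program is run under --tool=callgrind, the annotated output would be expected to attribute most of the executed instructions to prefix_total_slow, which is the kind of bottleneck this tool is meant to expose.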

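3. Valgrind options
          --log-fd=N
By default, output goes to standard error (stderr). With --log-fd=8, for example, it goes to file descriptor 8.
          --log-file=filename
Writes the output to a file named filename.PID, where PID is the process ID of the program being run. Use --log-file-exactly=filename to write to a file named exactly filename.
          --log-file-qualifier=<VAR>
Takes the value of an environment variable and uses it for the output file name, e.g. --log-file-qualifier=$FILENAME
          --log-socket=IP:PORT
Sends the output over the network to the given IP:PORT.
          --error-limit=no
Removes the cap on the number of errors reported. The cap is on by default (--error-limit=yes).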
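          --tool=<toolname> [default: memcheck]
--tool=memcheck tells Valgrind to analyse the program with the memcheck tool.
     --leak-check=yes
Asks for detailed information about each leak.
     --trace-children=<yes|no> [default: no]
Follows execution into child processes; by default they are not traced.
     --xml=<yes|no> [default: no]
Emits the output in XML format; only available with memcheck.
     --gen-suppressions=<yes|no|all> [default: no]
If set to yes, Valgrind pauses after each error it reports and asks whether to print a suppression for it before continuing.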

For more options, see: http://www.valgrind.org/docs/manual/manual-core.html . Default options can be placed in the ~/.valgrindrc file.

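The sections above cover two Valgrind tools, memcheck and callgrind. Valgrind has several others: "cachegrind", for examining cache usage; "helgrind", for detecting races between threads over shared resources; and so on.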

Error Analysis

1. The default tool is memcheck

2. Output to an XML file: valgrind --leak-check=full --xml=yes --log-file="log.xml" myprog arg1 arg2

3. Error explanations

3.1 Illegal read / Illegal write errors

For example:

Invalid read of size 4
at 0x40F6BBCC: (within /usr/lib/libpng.so.2.1.0.9)
by 0x40F6B804: (within /usr/lib/libpng.so.2.1.0.9)
by 0x40B07FF4: read_png_image(QImageIO *) (kernel/qpngio.cpp:326)
by 0x40AC751B: QImageIO::read() (kernel/qimage.cpp:3621)
Address 0xBFFFF0E0 is not stack'd, malloc'd or free'd
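
One common way to get a report of this kind in your own code is an off-by-one access to a heap array. The sketch below is a hypothetical illustration (read_last and its off_by_one switch are not from the original): the default path reads the last valid element, and the switched path reads one element past the end.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Fills a heap array with 0..n-1 and returns one element.
 * off_by_one == 0: reads a[n-1], the last valid element.
 * off_by_one != 0: reads a[n], one past the end (the bug to look for).
 */
int read_last(size_t n, int off_by_one)
{
    assert(n > 0);
    int *a = malloc(n * sizeof *a);
    if (a == NULL) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        a[i] = (int)i;
    }
    size_t idx = off_by_one ? n : n - 1;
    int value = a[idx];   /* invalid read when idx == n */
    free(a);
    return value;
}
```

Calling read_last(4, 1) from a program built with gcc -g and run under valgrind would be expected to produce an "Invalid read" report whose address lies just after the allocated block, while read_last(4, 0) should run cleanly.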
This error occurs because the program read or wrote memory that Memcheck believes it should not be accessing.

3.2 Use of uninitialised values

For example:

Conditional jump or move depends on uninitialised value(s)
at 0x402DFA94: _IO_vfprintf (_itoa.h:49)
by 0x402E8476: _IO_printf (printf.c:36)
by 0x8048472: main (tests/manuel1.c:8)
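This error occurs because uninitialised data was used. Two situations commonly cause it:
Local variables in the program that were never initialised;
In C, memory obtained from malloc that was never initialised; in C++, objects created with new whose members were never initialised.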
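
The malloc case can be sketched as follows. This is a hypothetical illustration (count_nonzero and scan_fresh_buffer are names invented here): the same buffer is scanned either after being zeroed or exactly as malloc returned it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts non-zero bytes; the `if` below is a conditional jump on buffer data. */
size_t count_nonzero(const unsigned char *buf, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (buf[i] != 0) {
            count++;
        }
    }
    return count;
}

/*
 * initialise != 0: the buffer is zeroed first, so the result is always 0.
 * initialise == 0: the buffer is scanned as malloc returned it (the bug).
 */
size_t scan_fresh_buffer(size_t n, int initialise)
{
    assert(n > 0);
    unsigned char *buf = malloc(n);
    if (buf == NULL) {
        return 0;
    }
    if (initialise) {
        memset(buf, 0, n);
    }
    size_t count = count_nonzero(buf, n);
    free(buf);
    return count;
}
```

With initialise set to 0, the comparison inside count_nonzero depends on bytes that were never written, which is the situation behind "Conditional jump or move depends on uninitialised value(s)". Zeroing the buffer with memset, or allocating it with calloc, removes the dependency.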

3.3 Illegal frees
For example:
Invalid free()
at 0x4004FFDF: free (vg_clientmalloc.c:577)
by 0x80484C7: main (tests/doublefree.c:10)
Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
at 0x4004FFDF: free (vg_clientmalloc.c:577)
by 0x80484C7: main (tests/doublefree.c:10)
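
The report above shows the same block being passed to free() twice. A common defensive pattern, sketched here under the assumption that there is a single owning pointer (the helper name release is hypothetical), is to clear the pointer as soon as the block is freed, so that a repeated call becomes free(NULL), which the C standard defines as a no-op.

```c
#include <assert.h>
#include <stdlib.h>

/* Frees *p and clears it, so calling release() again on the same
 * variable passes NULL to free(), which does nothing. */
void release(char **p)
{
    assert(p != NULL);
    free(*p);
    *p = NULL;
}
```

This only protects the variable that is passed in. If other copies of the pointer exist elsewhere in the program, they still point at freed memory, and Memcheck remains the way to find those.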

3.4 When a block is freed with an inappropriate deallocation function
For example:
Mismatched free() / delete / delete []
at 0x40043249: free (vg_clientfuncs.c:171)
by 0x4102BB4E: QGArray::~QGArray(void) (tools/qgarray.cpp:149)
by 0x4C261C41: PptDoc::~PptDoc(void) (include/qmemarray.h:60)
by 0x4C261F0E: PptXml::~PptXml(void) (pptxml.cc:44)
Address 0x4BB292A8 is 0 bytes inside a block of size 64 alloc'd
at 0x4004318C: operator new[](unsigned int) (vg_clientfuncs.c:152)
by 0x4C21BC15: KLaola::readSBStream(int) const (klaola.cc:314)
by 0x4C21C155: KLaola::stream(KLaola::OLENode const *) (klaola.cc:416)
by 0x4C21788F: OLEFilter::convert(QCString const &) (olefilter.cc:272)

  • If allocated with malloc, calloc, realloc, valloc or memalign, you must deallocate with free.

  • If allocated with new[], you must deallocate with delete[].

  • If allocated with new, you must deallocate with delete.

Linux may tolerate the errors above, but they cause problems when the code is ported to other platforms.

3.5 Passing system call parameters with inadequate read/write permissions


Memcheck checks all parameters to system calls:

  • It checks all the direct parameters themselves.

  • Also, if a system call needs to read from a buffer provided by your program, Memcheck checks that the entire buffer is addressable and has valid data, i.e., it is readable.

  • Also, if the system call needs to write to a user-supplied buffer, Memcheck checks that the buffer is addressable.

For example:

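#include <stdlib.h>
#include <unistd.h>
int main( void )
{
    char* arr = malloc(10);             /* 10 bytes, never initialised */
    int* arr2 = malloc(sizeof(int));    /* one int, never initialised */
    write( 1 /* stdout */, arr, 10 );   /* passes uninitialised bytes to write() */
    exit(arr2[0]);                      /* exits with an uninitialised status */
}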

Error messages:

Syscall param write(buf) points to uninitialised byte(s)
at 0x25A48723: __write_nocancel (in /lib/tls/libc-2.3.3.so)
by 0x259AFAD3: __libc_start_main (in /lib/tls/libc-2.3.3.so)
by 0x8048348: (within /auto/homes/njn25/grind/head4/a.out)
Address 0x25AB8028 is 0 bytes inside a block of size 10 alloc'd
at 0x259852B0: malloc (vg_replace_malloc.c:130)
by 0x80483F1: main (a.c:5)
Syscall param exit(error_code) contains uninitialised byte(s)
at 0x25A21B44: __GI__exit (in /lib/tls/libc-2.3.3.so)
by 0x8048426: main (a.c:8)

Invalid arguments were passed to system calls.
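
A corrected variant of the write() case might look like the sketch below. The function name write_filled and its parameters are hypothetical; the point is only that every byte of the buffer is given a defined value before it reaches the system call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Allocates n bytes, fills them with a known value, and writes them to fd.
 * Returns the number of bytes written, or -1 on failure.
 */
ssize_t write_filled(int fd, size_t n, char fill)
{
    assert(n > 0);
    char *buf = malloc(n);
    if (buf == NULL) {
        return -1;
    }
    memset(buf, fill, n);                 /* every byte is now defined */
    ssize_t written = write(fd, buf, n);  /* the buffer is fully readable */
    free(buf);
    return written;
}
```

The exit() case is fixed the same way: assign a value to the integer before using it as the exit status.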

3.6 Overlapping source and destination blocks

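The following C library functions copy data from one block of memory to another: memcpy(), strcpy(), strncpy(), strcat(), strncat(). The source and destination blocks must not overlap.

For example: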

==27492== Source and destination overlap in memcpy(0xbffff294, 0xbffff280, 21)
==27492== at 0x40026CDC: memcpy (mc_replace_strmem.c:71)
==27492== by 0x804865A: main (overlap.c:40)
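
When the two regions really do need to overlap, for instance when shifting data within one buffer, the standard library provides memmove(), which is defined for overlapping blocks. A minimal sketch, with a hypothetical helper named drop_prefix:

```c
#include <assert.h>
#include <string.h>

/*
 * Removes the first k characters of s in place.
 * Source (s + k) and destination (s) overlap, so memmove is used
 * rather than memcpy or strcpy.
 */
void drop_prefix(char *s, size_t k)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (k > len) {
        k = len;
    }
    memmove(s, s + k, len - k + 1);   /* +1 copies the terminating NUL */
}
```

Replacing the memmove call with memcpy would be the kind of code that triggers the "Source and destination overlap" report shown above.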

 

3.7 Memory leak detection

Memcheck reports blocks that were not freed in the following categories:

Still reachable: A pointer to the start of the block is found. This usually indicates programming sloppiness. Since the block is still pointed at, the programmer could, at least in principle, free it before program exit. Because these are very common and arguably not a problem, Memcheck won't report such blocks unless --show-reachable=yes is specified.

 

Possibly lost, or "dubious": A pointer to the interior of the block is found. The pointer might originally have pointed to the start and have been moved along, or it might be entirely unrelated. Memcheck deems such a block as "dubious", because it's unclear whether or not a pointer to it still exists.

 

Definitely lost, or "leaked": The worst outcome is that no pointer to the block can be found. The block is classified as "leaked", because the programmer could not possibly have freed it at program exit, since no pointer to it exists. This is likely a symptom of having lost the pointer at some earlier point in the program.
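
The three categories can be reproduced with a few lines of C. The sketch below is a hypothetical illustration (the globals g_start and g_interior and the functions forget_block and make_leaks are invented here), and it assumes the program is built without optimisation so the compiler does not remove the allocations.

```c
#include <assert.h>
#include <stdlib.h>

char *g_start;     /* global: holds a pointer to the start of a block */
char *g_interior;  /* global: holds a pointer into the middle of a block */

/* Allocates a block and returns without freeing or publishing the pointer. */
static void forget_block(size_t n)
{
    char *p = malloc(n);
    if (p != NULL) {
        p[0] = 'x';   /* touch the block so the allocation is actually used */
    }
}

/* Leaves one unfreed block in each of the three states. */
void make_leaks(void)
{
    g_start = malloc(32);     /* start pointer survives: still reachable */

    char *block = malloc(32);
    assert(block != NULL);
    g_interior = block + 8;   /* only an interior pointer survives: possibly lost */

    forget_block(32);         /* no pointer survives: definitely lost */
}
```

If make_leaks is called from main and the program is run with --leak-check=full --show-reachable=yes, the leak summary would be expected to list one block under each heading, though the exact classification can vary with compiler and Valgrind version.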

 
