gperftools: Google's open-source C++ performance analysis tool

Original

Lily莉莉

2020-02-21 06:22

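gperftools is a set of tools provided by Google. One of its components is the CPU profiler, which analyzes a program's performance and locates its bottlenecks.

Installation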

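gperftools: http://code.google.com/p/gperftools/downloads/list

libunwind: http://download.savannah.gnu.org/releases/libunwind/

On 64-bit operating systems libunwind must be installed; the officially recommended version is libunwind-0.99-beta.

Installation procedure: ./configure [--disable-shared] && make && make install

Graphviz is an open-source toolkit started at AT&T Labs for drawing graphs described in the DOT language. gperftools relies on it to render graphical analysis results.

Install command: yum install graphviz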

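1. Building the libunwind library

Because the system used here is x86_64 Linux, the libunwind library has to be installed.

Installation is simple and follows the usual configure, make, make install routine.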

wget http://download.savannah.gnu.org/releases/libunwind/libunwind-0.99-beta.tar.gz

tar xvzf libunwind-0.99-beta.tar.gz

cd libunwind-0.99-beta

./configure

make

make install

Because libunwind is installed under /usr/local/lib by default, that directory needs to be added to the system's dynamic linker cache.

echo "/usr/local/lib" > /etc/ld.so.conf.d/usr_local_lib.conf
/sbin/ldconfig

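The latest libunwind release is 1.0.1, so why not use it? The INSTALL file of google perftools explains: libunwind versions older than 0.99-beta may not work properly with perftools, while versions newer than 0.99-beta may contain code that is incompatible with perftools (libunwind calls malloc, which can lead to deadlock). libunwind has a number of problems with perftools on the x86_64 platform. They do not affect the core tcmalloc library, but they do affect the perftools tools such as cpu-profiler, heap-checker, and heap-profiler.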

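2. Building google-perftools

Because only the tcmalloc functionality is needed here, the other tools in google-perftools are not built. Note that this configuration disables the CPU profiler; to use the CPU profiler described in the Usage section, omit --disable-cpu-profiler and --enable-minimal.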

wget http://gperftools.googlecode.com/files/google-perftools-1.9.1.tar.gz

tar xvzf google-perftools-1.9.1.tar.gz

cd google-perftools-1.9.1

./configure --disable-cpu-profiler --disable-heap-profiler --disable-heap-checker --enable-minimal --disable-dependency-tracking

make

make install

/sbin/ldconfig

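Usage

1. Include the header <google/profiler.h> in the target program and link against the libprofiler library (on 64-bit operating systems, also link libunwind). Call ProfilerStart() at the start and ProfilerStop() at the end of the code to be analyzed.

2. Compile, link, and run the program.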
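One way to make sure the stop call is never forgotten is to tie the start/stop pair to a scope. The sketch below is only an illustration: ScopedRegion is a hypothetical helper, not part of gperftools, and it takes the two actions as callables so that it compiles without the library. The comment at the end shows how it might be wired to ProfilerStart() and ProfilerStop().

```cpp
#include <functional>
#include <utility>

// Hypothetical helper (not part of gperftools): runs `start` on construction
// and `stop` on destruction, so the measured region is always closed, even
// when the enclosing scope is left early.
class ScopedRegion {
public:
    ScopedRegion(std::function<void()> start, std::function<void()> stop)
        : stop_(std::move(stop)) {
        start();
    }
    ~ScopedRegion() { stop_(); }

    // Non-copyable: the stop action must run exactly once.
    ScopedRegion(const ScopedRegion&) = delete;
    ScopedRegion& operator=(const ScopedRegion&) = delete;

private:
    std::function<void()> stop_;
};

// With gperftools installed, the intended use would look roughly like this
// (assumes <google/profiler.h> is included and -lprofiler is linked):
//
//   ScopedRegion region([] { ProfilerStart("my.prof"); },
//                       [] { ProfilerStop(); });
//   // ... code to be analyzed ...
```

The design choice here is simply RAII: the destructor plays the role of the explicit ProfilerStop() call.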

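Analyzing the output

The pprof script analyzes a profile file and outputs the results, in either text or graphical form.

For example, with demo as the target program and my.prof as the profile file:

Text output: pprof --text ./demo my.prof > profile.txt

Graphical output: pprof --pdf ./demo my.prof > profile.pdf

The CPU time attributed to a function is reported in two parts:

1. The CPU time consumed by the whole function, including the CPU time of the functions it calls

2. The CPU time excluding calls to other functions (inlined functions are the exception: their time counts toward the caller)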
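The relationship between the two figures can be sketched with a small call tree. This is not how pprof is implemented; CallNode and cumulativeSamples are hypothetical names, and the sample counts used with them are purely illustrative.

```cpp
#include <string>
#include <vector>

// Hypothetical call-tree node: `local` counts the samples taken while the
// function's own instructions were executing (the second figure above).
struct CallNode {
    std::string name;
    int local;
    std::vector<CallNode> callees;
};

// The first figure: local samples plus the cumulative samples of every callee.
int cumulativeSamples(const CallNode& node) {
    int total = node.local;
    for (const CallNode& callee : node.callees) {
        total += cumulativeSamples(callee);
    }
    return total;
}
```

For instance, a function with 0 local samples that calls two functions with 40 and 82 local samples would have a cumulative count of 122.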

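About the text-style output

Column	Description
1	Number of samples (excluding calls to other functions)
2	Percentage of samples (excluding calls to other functions)
3	Running total percentage of samples so far (excluding calls to other functions)
4	Number of samples (including calls to other functions)
5	Percentage of samples (including calls to other functions)
6	Function name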

About the graphical output

1. Nodes

Each node represents a function. The node data format is:

Class Name

Method Name

local (percentage)

of cumulative (percentage)

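local time is the CPU time consumed by the instructions the function executes directly (including inlined functions). Profiling is done by sampling; the default is 100 samples per second, so one sample corresponds to 10 milliseconds, i.e. the time unit is 10 milliseconds;

cumulative time is the sum of the local time and the time spent in the functions it calls;

if the cumulative time equals the local time, the cumulative entry is not printed.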
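The conversion from samples to time is simple arithmetic, sketched below. samplesToMillis is a hypothetical helper name. The sampling rate is a parameter on the assumption that it may differ from the default of 100 samples per second stated above.

```cpp
// Default sampling rate stated above: 100 samples per second.
constexpr int kDefaultSamplesPerSecond = 100;

// Converts a sample count into approximate CPU time in milliseconds.
// Each sample stands for 1000 / samplesPerSecond milliseconds.
constexpr double samplesToMillis(int samples,
                                 int samplesPerSecond = kDefaultSamplesPerSecond) {
    return samples * 1000.0 / samplesPerSecond;
}
```

At the default rate, one sample maps to 10 milliseconds and 40 samples map to about 400 milliseconds.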

2. Directed edges

An edge points from the caller to the callee. The time shown on the edge is the CPU time consumed by the callee.

Example

The code is shown below. As can be seen, CPU consumption is concentrated in func1() and func2(), and func2() takes roughly twice as long as func1().

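#include <google/profiler.h>
#include <iostream>

using namespace std;

// Busy loop: 100,000 increments per call.
void func1() {
    int i = 0;
    while (i < 100000) {
        ++i;
    }
}

// Busy loop: 200,000 increments per call, about twice the work of func1().
void func2() {
    int i = 0;
    while (i < 200000) {
        ++i;
    }
}

// Calls func1() and func2() 1000 times each.
void func3() {
    for (int i = 0; i < 1000; ++i) {
        func1();
        func2();
    }
}

int main() {
    ProfilerStart("my.prof");  // name of the profile file to generate
    func3();
    ProfilerStop();  // stop profiling
    return 0;
}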

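Then compile, link, and run the program, and use pprof to generate the analysis results:

g++ -o demo demo.cpp -lprofiler -lunwind

./demo

pprof --text ./demo my.prof > output.txt

pprof --pdf ./demo my.prof > output.pdf

Looking at the results, the program has 122 time samples in total. func1() accounts for 40 samples, about 400 milliseconds, and func2() accounts for 82 samples, about 820 milliseconds.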

Total: 122 samples

82 67.2% 67.2% 82 67.2% func2

40 32.8% 100.0% 40 32.8% func1

0 0.0% 100.0% 122 100.0% __libc_start_main

0 0.0% 100.0% 122 100.0% _start

0 0.0% 100.0% 122 100.0% func3

0 0.0% 100.0% 122 100.0% main
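The percentage columns in this listing can be reproduced from the sample counts alone. The sketch below is an illustration, not pprof's own code: percentOf and percentColumns are hypothetical helpers that assume a positive total and round to one decimal place, matching the listing.

```cpp
#include <cmath>
#include <vector>

// Columns 2 and 3 of one row of the text output.
struct PercentColumns {
    double local;    // share of this function's own samples
    double running;  // running total of the local shares so far
};

// Percentage of `part` in `total`, rounded to one decimal place.
// Assumes total > 0.
double percentOf(int part, int total) {
    return std::round(part * 1000.0 / total) / 10.0;
}

// Computes columns 2 and 3 for rows listed in output order.
std::vector<PercentColumns> percentColumns(const std::vector<int>& localSamples,
                                           int total) {
    std::vector<PercentColumns> rows;
    int runningSamples = 0;
    for (int samples : localSamples) {
        runningSamples += samples;
        rows.push_back({percentOf(samples, total),
                        percentOf(runningSamples, total)});
    }
    return rows;
}
```

Feeding it the local counts 82 and 40 with a total of 122 gives 67.2% and 32.8% for the local column, and 67.2% and 100.0% for the running column.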
