Profiling Tips on Linux

Using a profiler to see where a program spends most of its time is important, because it lets you optimize in the right place.

zoom
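A graphical profiler that is very simple and easy to use; a getting-started tutorial is available. Compile with `-O2 -g` so that optimization is enabled and symbol (debug) information is kept. The simplest workflow is to launch zoom, click start, immediately run the program you want to profile, and click stop once it has finished. The result is shown as a tree in which the execution time of each function and its call relationships are clearly visible. You can also view the source and assembly code, and zoom offers optimization hints.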
Gprof
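1. Add the `-pg` flag to both the compiler flags (CXXFLAGS) and the linker flags (LDFLAGS);
2. Run the program normally. It runs slower than a build without `-pg`, and by default it writes a gmon.out file;
3. Run gprof with the executable from step 2 as its argument and redirect the output: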
```
$ gprof myprog > profile.txt
```
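4. Analyze profile.txt, which contains two tables: the flat profile and the call graph. The flat profile shows the total execution time and call count of each function. The call graph shows the call relationships between functions and the time spent along them, which is very useful when optimizing.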
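
The steps above can be strung together in a small shell function. This is only a minimal sketch, assuming a single C++ source file in the current directory; the helper name `gprof_run` and the use of the `CXX` variable (defaulting to `g++`) are illustrative choices, not part of gprof itself.

```shell
# Hypothetical helper: build with -pg, run, and produce profile.txt.
# Assumes the source file lives in the current directory.
gprof_run() {
    src=$1          # e.g. myprog.cpp
    bin=${src%.*}   # executable name derived from the source name
    shift           # remaining arguments are passed to the program

    # Step 1: -pg is needed both when compiling and when linking.
    "${CXX:-g++}" -O2 -g -pg "$src" -o "$bin" || return 1

    # Step 2: a normal run writes gmon.out into the current directory.
    "./$bin" "$@" || return 1

    # Step 3: combine the executable and gmon.out into a text report.
    gprof "./$bin" gmon.out > profile.txt
}
```

For example, `gprof_run myprog.cpp arg1 arg2` would leave the report in profile.txt. Note that gmon.out is only written when the program exits normally, so a run that crashes or is killed produces no profile.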
The valgrind tool suite
- memcheck: detects memory leaks. Compile with -g and do not enable -O optimization. memcheck is valgrind's default tool.
```
valgrind --leak-check=yes myprog arg1 arg2
```
  Reference for the error messages: http://valgrind.org/docs/manual/mc-manual.html#mc-manual.errormsgs
- cachegrind: analyzes cache hits and branch prediction. It is best to compile with -O optimization when using this tool.
```
valgrind --tool=cachegrind --branch-sim=yes prog
```
  Detailed results are written to the file cachegrind.out.<pid>; the --cachegrind-out-file option changes the output file. Then analyze the output file with:
```
cg_annotate --auto=yes <filename>
```
  The graphical tool KCachegrind can also open the output file.
- callgrind
  
  Run with --tool=callgrind; compile the program with -g -O2.
- massif
  
  Profiles heap memory; view the results graphically with massif-visualizer.
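
Callgrind and massif are invoked in much the same way as cachegrind. A minimal sketch follows; the function names `callgrind_run` and `massif_run` and the fixed output file names are assumptions made for illustration, while `--callgrind-out-file`, `--massif-out-file`, `callgrind_annotate` and `ms_print` ship with valgrind.

```shell
# Hypothetical helpers; pass the program and its arguments as parameters.
callgrind_run() {
    # Record call-graph costs into a fixed file instead of callgrind.out.<pid>.
    valgrind --tool=callgrind --callgrind-out-file=callgrind.out "$@" || return 1
    # Print the per-function cost; KCachegrind can open the same file.
    callgrind_annotate callgrind.out
}

massif_run() {
    # Record heap usage snapshots into a fixed file instead of massif.out.<pid>.
    valgrind --tool=massif --massif-out-file=massif.out "$@" || return 1
    # ms_print renders a text graph of heap usage over time.
    ms_print massif.out
}
```

A call such as `callgrind_run ./myprog arg1` or `massif_run ./myprog arg1` would print the text report to the terminal; massif.out can also be opened in massif-visualizer.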
perf
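The performance analysis tool that comes with Linux.
- `perf list`: lists the events that can be sampled.
- `perf stat [<options>] [<command>]` or `perf stat [<options>] [-p pid] | [-t tid]`: prints a very compact summary of the whole run, including page faults, branch misses, CPU utilization, and the number of context switches. It can also target a process or thread that is already running.
- `perf top`: monitors all processes on the system.
- `perf record -e cpu-clock -g [<command>]`: collects samples showing how much time each part of the program takes.
- `perf report`: generates a report from the data collected by perf record.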
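
The record and report steps are usually run back to back. A minimal sketch, assuming the helper name `perf_hotspots` (hypothetical) and an explicit perf.data file name; the `-o`, `-i` and `--stdio` options are standard perf options.

```shell
# Hypothetical helper; pass the program and its arguments as parameters.
perf_hotspots() {
    # Sample on the cpu-clock event and record call graphs (-g).
    perf record -e cpu-clock -g -o perf.data "$@" || return 1
    # Print a plain-text report instead of opening the interactive viewer.
    perf report -i perf.data --stdio
}
```

For example, `perf_hotspots ./myprog arg1` would print the hottest functions with their call chains. Building the program with -g lets perf resolve symbols to function names.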
gcov