go tool pprof与trace的简单使用

news/2024/11/17 23:28:49/

pprof

pprof是Google的程序采样分析工具,其使用分为,采样和分析。Go 语言内置了pprof。
Go 语言库中有两种方式使用pprof,一种是通过net/http/pprof库,另一种是通过runtime/pprof库。

net/http/pprof的简单使用

引入 _ "net/http/pprof"库,开启http://localhost:6060端口的服务,注意其ListenAndServeaddr是可以指定的,handler参数为nil。

package mainimport ("math/rand""net/http"_ "net/http/pprof""runtime""sync""time"
)func main() {runtime.SetMutexProfileFraction(1) // 开启对锁调用的跟踪runtime.SetBlockProfileRate(1)     // 开启对阻塞的跟踪go func() {if err := http.ListenAndServe(":6060", nil); err != nil {panic(err)}}()wg := new(sync.WaitGroup)randFunction(wg)wg.Wait()
}func randFunction(wg *sync.WaitGroup) {rand.Seed(time.Now().UnixNano())for i := 0; i < 100; i++ {wg.Add(1)go func() {defer wg.Done()s := make([]int, rand.Intn(100000))time.Sleep(time.Second * time.Duration(rand.Intn(999999)))_ = s}()}
}

启动程序后,打开浏览器输入:http://localhost:6060/debug/pprof/。可见如下页面:
在这里插入图片描述
配置文件说明:
allocs(分配): 所有过去内存分配的样本
block(阻塞):导致同步基元阻塞的堆栈跟踪
cmdline(命令行程序):当前程序的命令行调用
goroutine(协程信息):当前所有 goroutines 的堆栈跟踪
heap(堆信息):活动对象的内存分配采样。您可以指定 gc GET 参数以在获取堆示例之前运行 GC
mutex(锁信息):争用互斥锁持有者的堆栈跟踪
profile(配置信息):CPU 配置文件。您可以在秒 GET 参数中指定持续时间。获取配置文件后,使用 go 工具 pprof 命令调查配置文件。
threadcreate(线程创建):导致创建新的操作系统线的程堆栈跟踪
trace(跟踪):当前程序执行的跟踪。您可以在秒 GET 参数中指定持续时间。获取跟踪文件后,使用 go 工具跟踪命令调查跟踪
full goroutine stack dump:dump 所有协程的栈信息
蓝色文字前的数字描述了该操作的“具体数量”
蓝色超链接是可以点击的,比如点击:goroutine
跳转到了http://localhost:6060/debug/pprof/goroutine?debug=1链接,这里展示了goroutine的跟踪信息。
在这里插入图片描述
http://localhost:6060/debug/pprof/goroutine?debug=1中结尾的?debug=1去掉,即可下载当前goroutine的跟踪信息到文件,可用具体的分析工具对其进行分析。

go tool pprof 分析工具

-h 选项可以查看帮助信息。
该工具可以静态分析数据文件,也可以动态的网络流分析。
可以看出pprof <format> [options] [binary] <source> ... pprof 其后可跟一个source,这个source可以是从浏览器下载(curl 也可以获取,或使用runtime/pprof生成)的数据问文件,也可以是一个链接

...\example>go tool pprof -h                                       
usage:Produce output in the specified format.                                  pprof <format> [options] [binary] <source> ...Omit the format to get an interactive shell whose commands can be used   
to generate various views of a profilepprof [options] [binary] <source> ...Omit the format and provide the "-http" flag to get an interactive web   
interface at the specified host:port that can be used to navigate through
various views of a profile.                                              pprof -http [host]:[port] [options] [binary] <source> ...Details:Output formats (select at most one):-callgrind       Outputs a graph in callgrind format-comments        Output all profile comments-disasm          Output assembly listings annotated with samples-dot             Outputs a graph in DOT format-eog             Visualize graph through eog-evince          Visualize graph through evince-gif             Outputs a graph image in GIF format-gv              Visualize graph through gv-kcachegrind     Visualize report in KCachegrind-list            Output annotated source for functions matching regexp-pdf             Outputs a graph in PDF format-peek            Output callers/callees of functions matching regexp-png             Outputs a graph image in PNG format-proto           Outputs the profile in compressed protobuf format-ps              Outputs a graph in PS format-raw             Outputs a text representation of the raw profile-svg             Outputs a graph in SVG format-tags            Outputs all tags in the profile-text            Outputs top entries in text form-top             Outputs top entries in text form-topproto        Outputs top entries in compressed protobuf format-traces          Outputs all profile samples in text form-tree            Outputs a text rendering of call graph-web             Visualize graph through web browser-weblist         Display annotated source in a web browserOptions:-call_tree       Create a context-sensitive call tree-compact_labels  Show minimal headers-divide_by       Ratio to divide all samples before visualization-drop_negative   Ignore negative differences-edgefraction    Hide edges below <f>*total-focus           Restricts to samples going through a node matching regexp-hide            Skips nodes matching regexp-ignore          Skips paths going through any nodes matching regexp-intel_syntax    Show assembly in Intel syntax-mean            Average sample value over first value (count)-nodecount       Max number of nodes to show-nodefraction    Hide nodes below <f>*total-noinlines       Ignore inlines.-normalize       Scales profile based on the base profile.-output          Output filename for file-based outputs-prune_from      Drops any functions below the matched frame.-relative_percentages Show percentages relative to focused subgraph-sample_index    Sample value to report (0-based index or name)-show            Only show nodes matching regexp-show_from       Drops functions above the highest matched frame.-source_path     Search path for source files-tagfocus        Restricts to samples with tags in range or matched by regexp-taghide         Skip tags matching this regexp-tagignore       Discard samples with tags in range or matched by regexp-tagleaf         Adds pseudo stack frames for labels key/value pairs at the callstack leaf.-tagroot         Adds pseudo stack frames for labels key/value pairs at the callstack root.-tagshow         Only consider tags matching this regexp-trim            Honor nodefraction/edgefraction/nodecount defaults-trim_path       Path to trim from source paths before search-unit            Measurement units to displayOption groups (only set one per group):granularity-functions       Aggregate at the function level.-filefunctions   Aggregate at the function level.-files           Aggregate at the file level.-lines           Aggregate at the source code line level.-addresses       Aggregate at the address level.sort-cum             Sort entries based on cumulative weight-flat            Sort entries based on own weightSource options:-seconds              Duration for time-based profile collection-timeout              Timeout in seconds for profile collection-buildid              Override build id for main binary-add_comment          Free-form annotation to add to the profileDisplayed on some reports or with pprof -comments-diff_base source     Source of base profile for comparison-base source          Source of base profile for profile subtractionprofile.pb.gz         Profile in compressed protobuf formatlegacy_profile        Profile in legacy pprof formathttp://host/profile   URL for profile handler to retrieve-symbolize=           Controls source of symbol informationnone                  Do not attempt symbolizationlocal                 Examine only local binariesfastlocal             Only get function names from local binariesremote                Do not examine local binariesforce                 Force re-symbolizationBinary                  Local path or build id of binary for symbolization-tls_cert             TLS client certificate file for fetching profile and symbols-tls_key              TLS private key file for fetching profile and symbols-tls_ca               TLS CA certs file for fetching profile and symbolsMisc options:-http              Provide web interface at host:port.Host is optional and 'localhost' by default.Port is optional and a randomly available port by default.-no_browser        Skip opening a browser for the interactive web UI.-tools             Search path for object toolsLegacy convenience options:-inuse_space           Same as -sample_index=inuse_space-inuse_objects         Same as -sample_index=inuse_objects-alloc_space           Same as -sample_index=alloc_space-alloc_objects         Same as -sample_index=alloc_objects-total_delay           Same as -sample_index=delay-contentions           Same as -sample_index=contentions-mean_delay            Same as -mean -sample_index=delayEnvironment Variables:PPROF_TMPDIR       Location for saved profiles (default $HOME/pprof)PPROF_TOOLS        Search path for object-level toolsPPROF_BINARY_PATH  Search path for local binary filesdefault: $HOME/pprof/binariessearches $name, $path, $buildid/$name, $path/$buildid* On Windows, %USERPROFILE% is used instead of $HOME

使用go tool pprof http://localhost:6060/debug/pprof/allocs分析当前应用内存信息。

...\Desktop\example>go tool pprof http://localhost:6060/debug/pprof/allocs   
Fetching profile over HTTP from http://localhost:6060/debug/pprof/allocs
Saved profile in 
...\pprof\pprof.alloc_objects.alloc_space.inuse_objects.inuse_space.002.pb.gz
Type: alloc_space
Time: Jan 22, 2023 at 4:28pm (CST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof)

可以发现,该命令行是交互式的。
输入help 可以查看其支持的所有子命令。

(pprof) helpCommands:callgrind        Outputs a graph in callgrind formatcomments         Output all profile commentsdisasm           Output assembly listings annotated with samplesdot              Outputs a graph in DOT formateog              Visualize graph through eogevince           Visualize graph through evincegif              Outputs a graph image in GIF formatgv               Visualize graph through gvkcachegrind      Visualize report in KCachegrindlist             Output annotated source for functions matching regexppdf              Outputs a graph in PDF formatpeek             Output callers/callees of functions matching regexppng              Outputs a graph image in PNG formatproto            Outputs the profile in compressed protobuf formatps               Outputs a graph in PS formatraw              Outputs a text representation of the raw profilesvg              Outputs a graph in SVG formattags             Outputs all tags in the profiletext             Outputs top entries in text formtop              Outputs top entries in text formtopproto         Outputs top entries in compressed protobuf formattraces           Outputs all profile samples in text formtree             Outputs a text rendering of call graphweb              Visualize graph through web browserweblist          Display annotated source in a web browsero/options        List options and their current valuesq/quit/exit/^D   Exit pprofOptions:call_tree        Create a context-sensitive call treecompact_labels   Show minimal headersdivide_by        Ratio to divide all samples before visualizationdrop_negative    Ignore negative differencesedgefraction     Hide edges below <f>*totalfocus            Restricts to samples going through a node matching regexphide             Skips nodes matching regexpignore           Skips paths going through any nodes matching regexpintel_syntax     Show assembly in Intel syntaxmean             Average sample value over first value (count)nodecount        Max number of nodes to shownodefraction     Hide nodes below <f>*totalnoinlines        Ignore inlines.normalize        Scales profile based on the base profile.output           Output filename for file-based outputsprune_from       Drops any functions below the matched frame.relative_percentages Show percentages relative to focused subgraphsample_index     Sample value to report (0-based index or name)show             Only show nodes matching regexpshow_from        Drops functions above the highest matched frame.source_path      Search path for source filestagfocus         Restricts to samples with tags in range or matched by regexptaghide          Skip tags matching this regexptagignore        Discard samples with tags in range or matched by regexptagleaf          Adds pseudo stack frames for labels key/value pairs at the callstack leaf.tagroot          Adds pseudo stack frames for labels key/value pairs at the callstack root.tagshow          Only consider tags matching this regexptrim             Honor nodefraction/edgefraction/nodecount defaultstrim_path        Path to trim from source paths before searchunit             Measurement units to displayOption groups (only set one per group):granularityfunctions        Aggregate at the function level.filefunctions    Aggregate at the function level.files            Aggregate at the file level.lines            Aggregate at the source code line level.addresses        Aggregate at the address level.sortcum              Sort entries based on cumulative weightflat             Sort entries based on own weight:   Clear focus/ignore/hide/tagfocus/tagignoretype "help <cmd|option>" for more information
(pprof)

使用top,就可以查看所有内存占用信息

(pprof) top
Showing nodes accounting for 46504.27kB, 100% of 46504.27kB total
Showing top 10 nodes out of 34flat  flat%   sum%        cum   cum%
42900.35kB 92.25% 92.25% 42900.35kB 92.25%  main.randFunction.func1902.59kB  1.94% 94.19%  1553.21kB  3.34%  compress/flate.NewWriter650.62kB  1.40% 95.59%   650.62kB  1.40%  compress/flate.(*compressor).init514kB  1.11% 96.70%      514kB  1.11%  bufio.NewWriterSize (inline)512.50kB  1.10% 97.80%   512.50kB  1.10%  runtime.allocm512.20kB  1.10% 98.90%   512.20kB  1.10%  runtime.malg512.01kB  1.10%   100%   512.01kB  1.10%  runtime/pprof.printCountProfile.func10     0%   100%  1553.21kB  3.34%  compress/gzip.(*Writer).Write0     0%   100%  2065.22kB  4.44%  net/http.(*ServeMux).ServeHTTP0     0%   100%  2579.22kB  5.55%  net/http.(*conn).serve
(pprof)

输入top n(1个有效的数字),可以查看前n项内存分配信息

(pprof) top 3
Showing nodes accounting for 44453.56kB, 95.59% of 46504.27kB total
Showing top 3 nodes out of 34flat  flat%   sum%        cum   cum%
42900.35kB 92.25% 92.25% 42900.35kB 92.25%  main.randFunction.func1902.59kB  1.94% 94.19%  1553.21kB  3.34%  compress/flate.NewWriter650.62kB  1.40% 95.59%   650.62kB  1.40%  compress/flate.(*compressor).init
(pprof)

可见,上边的信息最后一列是具体的包名.函数名
使用 list 包名.函数名 可以查看该函数具体内存分配信息

(pprof) list main.randFunction.func1
Total: 45.41MB
ROUTINE ======================== main.randFunction.func1 in 
...\example\main\main.go41.89MB    41.89MB (flat, cum) 92.25% of Total.          .     28:   rand.Seed(time.Now().UnixNano()).          .     29:   for i := 0; i < 100; i++ {.          .     30:           wg.Add(1).          .     31:           go func() {.          .     32:                   defer wg.Done()41.89MB    41.89MB     33:                   s := make([]int, rand.Intn(100000)).          .     34:                   time.Sleep(time.Second * time.Duration(rand.Intn(999999))).          .     35:                   _ = s.          .     36:           }().          .     37:   }.          .     38:}
(pprof)

可以看出导致大量内存分配的罪魁祸首是s := make([]int, rand.Intn(100000)),分配了41.89MB内存

   41.89MB    41.89MB     33:                   s := make([]int, rand.Intn(100000))

输入 web 可以在web端以图的方式展现信息

(pprof) web
failed to execute dot. Is Graphviz installed? Error: exec: "dot": executable file not found in %PATH%
(pprof) 

上边说的是让我们安装Graphviz这个工具。需把该命令加入环境变量的PATH中。
http://www.graphviz.org/该网站中Download即可。
安装时根据个人喜好,选择环境变量。
在这里插入图片描述
此时重启一次命令行,并重新输入

...\example>go tool pprof http://localhost:6060/debug/pprof/allocs
Fetching profile over HTTP from http://localhost:6060/debug/pprof/allocs
Saved profile in 
...\pprof\pprof.alloc_objects.alloc_space.inuse_objects.inuse_space.005.pb.gz
Type: alloc_space
Time: Jan 22, 2023 at 4:46pm (CST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) web
(pprof)

便可查看具体函数内存占用。同理还可以查看其他指标的信息。
在这里插入图片描述
节点颜色:
红色:累计值cum为正,并且很大
绿色:累计值cum为负,并且很大
灰色:cum可以忽略不计
节点字体大小:
较大的字体表示当前值较大
较小的字体表示当前值较小
边框颜色:
红色:当前值较大并且为正
绿色:当前值较小并未为负
灰色:当前值接近0
箭头大小:
箭头越粗:当前路径消耗的资源越多
箭头越细:当前路径消耗的资源越少
箭头线性:
虚线:两个节点间某些节点可以被忽略,为间接调用
实现:两个节点间为直接调用

在url中可以使用second指定采样的时间:http://localhost:6060/debug/pprof/trace?seconds=5,采样时间为5秒。

使用tree命令可以查找到函数的调用链

(pprof) tree
Showing nodes accounting for 44816.50kB, 100% of 44816.50kB total
----------------------------------------------------------+-------------flat  flat%   sum%        cum   cum%   calls calls% + context----------------------------------------------------------+-------------
43791.80kB 97.71% 97.71% 43791.80kB 97.71%                | main.randFunction.func1
----------------------------------------------------------+-------------512.50kB   100% |   runtime.newm512.50kB  1.14% 98.86%   512.50kB  1.14%                | runtime.allocm
----------------------------------------------------------+-------------512.20kB   100% |   runtime.newproc1512.20kB  1.14%   100%   512.20kB  1.14%                | runtime.malg
----------------------------------------------------------+-------------0     0%   100%   512.50kB  1.14%                | runtime.mstart512.50kB   100% |   runtime.mstart0
----------------------------------------------------------+-------------512.50kB   100% |   runtime.mstart0     0%   100%   512.50kB  1.14%                | runtime.mstart0512.50kB   100% |   runtime.mstart1
----------------------------------------------------------+-------------512.50kB   100% |   runtime.mstart00     0%   100%   512.50kB  1.14%                | runtime.mstart1512.50kB   100% |   runtime.schedule
----------------------------------------------------------+-------------512.50kB   100% |   runtime.startm0     0%   100%   512.50kB  1.14%                | runtime.newm512.50kB   100% |   runtime.allocm
----------------------------------------------------------+-------------512.20kB   100% |   runtime.systemstack0     0%   100%   512.20kB  1.14%                | runtime.newproc.func1512.20kB   100% |   runtime.newproc1
----------------------------------------------------------+-------------512.20kB   100% |   runtime.newproc.func10     0%   100%   512.20kB  1.14%                | runtime.newproc1512.20kB   100% |   runtime.malg
----------------------------------------------------------+-------------512.50kB   100% |   runtime.schedule0     0%   100%   512.50kB  1.14%                | runtime.resetspinning512.50kB   100% |   runtime.wakep
----------------------------------------------------------+-------------512.50kB   100% |   runtime.mstart10     0%   100%   512.50kB  1.14%                | runtime.schedule512.50kB   100% |   runtime.resetspinning
----------------------------------------------------------+-------------512.50kB   100% |   runtime.wakep0     0%   100%   512.50kB  1.14%                | runtime.startm512.50kB   100% |   runtime.newm
----------------------------------------------------------+-------------0     0%   100%   512.20kB  1.14%                | runtime.systemstack512.20kB   100% |   runtime.newproc.func1
----------------------------------------------------------+-------------512.50kB   100% |   runtime.resetspinning0     0%   100%   512.50kB  1.14%                | runtime.wakep512.50kB   100% |   runtime.startm
----------------------------------------------------------+-------------
(pprof)

tree runtime.newm

(pprof) tree runtime.newm
Active filters:focus=runtime.newm
Showing nodes accounting for 512.50kB, 1.14% of 44816.50kB total
----------------------------------------------------------+-------------flat  flat%   sum%        cum   cum%   calls calls% + context
----------------------------------------------------------+-------------512.50kB   100% |   runtime.newm512.50kB  1.14%  1.14%   512.50kB  1.14%                | runtime.allocm
----------------------------------------------------------+-------------0     0%  1.14%   512.50kB  1.14%                | runtime.mstart512.50kB   100% |   runtime.mstart0
----------------------------------------------------------+-------------512.50kB   100% |   runtime.mstart0     0%  1.14%   512.50kB  1.14%                | runtime.mstart0512.50kB   100% |   runtime.mstart1
----------------------------------------------------------+-------------512.50kB   100% |   runtime.mstart00     0%  1.14%   512.50kB  1.14%                | runtime.mstart1512.50kB   100% |   runtime.schedule
----------------------------------------------------------+-------------512.50kB   100% |   runtime.startm0     0%  1.14%   512.50kB  1.14%                | runtime.newm512.50kB   100% |   runtime.allocm
----------------------------------------------------------+-------------512.50kB   100% |   runtime.schedule0     0%  1.14%   512.50kB  1.14%                | runtime.resetspinning512.50kB   100% |   runtime.wakep
----------------------------------------------------------+-------------512.50kB   100% |   runtime.mstart10     0%  1.14%   512.50kB  1.14%                | runtime.schedule512.50kB   100% |   runtime.resetspinning
----------------------------------------------------------+-------------512.50kB   100% |   runtime.wakep0     0%  1.14%   512.50kB  1.14%                | runtime.startm512.50kB   100% |   runtime.newm
----------------------------------------------------------+-------------512.50kB   100% |   runtime.resetspinning0     0%  1.14%   512.50kB  1.14%                | runtime.wakep512.50kB   100% |   runtime.startm
----------------------------------------------------------+-------------
(pprof)

火焰图

go tool 也内置了火焰图分析工具。
使用
go tool pprof -http :6061 http://localhost:6060/debug/pprof/profile?second=30
在收集30秒信息后会自动打开浏览器,显示火焰图。
在这里插入图片描述
在VIEW中选择Flame Graph。查看火焰图。
在这里插入图片描述
火焰图说明:
root代表程序的开始,单击每个函数可以查看具体信息。
其他框分别代表一个函数,每一层的函数都为平级的,下一层函数被上一层函数所调用。
函数调用栈越长,火焰越明亮。
框越长,颜色越深,代表该函数CPU时间占用越久。

trace 事件跟踪的简单使用

trace可以获取更加详细的调用链信息。
获取30秒trace信息:http://localhost:6060/debug/pprof/trace?second=30
通过浏览器下载该文件。
切换到文件所在路径,执行go tool trace trace最后的trace为刚刚下载的文件名。
程序自动打开了浏览器:
在这里插入图片描述
点击View trace,查看trace信息:
在这里插入图片描述

Reference
《Go 语言底层原理剖析》


http://www.ppmy.cn/news/18744.html

相关文章

7628 uboot启动流程

7628 uboot启动流程图 uboot启动过程如图&#xff1a; Created with Raphal 2.3.0开始OP1:第1次搬头部代码到内存地址1OP2:检查内存地址1中的厂商自定义头部结构确认厂商头完整并获取实际大小&#xff1f;OP3:根据大小第2次搬完整代码到内存地址1OP4:跳到内存地址1头大小偏移量…

火鸟出击指标

主图叠加指标} {火鸟出击} AM1:C/REF(C,1)>1.095 AND CH; H1:(REF(C,1)/REF(C,2)<0.96) OR (REF(O,1)/REF(C,1)>1.05); H2:C/REF(C,1)>1.041 AND C>REF(O,1); XGH1:H1 AND H2; XGY:XGH1; ABXG:COUNT(REF(AM1,3),10) AND XGY; DRAWICON(ABXG,L0.98,24); DRAWTEXT…

树状数组(Binary Indexed Tree (B.I.T))

树状数组 树状数组 (Binary Indexed Tree(B.I.T), Fenwick Tree) 是一个查询和修改复杂度都为 log(n) 的数据结构。 「前缀和查询」与「单点更新」 直接前驱&#xff1a;c[i] 的直接前驱为 c[i - lowbid(i)]&#xff0c;即 c[i] 左侧紧邻的子树的根。 直接后继&#xff1a;c[i…

promise对象

ES6 Promise 对象 1. Promise 的含义 Promise 是异步编程的一种解决方案&#xff0c;比传统的解决方案——回调函数和事件——更合理和更强大。它由社区最早提出和实现&#xff0c;ES6 将其写进了语言标准&#xff0c;统一了用法&#xff0c;原生提供了 Promise 对象。 所谓…

论文投稿指南——中文核心期刊推荐(水利工程)

【前言】 🚀 想发论文怎么办?手把手教你论文如何投稿!那么,首先要搞懂投稿目标——论文期刊 🎄 在期刊论文的分布中,存在一种普遍现象:即对于某一特定的学科或专业来说,少数期刊所含的相关情报量很大,而多数期刊的情报量却很少;也就是说,世界上大量的科学论文集中…

String 与 StringBuffer 与 StringBuilder 各自的妙用

String 与 StringBuffer 与 StringBuilder 各自的妙用 每博一文案 我从未见过&#xff0c;一个早起&#xff0c;勤奋&#xff0c;谨慎&#xff0c;诚实的人&#xff0c;抱怨命运不好的。 最完美的状态&#xff0c;不是你从不失误&#xff0c;而是你从没放弃成长。没人能把你变…

第二章 IOC

1.IOC底层原理*什么是IOC&#xff1a;控制反转&#xff0c;把对象创建和对象之间的调用过程交给Spring进行管理*使用IOC的目的&#xff1a;降低耦合度*IOC底层原理&#xff1a;xml解析工厂模式反射*IOC思想基于IOC容器完成&#xff0c;IOC容器底层就是对象工厂*Spring提供IOC容…

【Python语言基础】——Python Lambda

Python语言基础——Python Lambda 文章目录 Python语言基础——Python Lambda一、Python Lambda一、Python Lambda lambda 函数是一种小的匿名函数。 lambda 函数可接受任意数量的参数,但只能有一个表达式。 语法 lambda arguments : expression 执行表达式并返回结果: 实例…