What is profiling?
“It is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls. Most commonly, profiling information serves to aid program optimization, and more specifically, performance engineering”
https://en.wikipedia.org/wiki/Profiling_(computer_programming)
Profiling Go programs is a three-step process:
- Capturing profiling data
- Visualizing the captured data
- Analyzing the captured data
A profile is a collection of stack traces showing the call sequences that led to instances of a particular event, such as an allocation.
Profiling
Go provides multiple ways to do profiling.
runtime/pprof
The runtime/pprof package allows the user to profile CPU and memory usage. In addition to the predefined profiles, you can define your own custom profiles. pprof.StartCPUProfile(f) and pprof.WriteHeapProfile(f) make it possible to add profiling support to standalone programs.
https://pkg.go.dev/runtime/pprof
Enabling CPU/heap profiling in your program is as simple as adding the snippet below to your main():
import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

func main() {
	// create a file to save the CPU profile
	fcpu, err := os.Create("cpu.profile")
	if err != nil {
		log.Fatal(err)
	}
	defer fcpu.Close()
	// start the CPU profiler
	if err := pprof.StartCPUProfile(fcpu); err != nil {
		log.Fatal(err)
	}
	// stop the CPU profiler on function exit
	defer pprof.StopCPUProfile()
	.....
	// create a file to save the heap profile
	fmem, err := os.Create("heap.profile")
	if err != nil {
		log.Fatal(err)
	}
	defer fmem.Close()
	runtime.GC() // run the GC first to get up-to-date statistics
	if err := pprof.WriteHeapProfile(fmem); err != nil {
		log.Fatal(err)
	}
}
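Beyond the built-in CPU and heap profiles, runtime/pprof also lets you register custom profiles via pprof.NewProfile. Below is a minimal sketch; the profile name open_connections and the conn type are made up for illustration:

```go
package main

import (
	"log"
	"os"
	"runtime/pprof"
)

// a custom profile tracking open connections (name chosen for illustration)
var connProfile = pprof.NewProfile("open_connections")

type conn struct{ id int }

func open(id int) *conn {
	c := &conn{id: id}
	// record the stack trace that created this connection
	connProfile.Add(c, 1)
	return c
}

func (c *conn) close() {
	// drop the entry when the connection is closed
	connProfile.Remove(c)
}

func main() {
	c := open(1)
	defer c.close()

	// write the custom profile to a file, like any built-in profile
	f, err := os.Create("connections.profile")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	connProfile.WriteTo(f, 0)
}
```

The resulting file can then be inspected with go tool pprof connections.profile, just like the built-in profiles.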
net/http/pprof
The net/http/pprof package serves runtime profiling data via its HTTP server, in the format expected by the pprof visualization tool.
The package makes a number of debug endpoints available by default:
- goroutine - stack traces of all current goroutines
- heap - a sampling of all heap allocations
- threadcreate - stack traces that led to the creation of new OS threads
- block - stack traces that led to blocking on synchronization primitives
- mutex - stack traces for holders of contended mutexes
https://pkg.go.dev/net/http/pprof
Enable profiling by adding the following snippet to your main()
import (
	...
	"log"
	"net/http"
	_ "net/http/pprof"
)

func main() {
	// start the profiling server in the background
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()
	......
}
Note that if your program already runs an HTTP server, you don’t need to start another one: simply importing the pprof package registers its handlers with http.DefaultServeMux.
runtime
The runtime package allows the user to collect details at a much finer granularity. It interacts with Go’s runtime system and lets you tune runtime parameters; for example, with the SetMaxStack function (in runtime/debug) one can limit the maximum amount of stack space used by a single goroutine.
Although it is much easier to create profiles using the pprof packages, runtime allows for greater control.
import (
	"log"
	"runtime"
)

func main() {
	var mem runtime.MemStats
	runtime.ReadMemStats(&mem)
	log.Println(mem.Alloc)      // bytes of currently allocated heap objects
	log.Println(mem.TotalAlloc) // cumulative bytes allocated (never decreases)
	log.Println(mem.HeapAlloc)  // same as Alloc
	log.Println(mem.HeapSys)    // heap memory obtained from the OS
	.....
}
pkg/profile
Dave Cheney has developed a package named profile, which is perhaps the easiest to try.
https://github.com/pkg/profile
Adding profiling code is as simple as adding the relevant line from the excerpt below. Note that pkg/profile allows only one kind of profile to be active at a time.
// CPUProfile: Enable cpu profiling
defer profile.Start(profile.CPUProfile).Stop()
// GoroutineProfile: Enable goroutine profiling
defer profile.Start(profile.GoroutineProfile).Stop()
// BlockProfile: Enable block profiling
defer profile.Start(profile.BlockProfile).Stop()
// ThreadcreationProfile: Enable thread creation profiling
defer profile.Start(profile.ThreadcreationProfile).Stop()
// MemProfileHeap: Enable heap profiling
defer profile.Start(profile.MemProfileHeap).Stop()
// MemProfileAllocs: Enable profiling of allocations
defer profile.Start(profile.MemProfileAllocs).Stop()
// MutexProfile: Enable mutex profiling
defer profile.Start(profile.MutexProfile).Stop()
Visualizing
Having captured the profiles, the next task is to present them in a comprehensible format.
Here, too, Go provides several ways to do the job.
Local Analysis
go tool pprof can be used to analyze the generated profile files.
It provides multiple ways to visualize the output, ranging from an interactive shell to a web interface. For example, go tool pprof cpu.profile drops you into an interactive session on the CPU profile, while go tool pprof -http=localhost:8080 cpu.profile opens a graphical interface in the browser.
Remote Analysis
If you are running a pprof server, the debug endpoints can be used to monitor your program remotely. They are available at:
http://<ip>:6060/debug/pprof/
Each endpoint divulges further details about the state of the program; for example, the heap endpoint shows a sampling of the live heap allocations.
Through these profiles, we can troubleshoot the problem areas of our code, be it memory allocations, CPU usage, latencies, etc.
Analyzing
Now that we are ready with our profiles and their visualization tools, let’s try to make some sense of the visual outputs.
Interactive
Run “go tool pprof <profile>” to enter the interactive mode.
The top command sorts the output in descending order of resource usage (CPU time for a CPU profile).
This allows us to pinpoint the functions that are using most of the CPU.
Web Interface
If you are running a pprof server, you can connect directly to the debug endpoints. For example, to view the total memory allocations, run:
go tool pprof http://<ip>:<port>/debug/pprof/allocs
Moving On…
Once you have identified a function of interest, the list command shows its annotated, line-by-line cost.
pprof can also display the information in various graph styles. For example, issuing the web command instead of top renders the same memory allocation results as a call graph.
Interpreting the Callgraph (from the README of google/pprof on GitHub). Here “flat” is the cost incurred within a function itself, while “cum” (cumulative) also includes the cost of the functions it calls.
- Node Color:
- large positive cum values are red.
- large negative cum values are green.
- cum values close to zero are grey.
- Node Font Size:
- larger font size means larger absolute flat values.
- smaller font size means smaller absolute flat values.
- Edge Weight:
- thicker edges indicate more resources were used along that path.
- thinner edges indicate fewer resources were used along that path.
- Edge Color:
- large positive values are red.
- large negative values are green.
- values close to zero are grey.
- Dashed Edges: some locations between the two connected locations were removed.
- Solid Edges: one location directly calls the other.
- “(inline)” Edge Marker: the call has been inlined into the caller.
Hopefully this gives you a good starting point for profiling in Go.
A very good read for understanding the profiling and optimization approach: