Starting with Golang profiling!!

What is profiling?

“It is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls. Most commonly, profiling information serves to aid program optimization, and more specifically, performance engineering”

https://en.wikipedia.org/wiki/Profiling_(computer_programming)

Profiling Go programs is a three-step process:

  • Capturing profiling data
  • Visualizing the captured data
  • Analyzing the captured data

A Profile is a collection of stack traces showing the call sequences that led to instances of a particular event, such as allocation.

Profiling

Go provides multiple ways to do profiling.

runtime/pprof

The runtime/pprof package lets you profile CPU and memory usage. In addition to the predefined profiles, you can also define your own custom profiles.

pprof.StartCPUProfile(f) and pprof.WriteHeapProfile(f) let you add profiling support to standalone programs.

https://pkg.go.dev/runtime/pprof
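
As a rough illustration of a custom profile (the profile name, the conn type, and the helper functions below are made up for this example), pprof.NewProfile can track any resource your program acquires and releases:

import (
	"os"
	"runtime/pprof"
)

// a custom profile tracking open connections (the name is illustrative)
var connProfile = pprof.NewProfile("example.com/open-connections")

type conn struct{}

func open() *conn {
	c := &conn{}
	connProfile.Add(c, 1) // remember the stack trace that acquired c
	return c
}

func (c *conn) close() {
	connProfile.Remove(c) // forget c once it is released
}

func dump() {
	// write the custom profile in the same format as the built-in ones
	connProfile.WriteTo(os.Stdout, 1)
}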

Enabling CPU/heap profiling in your program is as simple as adding the snippet below to your main():

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

func main() {

	// create a file to save the CPU profile
	fcpu, err := os.Create("cpu.profile")
	if err != nil {
		log.Fatal(err)
	}
	defer fcpu.Close()

	// start the CPU profiler and stop it on function exit
	if err := pprof.StartCPUProfile(fcpu); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	.....

	// create a file to save the heap profile
	fmem, err := os.Create("heap.profile")
	if err != nil {
		log.Fatal(err)
	}
	defer fmem.Close()

	runtime.GC() // force a GC to get up-to-date statistics
	if err := pprof.WriteHeapProfile(fmem); err != nil {
		log.Fatal(err)
	}
}

net/http/pprof

The net/http/pprof package serves runtime profiling data via its HTTP server, in the format expected by the pprof visualization tool.

The pprof package makes a number of debug endpoints available by default:

goroutine    - stack traces of all current goroutines
heap         - a sampling of all heap allocations
threadcreate - stack traces that led to the creation of new OS threads
block        - stack traces that led to blocking on synchronization primitives
mutex        - stack traces for holders of contended mutexes

https://pkg.go.dev/net/http/pprof
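
Note that the block and mutex profiles listed above collect no samples unless their sampling rates are set somewhere in your program; a minimal sketch (the rates chosen here are only illustrative):

import "runtime"

func init() {
	// record every blocking event (a larger rate samples less often)
	runtime.SetBlockProfileRate(1)

	// report roughly 1 out of every 5 mutex contention events
	runtime.SetMutexProfileFraction(5)
}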

Enable profiling by adding the following snippet to your main()

import (
	...
	"log"
	"net/http"
	_ "net/http/pprof"
)

func main() {

	// start the profiling server
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	......
}

Note that if your program already runs an HTTP server, you don’t need to start another one at all: the pprof package registers its handlers with http.DefaultServeMux.
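
If your server uses its own ServeMux rather than the default one, the handlers can be wired up explicitly; a minimal sketch, assuming a mux created with http.NewServeMux:

import (
	"log"
	"net/http"
	"net/http/pprof"
)

func main() {
	mux := http.NewServeMux()

	// register the pprof handlers on the custom mux
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)

	log.Println(http.ListenAndServe("localhost:6060", mux))
}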

runtime

The runtime package lets you collect details at a much finer granularity. It interacts with Go’s runtime system and, as such, lets you control runtime parameters; e.g., with the SetMaxStack function (in the companion runtime/debug package) one can limit the maximum amount of stack space a single goroutine may use.

Although it is much easier to create profiles using the pprof packages, runtime allows for greater control.

import (
	"log"
	"runtime"
)

func main() {
	var mem runtime.MemStats

	// populate mem with the current memory statistics
	runtime.ReadMemStats(&mem)

	log.Println(mem.Alloc)      // bytes of allocated heap objects
	log.Println(mem.TotalAlloc) // cumulative bytes allocated for heap objects
	log.Println(mem.HeapAlloc)  // same as Alloc
	log.Println(mem.HeapSys)    // bytes of heap memory obtained from the OS
	.....
}
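
As a sketch of the tuning knobs mentioned above (the values chosen are purely illustrative):

import (
	"runtime"
	"runtime/debug"
)

func init() {
	// use all available CPU cores (this has been the default since Go 1.5)
	runtime.GOMAXPROCS(runtime.NumCPU())

	// cap each goroutine's stack at 32 MB
	debug.SetMaxStack(32 << 20)
}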

pkg/profile

Dave Cheney has developed a package named profile, which is probably the easiest one to try.

https://github.com/pkg/profile

Adding profiling code is as simple as adding the required line from the excerpt below.

// CPUProfile: Enable cpu profiling
defer profile.Start(profile.CPUProfile).Stop()

// GoroutineProfile: Enable goroutine profiling
defer profile.Start(profile.GoroutineProfile).Stop()

// BlockProfile: Enable block profiling
defer profile.Start(profile.BlockProfile).Stop()

// ThreadcreationProfile: Enable thread creation profiling
defer profile.Start(profile.ThreadcreationProfile).Stop()

// MemProfileHeap: Enable heap profiling
defer profile.Start(profile.MemProfileHeap).Stop()

// MemProfileAllocs: Enable profiling of allocations
defer profile.Start(profile.MemProfileAllocs).Stop()

// MutexProfile: Enable mutex profiling
defer profile.Start(profile.MutexProfile).Stop()
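
For instance, a minimal sketch of a program that writes its CPU profile to the current directory (profile.ProfilePath is optional; without it the file goes to a temporary directory whose path is logged at startup):

import "github.com/pkg/profile"

func main() {
	// write cpu.pprof to the current directory and stop profiling on exit
	defer profile.Start(profile.CPUProfile, profile.ProfilePath(".")).Stop()

	// ... rest of the program ...
}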

Visualizing

Having captured the profiles, the next task is to present them in a comprehensible format.

Here too, Go provides several ways to do the job.

Local Analysis

go tool pprof can be used to analyze the generated profile files.

It provides multiple ways to visualize your output, ranging from an interactive shell to a web interface.

Example of an interactive session on a CPU profile:
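
A minimal session on the cpu.profile file generated earlier might look like this (the actual listing depends on your program); top prints the hottest functions and quit exits the shell:

go tool pprof cpu.profile
(pprof) top
(pprof) quit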

Opening a graphical interface:

Note: You need to have graphviz (https://graphviz.org/) installed for web visualizations.
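
With graphviz in place, the web command inside the interactive shell renders the call graph in a browser; alternatively, the pprof tool bundled with recent Go releases can serve a full web UI directly:

go tool pprof -http=localhost:8080 cpu.profile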

Remote Analysis

If you are running a pprof server, the debug endpoints can be used to monitor your program remotely. They are available at:

http://<ip>:6060/debug/pprof/

Each of these endpoints divulges further details about the state of the program. For example, the heap endpoint reports a sampling of the program’s heap allocations.

Through these profiles, we can troubleshoot the problem areas of our code, be it memory allocations, CPU usage, latencies, etc.
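
The endpoints can also be fed straight to go tool pprof; for example, the profile endpoint collects a CPU profile for a configurable duration (the host, port, and 30-second window here are arbitrary):

go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30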

Analyzing

Now that we are ready with our profiles and their different visualization tools, let’s try to make some sense of the visual output.

Interactive

Run “go tool pprof <profile>” to enter the interactive mode.

The top command sorts the output in descending order of CPU time.

This allows us to pinpoint functions that are using most of the CPU.
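
A couple of useful variations at the (pprof) prompt: top10 limits the listing to the ten most expensive entries, and top -cum sorts by cumulative rather than flat time.

(pprof) top10
(pprof) top -cum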

Web Interface

If you are running a pprof server, you can connect directly to the debug endpoints. For example, to view the total memory allocations, run:

go tool pprof http://<ip>:<port>/debug/pprof/allocs

Moving On…

Once you have identified a function of interest, you can run the list command to get line-by-line detail on it.
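
For example, at the (pprof) prompt (myFunc here is just a placeholder for whichever function showed up in top):

(pprof) list myFunc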

pprof can also display the information in various graph styles. For example, the call graph view of the memory allocation results above is produced by giving the web command instead of the top command.

Interpreting the call graph (from the pprof README: https://github.com/google/pprof)

  • Node Color:
    • large positive cum values are red.
    • large negative cum values are green.
    • cum values close to zero are grey.
  • Node Font Size:
    • larger font size means larger absolute flat values.
    • smaller font size means smaller absolute flat values.
  • Edge Weight:
    • thicker edges indicate more resources were used along that path.
    • thinner edges indicate fewer resources were used along that path.
  • Edge Color:
    • large positive values are red.
    • large negative values are green.
    • values close to zero are grey.
  • Dashed Edges: some locations between the two connected locations were removed.
  • Solid Edges: one location directly calls the other.
  • “(inline)” Edge Marker: the call has been inlined into the caller.

Hopefully this gives you a good starting point for profiling in Go.

A very good read for understanding the profiling and optimization approach:

https://go.dev/blog/pprof
