CPU profiler
GoLand collects and visualizes CPU profiles, traces, and heap profiles. To collect all the necessary data, GoLand uses the pprof package. GoLand includes four profilers that you can run from the user interface: CPU, memory, blocking (contention), and mutex.
Profiling results help you locate performance issues, but you must implement code improvements manually. For more information, see the profiling documentation at go.dev and the description of the pprof package at pkg.go.dev.
After the analysis is complete, the profiler visualizes the results in reports.
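For orientation, a CPU profile of the kind described above can also be captured by hand with the standard runtime/pprof package. The following is a minimal sketch and does not show how GoLand works internally. The helper name profileCPU, the output file name cpu.prof, and the busy loop are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
)

// profileCPU runs work while a CPU profile is written to path.
// The helper name and signature are hypothetical.
func profileCPU(path string, work func()) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	// StartCPUProfile begins sampling; StopCPUProfile flushes the
	// profile to f. Deferred calls run in reverse order, so the
	// profile is flushed before the file is closed.
	if err := pprof.StartCPUProfile(f); err != nil {
		return err
	}
	defer pprof.StopCPUProfile()

	work()
	return nil
}

func main() {
	err := profileCPU("cpu.prof", func() {
		// Placeholder workload (assumed): burn some CPU time.
		sum := 0
		for i := 0; i < 200_000_000; i++ {
			sum += i % 7
		}
		fmt.Println("sum:", sum)
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, "profiling failed:", err)
		os.Exit(1)
	}
}
```

The resulting file can be inspected outside the IDE with go tool pprof cpu.prof.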
Before you start
Before running a profile, make sure that:
Go is installed. To install, upgrade, or configure Go, refer to GOROOT.
GoLand is installed on your machine.
The Go project you want to profile is open in the IDE.
All examples in this topic are available in a sample project on GitHub.
CPU profiling
The CPU profiler measures how much CPU time each function consumes during program execution.
Example program
The following program sorts a slice of random integers using an inefficient bubble sort algorithm:
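A minimal sketch of such a program, assuming the function name BubbleSort used later in this topic. The helper randomInts, the input size, and the value range are assumptions chosen for illustration.

```go
// main.go
package main

import (
	"fmt"
	"math/rand"
)

// BubbleSort sorts nums in place in ascending order.
// It is deliberately inefficient: the nested loops perform
// on the order of n^2 comparisons.
func BubbleSort(nums []int) {
	n := len(nums)
	for i := 0; i < n-1; i++ {
		// After pass i, the last i+1 elements are in their final place.
		for j := 0; j < n-1-i; j++ {
			if nums[j] > nums[j+1] {
				nums[j], nums[j+1] = nums[j+1], nums[j]
			}
		}
	}
}

// randomInts returns a slice of n pseudo-random integers
// (hypothetical helper).
func randomInts(n int) []int {
	nums := make([]int, n)
	for i := range nums {
		nums[i] = rand.Intn(1_000_000)
	}
	return nums
}

func main() {
	const size = 20_000 // assumed input size
	nums := randomInts(size)
	BubbleSort(nums)
	fmt.Println("smallest:", nums[0], "largest:", nums[len(nums)-1])
}
```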
You can run this program with go run main.go or by selecting the Run option from the gutter menu.

Create a test for profiling
Create a unit test that runs the sorting function:
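A minimal sketch of such a test, assuming an in-place BubbleSort(nums []int) function. The file name, the input size, and the correctness check are assumptions. The sorting function is repeated only so that the file compiles on its own.

```go
// main_test.go
package main

import (
	"math/rand"
	"sort"
	"testing"
)

// BubbleSort is repeated here only to keep this sketch self-contained.
// Omit it if the package already defines the function in main.go.
func BubbleSort(nums []int) {
	n := len(nums)
	for i := 0; i < n-1; i++ {
		for j := 0; j < n-1-i; j++ {
			if nums[j] > nums[j+1] {
				nums[j], nums[j+1] = nums[j+1], nums[j]
			}
		}
	}
}

// TestBubbleSort sorts a slice that is large enough for the CPU
// profiler to collect a useful number of samples.
func TestBubbleSort(t *testing.T) {
	const size = 30_000 // assumed input size
	nums := make([]int, size)
	for i := range nums {
		nums[i] = rand.Intn(size)
	}

	BubbleSort(nums)

	if !sort.IntsAreSorted(nums) {
		t.Fatal("BubbleSort did not sort the slice")
	}
}
```

Checking the result keeps the test meaningful as a regular unit test, not only as a profiling harness.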
Run CPU profiling
Open the _test.go file.
In the gutter next to the test function, click the Run icon.
Select Profile with CPU Profiler.

Analyze CPU profiling results
GoLand presents CPU profiling data in three views:
Flame graph: visualizes how CPU time is distributed across functions.
The Flame Graph tab shows function calls and the percentage of time each call takes to execute. Each block represents a function in the stack (a stack frame). The Y-axis shows the stack depth (bottom-up), while the X-axis represents functions sorted by CPU usage, from the most to the least resource-consuming.
When reading the flame graph, focus on the widest blocks, because they represent the functions that consume the most CPU time. Hover over any block to view detailed information.

Call tree: displays how functions call one another and how much time each call takes.
The Call Tree tab provides detailed information about the program’s call stacks sampled during profiling. It includes:
Method names
Percentage of total sample time (can be toggled to show parent call time)
Total sample count
Number of filtered calls

Method list: provides a tabular view of all functions with cumulative execution time and total CPU usage percentage.
The Method List tab lists all methods found in the profiling data, sorted by cumulative sample time. Each method entry includes a Back Traces tree and a Merged Callees tree.
The Back Traces tab displays the hierarchy of callers, showing which methods invoke the selected one.
The Merged Callees tab shows the call traces that start from the selected method. Callee List is a method list that summarizes the methods further down the call hierarchy.

In this example, most time is spent in the nested loop of BubbleSort, which indicates an O(n²) bottleneck.
Optimize and compare results
To improve performance, replace bubble sort with quicksort, which has an average time complexity of O(n log n). In the test, increase the number of random integers to 10 million so that sorting takes a measurable amount of time. Otherwise, the algorithm completes almost instantly (in around 0.01 seconds).
When you rerun the CPU profiler, the program finishes much faster and consumes significantly less CPU time, which indicates improved performance.
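A minimal sketch of a quicksort replacement, assuming the function name Quicksort used later in this topic. The partition helper, the Lomuto partition scheme with a middle-element pivot, and the input size in main are assumptions and may differ from the sample project.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Quicksort sorts nums in place in ascending order.
// Average time complexity is O(n log n).
func Quicksort(nums []int) {
	if len(nums) < 2 {
		return
	}
	p := partition(nums)
	Quicksort(nums[:p])
	Quicksort(nums[p+1:])
}

// partition (hypothetical helper) applies the Lomuto scheme.
// It picks the middle element as the pivot, moves smaller elements
// to the front, and returns the pivot's final index.
func partition(nums []int) int {
	last := len(nums) - 1
	mid := len(nums) / 2
	nums[mid], nums[last] = nums[last], nums[mid] // park the pivot at the end
	pivot := nums[last]

	i := 0
	for j := 0; j < last; j++ {
		if nums[j] < pivot {
			nums[i], nums[j] = nums[j], nums[i]
			i++
		}
	}
	nums[i], nums[last] = nums[last], nums[i] // place the pivot
	return i
}

func main() {
	const size = 1_000_000 // assumed; the test in this topic uses 10 million
	nums := make([]int, size)
	for i := range nums {
		nums[i] = rand.Int()
	}
	Quicksort(nums)
	fmt.Println("smallest:", nums[0], "largest:", nums[len(nums)-1])
}
```

This simple variant degrades toward O(n²) on inputs with many equal values, so the sketch uses the full range of rand.Int() to keep duplicates rare.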

Benchmark the optimized code
To confirm the improvement, benchmark both implementations:
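A minimal sketch of such benchmarks, assuming in-place BubbleSort and Quicksort functions. The helper benchmarkSort, the file name, the input size, and the fixed seed are assumptions. Compact copies of both sorting functions are included only so that the file compiles on its own.

```go
// sort_bench_test.go
package main

import (
	"math/rand"
	"testing"
)

// Compact copies of both implementations, included only to keep this
// sketch self-contained. Omit them if the package already defines them.
func BubbleSort(nums []int) {
	for i := 0; i < len(nums)-1; i++ {
		for j := 0; j < len(nums)-1-i; j++ {
			if nums[j] > nums[j+1] {
				nums[j], nums[j+1] = nums[j+1], nums[j]
			}
		}
	}
}

func Quicksort(nums []int) {
	if len(nums) < 2 {
		return
	}
	last := len(nums) - 1
	mid := len(nums) / 2
	nums[mid], nums[last] = nums[last], nums[mid] // pivot goes to the end
	i := 0
	for j := 0; j < last; j++ {
		if nums[j] < nums[last] {
			nums[i], nums[j] = nums[j], nums[i]
			i++
		}
	}
	nums[i], nums[last] = nums[last], nums[i]
	Quicksort(nums[:i])
	Quicksort(nums[i+1:])
}

// benchmarkSort (hypothetical helper) runs sortFn b.N times, each time
// on a fresh copy of the same unsorted input.
func benchmarkSort(b *testing.B, sortFn func([]int)) {
	const size = 5_000                 // assumed input size
	rng := rand.New(rand.NewSource(1)) // fixed seed: same input for both benchmarks
	input := make([]int, size)
	for i := range input {
		input[i] = rng.Int()
	}
	work := make([]int, size)

	b.ResetTimer() // exclude input generation from the measurement
	for i := 0; i < b.N; i++ {
		// The copy is timed too, but its cost is the same for both benchmarks.
		copy(work, input)
		sortFn(work)
	}
}

func BenchmarkBubbleSort(b *testing.B) { benchmarkSort(b, BubbleSort) }
func BenchmarkQuicksort(b *testing.B)  { benchmarkSort(b, Quicksort) }
```

Copying the input on every iteration matters: without it, every iteration after the first would sort data that is already sorted and the comparison would be skewed.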
Benchmark results show that Quicksort() completes in a fraction of the time that BubbleSort() takes.
