dotTrace Web Help

An application can be profiled in several ways. dotTrace Performance provides three profiling methods:

  • sampling
  • tracing
  • line-by-line

Sampling

What is sampling? Sampling is the technique of taking samples, where a sample is a set of call stacks captured at one moment during a profiling session. That leads to two obvious questions: (1) how long is the pause between two consecutive samples, and (2) how much time does it take to capture a sample? The answers to these questions help us estimate the accuracy of the sampling method.

dotTrace Performance captures call stacks of all existing threads within the process, sequentially without pauses. It also takes into account threads that are locked or sleeping. The time required for capturing a call stack cannot be precisely determined because it depends on the stack depth and the number of native and managed stack frames. Therefore, the time required to take a sample necessarily varies from sample to sample and depends on the number of currently running threads.

dotTrace Performance pauses between samples. A pause is the time that elapses after dotTrace finishes processing threads for the previous sample and before it starts processing them for the next sample. The length of each pause is a random value between 5 and 11 milliseconds. Random values help decrease the probability of having gaps in call stacks. During these pauses the application continues running normally.

One consequence is that, since the time between samples is at least 5 milliseconds, methods that run quickly enough may not be caught and shown in a snapshot. However, this does not prevent dotTrace from getting the correct time data. Two situations are possible. If a method is fast and is called many times, it will be caught and shown in a snapshot. If a method is fast but is called rarely, it may be omitted from a snapshot, but its time will be included in the total time of its parent. In other words, if the total time of a method is significant, it will be counted.
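The two situations above can be illustrated with a small simulation. This is only a rough model, not dotTrace's implementation: it assumes that taking a sample is instantaneous, draws each pause uniformly from 5 to 11 milliseconds, and uses two hypothetical methods (`frequent` and `rare`) whose run times are invented for the illustration.

```python
import random

def simulate_sampling(is_running, duration_ms, seed=0):
    """Return (hits, samples): how many samples landed while the method ran."""
    rng = random.Random(seed)          # seeded so the run is repeatable
    t, hits, samples = 0.0, 0, 0
    while True:
        t += rng.uniform(5.0, 11.0)    # random pause before the next sample
        if t >= duration_ms:
            break
        samples += 1
        if is_running(t):
            hits += 1
    return hits, samples

DURATION_MS = 10_000.0

def frequent(t):
    # Hypothetical fast method called constantly: runs 1 ms out of every 2 ms.
    return t % 2.0 < 1.0

def rare(t):
    # Hypothetical fast method called once: runs for 1 ms at t = 5000 ms.
    return 5000.0 <= t < 5001.0

freq_hits, samples = simulate_sampling(frequent, DURATION_MS)
rare_hits, _ = simulate_sampling(rare, DURATION_MS)

# Estimated time = share of samples that contain the method * run length.
freq_estimate_ms = freq_hits / samples * DURATION_MS
```

In this model the frequently called method appears in roughly half of the samples, so its estimated time stays close to the 5000 milliseconds it really ran. The method that is called once can appear in at most one sample, because its 1-millisecond run is shorter than the shortest pause.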

All in all, this profiling method provides time data that helps reveal problem call stacks, but it does not provide the number of function calls. Still, this method is the fastest and can be a solid first step in localizing performance problems.

Tracing

Unlike sampling, tracing revolves around a function, or more precisely, around function entry and exit.

dotTrace Performance receives a notification from the CLR when a function is entered and another when it is left, even if it is left because of an exception. The time between these two notifications is taken as the execution time of the function.
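The enter/leave idea can be tried out in any runtime that exposes such notifications. The sketch below is an analogy only: it uses Python's `sys.setprofile` hook rather than the CLR profiling interface, and the names `trace_calls`, `Box`, and `run` are hypothetical helpers introduced for the illustration.

```python
import sys
import time
from collections import Counter

def trace_calls(fn, *args):
    """Run fn(*args); return (call_counts, total_seconds) keyed by function name."""
    call_counts = Counter()
    total_seconds = Counter()
    entered_at = {}  # frame -> timestamp of its "call" notification

    def on_event(frame, event, arg):
        if event == "call":                      # function entered
            entered_at[frame] = time.perf_counter()
        elif event == "return":                  # function left
            started = entered_at.pop(frame, None)
            if started is not None:              # skip frames entered before tracing
                name = frame.f_code.co_name
                call_counts[name] += 1
                total_seconds[name] += time.perf_counter() - started

    sys.setprofile(on_event)
    try:
        fn(*args)
    finally:
        sys.setprofile(None)                     # always remove the hook
    return call_counts, total_seconds

class Box:
    def __init__(self):
        self.value = 0

    def inc(self):
        self.value += 1

def run(box, n):
    for _ in range(n):
        box.inc()

counts, seconds = trace_calls(run, Box(), 1000)
```

Here `counts["inc"]` is exactly 1000, which mirrors the main strength of tracing: call counts are exact. The recorded times include the cost of the hook itself.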

On the one hand, the snapshot contains every function that was executed and not inlined by the JIT compiler, together with detailed timing data. On the other hand, the JIT compiler generates a specific prologue and epilogue for each function, and the CLR needs extra time to execute that code. dotTrace does not measure this time and does not subtract it from the total function time, so the total time may be distorted. The distortion grows linearly with the number of function calls: the more often a function is called, the larger the distortion. And the shorter a function's execution time, the less accurate its total time can be.

For example, suppose you have a very simple function, Inc() { _value++; }, that is called millions of times. Of course, it can be optimized and will take little time anyway. However, when it runs under dotTrace Performance with the tracing method, each call adds overhead that can be much larger than the real execution time of the function. As a result, the reported total time can be higher than the time reported by the sampling method or the time taken without the profiler.
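The linear dependency can be written down as a back-of-the-envelope model. The per-call figures below (2 nanoseconds of real work and 50 nanoseconds of tracing overhead) are invented for illustration and are not measurements of dotTrace.

```python
def measured_total_ns(calls, true_ns_per_call, overhead_ns_per_call):
    """Total time a tracing profiler would report under this simple model."""
    return calls * (true_ns_per_call + overhead_ns_per_call)

def distortion_ns(calls, overhead_ns_per_call):
    """Part of the reported time that is profiler overhead; linear in calls."""
    return calls * overhead_ns_per_call

# Hypothetical numbers for a tiny function such as Inc().
CALLS = 1_000_000
TRUE_NS = 2        # assumed real cost of one call
OVERHEAD_NS = 50   # assumed enter/leave overhead per call

true_total = CALLS * TRUE_NS                               # 2_000_000 ns
reported_total = measured_total_ns(CALLS, TRUE_NS, OVERHEAD_NS)
distortion = distortion_ns(CALLS, OVERHEAD_NS)
```

With these assumed figures the function really takes 2 milliseconds in total, yet the reported total is 52 milliseconds. Doubling the number of calls doubles the distortion.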

Additional overhead may be caused by the CLR itself. The CLR performs different kinds of optimizations. Depending on the CLR version and the chosen profiling method, some optimizations may be disabled or done in a different way, so the results may differ.

On the whole, you always get the correct number of function calls, but the total function time may be inaccurate. Because tracing takes more time than sampling and may also slow down your application significantly, it is better to profile individual parts of an application or certain scenarios.

Line-by-line

This method is similar to tracing, but here the target of investigation is a statement, not a function. To profile a function line by line, dotTrace Performance requires PDB files. If you do not have the corresponding PDB files, the method behaves like tracing.

dotTrace measures the time required to execute a statement and how many times it is executed. As you can likely imagine, this method is even slower than tracing because dotTrace performs time-counting work for each statement.
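Statement-level counting can be illustrated with a per-line hook. The sketch below is an analogy that uses Python's `sys.settrace`, not dotTrace or PDB files; it counts executions only and leaves out per-statement timing. `count_line_hits` and `sum_to` are hypothetical names.

```python
import sys
from collections import Counter

def count_line_hits(fn, *args):
    """Run fn(*args); return hit counts keyed by line offset inside fn."""
    hits = Counter()
    target = fn.__code__

    def local_trace(frame, event, arg):
        if event == "line":   # fired before each line of fn executes
            hits[frame.f_lineno - target.co_firstlineno] += 1
        return local_trace

    def global_trace(frame, event, arg):
        # Trace only the function under investigation.
        return local_trace if frame.f_code is target else None

    sys.settrace(global_trace)
    try:
        fn(*args)
    finally:
        sys.settrace(None)    # always remove the hook
    return hits

def sum_to(n):
    total = 0
    for i in range(n):
        total += i
    return total

line_hits = count_line_hits(sum_to, 5)
```

Offsets are counted from the `def` line, so `line_hits[3]` is the number of times the loop body `total += i` ran. Because the hook fires for every executed line, this approach is noticeably slower than function-level hooks, which matches the cost ordering described above.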

Line-by-line is effective once you have narrowed the scope of investigation and want to concentrate on certain functions.

Taking everything into account, we can summarize the methods in the following table.

Sampling
  Pros
  • Time required to run an application under profiler does not change significantly
  • Small snapshot
  • Low memory usage
  Cons
  • Number of calls for a function is undefined
  • Not all call stacks and functions are captured

Tracing
  Pros
  • All call stacks and functions are captured, except inlined functions
  • Number of calls is defined correctly
  Cons
  • More time required to run an application under profiler
  • Snapshot may be quite large
  • Time distortion depends on the number of function calls
  • Higher memory usage

Line-by-line
  Pros
  • Possibility to study a function in detail, at statement level
  Cons
  • More time required to run an application under profiler, compared to tracing
  • Snapshot may be quite large
  • Higher memory usage compared to tracing
  • PDB files are required

To demonstrate the differences between sampling and tracing profiling methods, let's take a simple application that recursively traverses a tree. Each node of the tree contains a full file path or full directory path. During the application run we check whether a path matches a specific pattern or not.

The main point is to see the difference in application execution time or establish that there is no difference.

To calculate the real-world (wall-clock) time that passes during profiling, we have added code that gets the current date and time at the start and at the end of the run, and subtracts the start time from the end time. This time is not constant: it depends on the operating system, CPU load, and so on. The results may change from run to run, so it is better to take the average time. Consider the results below.
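A measurement of this kind might look like the sketch below. It is not the code used for the results in this section: it swaps the current date and time for Python's monotonic `time.perf_counter`, and `sample_workload` is a hypothetical stand-in for the tree traversal.

```python
import time

def average_wall_time_ms(workload, runs=5):
    """Run workload several times; return the mean wall-clock time in ms."""
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        end = time.perf_counter()
        elapsed.append((end - start) * 1000.0)  # end time minus start time
    return sum(elapsed) / len(elapsed)

def sample_workload():
    # Hypothetical stand-in for the tree traversal used in this section.
    sum(i * i for i in range(100_000))

avg_ms = average_wall_time_ms(sample_workload, runs=3)
```

Averaging over several runs smooths out the run-to-run variation caused by the operating system and CPU load.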

Conditions                      Average time (ms)   Ratio
Normal run, without profiler    8910
Under profiler, sampling        9043                1.015
Under profiler, tracing         17426               1.956

Based on these results, we can draw the following conclusions. First, program execution slows down slightly when we use the sampling method, and the time roughly doubles when we use the tracing method.
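The Ratio column is the average time under the profiler divided by the average time of the normal run. As a quick check, the ratios for the first set of measurements can be recomputed as follows (`slowdown_ratio` is a hypothetical helper name):

```python
def slowdown_ratio(profiled_ms, baseline_ms):
    """Time under the profiler relative to the normal run, to 3 decimals."""
    return round(profiled_ms / baseline_ms, 3)

BASELINE_MS = 8910  # normal run, without profiler

sampling_ratio = slowdown_ratio(9043, BASELINE_MS)   # 1.015
tracing_ratio = slowdown_ratio(17426, BASELINE_MS)   # 1.956
```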

Conditions                      Average time (ms)   Ratio
Normal run, without profiler    207159
Under profiler, sampling        209427              1.011
Under profiler, tracing         418768              1.999

Second, the absolute difference in time increases with the number of nodes to be traversed. The time may double or triple. While this may be a drawback, you can also look at it as a trade-off between time and accuracy.

Compare the screenshots. In the first one (tracing) you can see the IsMatch function. The function runs fast, but it still accounts for 25 milliseconds. In the second one (sampling) the function does not appear, because the samples were taken between its executions. In this example, it is not essential to have this function in both snapshots. However, it illustrates the general idea behind the profiling methods: with sampling, a function that takes less than 5 milliseconds to execute may be omitted.
[Screenshot: snapshot taken with the tracing method]
[Screenshot: snapshot taken with the sampling method]

Summary

Use line-by-line if you know exactly which function causes problems. This method helps you understand how the function executes.

Use tracing if you want to see the number of function calls, or if the information provided by sampling is not enough.

Use sampling in all other situations. Sampling is the recommended starting point when you look for performance problems in your application for the first time.