Profiling C/C++ code with Valgrind + KCachegrind

First, let’s explain what profiling is. In general terms, profiling is a technique you can use when your application has performance problems. Basically, it consists of measuring the time spent in each function of the code in order to find the bottleneck, i.e., the parts of the code that account for most of the execution time.

There are a lot of tools for this purpose. In Java we have JPerformance, among others, which are very nice. For C/C++, a widely used one is GNU gprof, which is very simple to use (just remember to compile and link your code with the -pg flag). Recently I discovered KCachegrind, a visualizer for the profile data produced by Valgrind's Cachegrind and Callgrind tools, and it surprised me with the simplicity of its Qt interface. Valgrind, for those who do not know it, is best known as a memory debugging tool: its default tool, Memcheck, helps you find bugs such as memory leaks and invalid memory accesses. It is not only that, though. Valgrind is a framework that also ships profiling tools, and Callgrind is the one used here. I strongly recommend using Valgrind from the very beginning of a project. Let’s look at KCachegrind in a little more detail.

You can install both with apt-get:

user@host:~$ sudo apt-get install valgrind kcachegrind

For this demonstration, I use a database simulator written in C (by me and others) as the target application. The simulator repeats the database operations thousands of times and computes the average of some measured metrics. A normal execution takes hours, and Valgrind makes execution much slower, so I ran only a small experiment. Before opening KCachegrind, you need to run the program under Valgrind's Callgrind tool to collect the profile data. Callgrind writes its results to a file named callgrind.out.<pid>, where <pid> is the process ID of the run, and that is the file you open with KCachegrind.

user@host:~$ valgrind --tool=callgrind ./simulator --param param.txt --queries 10 --seed 65270
user@host:~$ kcachegrind callgrind.out.12208
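If you do not have a long-running program at hand, a toy workload is enough to try the two commands above. The sketch below is not the simulator from this post. The names cheap_step, expensive_step and run_workload are made up for illustration. The idea is that both helpers are called the same number of times, but one does far more work per call, so it should dominate the profile.

```c
#include <stdint.h>

/* Hypothetical toy workload, not the simulator discussed in this post. */

/* Cheap helper: a constant amount of work per call. */
uint64_t cheap_step(uint64_t x) {
    return x * 2654435761u + 1u;
}

/* Deliberately expensive helper: a loop of n iterations per call. */
uint64_t expensive_step(uint64_t x, unsigned n) {
    uint64_t acc = x;
    for (unsigned i = 0; i < n; i++) {
        acc = acc * 31u + i;  /* unsigned arithmetic wraps, which is fine here */
    }
    return acc;
}

/* Calls both helpers once per iteration, so any cost difference in the
 * profile comes from the work done inside each helper, not the call count. */
uint64_t run_workload(unsigned iterations) {
    uint64_t total = 0;
    for (unsigned i = 0; i < iterations; i++) {
        total += cheap_step(i);
        total += expensive_step(i, 1000);
    }
    return total;
}
```

To try it, add a main() that calls run_workload() and prints the result, save it as, say, toy.c, and build it with something like gcc -g -O0 toy.c -o toy. The -g flag lets KCachegrind show source lines, and -O0 keeps the helpers from being inlined. Then run valgrind --tool=callgrind ./toy and open the resulting callgrind.out file in KCachegrind.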

I took some screenshots to highlight some KCachegrind features:

The first feature provided by KCachegrind, shown in the first picture, is a table with the cumulative (inclusive) cost of each function. It treats main() as having 100% of the cost, and we can follow how this cost is distributed among the functions called by main(). The same information can be seen in a graphical view, as depicted in the second figure. Another way to analyse the code is through the call graph view, which starts at main(), shows the cumulative cost of each function, and walks through the call graph of the code. In this graph we can see that most of the cost is concentrated in the vprintf function, which is used to log the simulator execution for debugging purposes.
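Once the profile points at logging as the hot spot, one common remedy is to make the log call cheap when debugging is off. The following is only a minimal sketch of that idea, not the simulator's actual logging code. The names sim_log and sim_log_enabled are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical runtime switch: 0 disables debug logging, nonzero enables it. */
int sim_log_enabled = 0;

/* Hypothetical logging helper. Returns the number of characters written,
 * or 0 when logging is disabled, in which case vprintf is never called. */
int sim_log(const char *fmt, ...) {
    if (!sim_log_enabled) {
        return 0;  /* skip the formatting work entirely */
    }
    va_list args;
    va_start(args, fmt);
    int written = vprintf(fmt, args);
    va_end(args);
    return written;
}
```

With the switch off, a call such as sim_log("query %d done\n", i) returns immediately. Note that its arguments are still evaluated, so expensive expressions passed to the logger would still cost time. After a change like this, running Callgrind again is the way to check whether the vprintf share of the cost actually dropped.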

See ya!