Compile C++ for Profiling

I want to compile my C++ project (using CMake).
I want to use all the performance optimizations of -DCMAKE_BUILD_TYPE=Release,
except stripping it of debug information (so that the profiler can record function calls with their names).

I am using GCC 10.2. Profilers: perf and callgrind.

I would use RelWithDebInfo. With GCC it usually means -O2 rather than the -O3 of Release, but it keeps debugging information. For instrumentation-based profiling, you’ll need to manually add other compiler-specific flags, such as -pg (for gprof) or -fprofile-arcs (for gcov-style arc profiling). perf and callgrind generally work with a plain RelWithDebInfo build (at least that’s how I use them).


Callgrind works with:

cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo .. && make && valgrind --tool=callgrind ./testWarehouse -c ../config.yaml

but perf does not:

cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo .. && make && perf record -g ./testWarehouse -c ../config.yaml

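That looks like some optimization getting in the way. You could try decreasing the optimization level, perhaps? There are also flags that keep more information around for perf: by default, perf record -g unwinds the stack via frame pointers, which optimized GCC builds usually omit, so compiling with -fno-omit-frame-pointer or recording with perf record --call-graph dwarf may help.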


I do use RelWithDebInfo and compile the program with the -g flag (via CMAKE_CXX_FLAGS "-g"). However, I am unable to see the function names or the source code in KCachegrind.

How can I set the CMake flags to achieve this?

Your target function might be inlined and as such not have any function name actually associated with it. More information is needed to determine what might be the cause here.

Thank you for the quick response!
Sorry for the lack of information, I had completely missed it.

My CMakeLists.txt is as follows:

cmake_minimum_required(VERSION 3.10)

set(CMAKE_BUILD_TYPE RelWithDebInfo)

#set(BLA_VENDOR Intel10_64lp)

#find_package(BLAS REQUIRED)

add_executable(test main.cpp)
#target_link_libraries(test ${BLAS_LIBRARIES})

and my main.cpp is as follows:

// #define min(x,y) (((x) < (y)) ? (x) : (y))  // disabled: the macro would clash with the min() function below

#include <iostream>

using namespace std;

double min(double x, double y){
	return x < y ? x : y ;
}

int runBlas(){
	double a = 34234.39082344;
	double b = 12133231.3402;

	double c = min(a, b);

	return int(c);
}

int main(){
	return 0;
}

and I call Valgrind as follows:

valgrind --tool=callgrind ./test

I even tried:

valgrind --tool=callgrind --dsymutil=yes ./test

If this is what you’re actually doing, this is what Compiler Explorer shows with -O2 -g:

min(double, double):
        minsd   xmm0, xmm1
        ret
runBlas():
        mov     eax, 34234
        ret
main:
        xor     eax, eax
        ret
_GLOBAL__sub_I_min(double, double):
        sub     rsp, 8
        mov     edi, OFFSET FLAT:_ZStL8__ioinit
        call    std::ios_base::Init::Init() [complete object constructor]
        mov     edx, OFFSET FLAT:__dso_handle
        mov     esi, OFFSET FLAT:_ZStL8__ioinit
        mov     edi, OFFSET FLAT:_ZNSt8ios_base4InitD1Ev
        add     rsp, 8
        jmp     __cxa_atexit

To sum up, nothing is being called and (almost) all your functions reduced to returning a constant. There’s no profile to show for this.
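For the profiler to have something to record, the program needs work that the compiler cannot fold away at compile time. Below is a minimal sketch of such a workload; the helper names makeData and smallest are hypothetical and not from the original post. The idea is that the data size is only known at run time (for example, parsed from argv in main) and that the result is actually used (for example, printed), so the loops should survive -O2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: fill a vector with n pseudo-random values in [0, 1).
// Uses a simple linear congruential generator so the sketch has no
// dependencies; the values are deterministic for a given seed.
std::vector<double> makeData(std::size_t n, std::uint32_t seed) {
    std::vector<double> data;
    data.reserve(n);
    std::uint32_t state = seed;
    for (std::size_t i = 0; i < n; ++i) {
        state = state * 1664525u + 1013904223u;  // LCG step (wraps mod 2^32)
        // Keep the top 24 bits and scale them into [0, 1).
        data.push_back(static_cast<double>(state >> 8) / 16777216.0);
    }
    return data;
}

// Hypothetical helper: linear scan for the smallest element.
// Precondition: data is not empty (checked only when NDEBUG is not defined).
double smallest(const std::vector<double>& data) {
    assert(!data.empty());
    double best = data.front();
    for (double v : data) {
        best = v < best ? v : best;
    }
    return best;
}
```

Calling smallest(makeData(n, seed)) from main with n taken from the command line, and printing the result, should give callgrind or perf real work to attribute to these functions, although the optimizer may still inline them into main.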

Thank you for your response @fenrir

With a better example program, should the -g flag be enough for KCachegrind to show me a call graph with my function names (and the source code as well)?

Yes, -g is suitable, though the optimization level may interfere and end up inlining bits of code.

Note that CMake already passes this along in the RelWithDebInfo and Debug build configurations, so there’s no need to set the flag manually.
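If one particular function should stay visible under its own name in an optimized build, one option with GCC (and Clang) is the noinline function attribute. The following is only a sketch: pickSmaller and runningMin are hypothetical names, and the attribute spelling is compiler-specific. Other optimizations may still remove or specialize calls, so this is not a guarantee that the function appears in every profile.

```cpp
#include <vector>

// GCC/Clang-specific attribute: asks the compiler not to consider this
// function for inlining, so it stays a separate symbol that a profiler
// can attribute samples or call counts to.
__attribute__((noinline))
double pickSmaller(double x, double y) {
    return x < y ? x : y;
}

// Hypothetical caller: folds pickSmaller over a vector, starting from
// the given initial value.
double runningMin(const std::vector<double>& values, double start) {
    double current = start;
    for (double v : values) {
        current = pickSmaller(current, v);
    }
    return current;
}
```

The trade-off is that forcing a hot function out of line changes the code being measured, so it is probably best limited to the functions under investigation.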

Oh okay. Actually, I have a large codebase for which I would like to generate a call graph with all the debugging symbols; that would make it much easier to familiarize myself with it. I will try it again.

I set the build type to Debug and added the -g flag, but it did not work.
BTW, I build the project using catkin_make. That should not have an effect, right?

The Debug build type already has -g in its flag list, so there’s no need to add it manually.


As long as it can process POSIX Make recipes, sure.

I suppose it can. I will try it out. Thanks a lot @ben.boeckel for your assistance.