Can I mitigate the disk usage caused by building many C++ modules-based executables in a project?

Hi C++ modules fans,

I have a couple of repositories that are now completely C++ modules based, using a cmake/ninja build system.

My best example is this maths library: sebsjames/maths on GitHub (C++ modules for scalar, vector and complex math. And maths.)

This library contains multiple test programs that I can compile and run with ctest. There are about 100 short test programs.

One result of using a C++ modules build process is that each executable that uses ‘moduleA.cppm’ will compile its own copy of moduleA. This means that I have to compile moduleA multiple times, incurring a compute penalty (which I can live with; it’s part of the design of modules). However, it also means that many temporary build files are created, incurring a storage penalty. It turns out that after building all the tests for this relatively small library, about 18 GB of build files accumulate! See Issue #132 in sebsjames/maths on GitHub: “With a modules build, building all the tests consumes absurd amount of storage”.
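To make the duplication concrete, here is a minimal sketch of the pattern, assuming the module sources are attached to each test target with target_sources and a CXX_MODULES file set (CMake 3.28 or later). The project, target and file names are hypothetical, and the project's own target_sources_modules helper may well do something different.

```cmake
cmake_minimum_required(VERSION 3.28)
project(modules_dup_sketch LANGUAGES CXX)
set(CMAKE_CXX_STANDARD 20)

# Hypothetical names: prog1.cpp and prog2.cpp both contain 'import moduleA;'
add_executable(prog1 prog1.cpp)
target_sources(prog1 PRIVATE FILE_SET CXX_MODULES FILES moduleA.cppm)

add_executable(prog2 prog2.cpp)
target_sources(prog2 PRIVATE FILE_SET CXX_MODULES FILES moduleA.cppm)

# Each target compiles moduleA.cppm for itself, so the build tree holds
# one set of intermediate files per target:
#   CMakeFiles/prog1.dir/...
#   CMakeFiles/prog2.dir/...
```

With about 100 such targets, the per-target intermediate files add up quickly.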

My question is whether CMake has some kind of semi-cleanup process that would let me keep just the binaries for the many test programs, but remove temporary build files as the build progresses, keeping the disk usage relatively small.

Correction: it’s 6 GB (gcc) to 8 GB (clang) of temporary files, not 18 GB. Still rather a lot of storage. In my other project (mathplot) the ~100 example programs there generate >50 GB of temporary files!

One way I could fix this would be to create a test framework in which I collect all 100 small programs into a single one, but a) it would be nice not to have to do that work and b) in the mathplot case, I really do want those examples to be separate programs.

Would add_custom_command be an approach?

add_custom_command(TARGET prog1 POST_BUILD
    COMMAND "rm selected temporary files" 
)

I am experimenting with

add_custom_command(
  TARGET prog1
  POST_BUILD COMMAND rm -rf ./CMakeFiles/prog1.dir  
  COMMENT "Clean up time"
)

This seems to work fine, but it introduces a non-portability issue, which is that it requires that the build system is Unix.

Not surprisingly, it also breaks re-running ninja to rebuild, which fails with an error like:

[14:48:36 bclang20] ninja 
ninja: error: '/home/seb/src/maths/bclang20/tests/CMakeFiles/range_intersects.dir/CXXDependInfo.json', needed by 'tests/CMakeFiles/range_intersects.dir/range_intersects.cpp.o.modmap', missing and no known rule to make it

Ok, so here’s my own solution. I make a macro called my_add_executable that I can use in place of add_executable. This adds an add_custom_command call which uses the cmake -E remove command for cross-platform temporary file removal.

If we limit the cleanup to just the *.pcm (if building with clang) and *.gcm (if building with gcc) files after the build, we free up most of the temporary storage that was used by the modules build process. These are the largest temporary build files. After removing them, you can still re-run ninja to rebuild your executables. Instead of saying “nothing to do”, it does of course need to regenerate those .pcm/.gcm files, but that’s fine; the aim here is to build test programs once, without using up a lot of storage (which can be an issue when running on a continuous integration virtual machine).

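macro (my_add_executable my_target my_sourcefile)
  add_executable(${my_target} ${my_sourcefile})
  # After the target is built, delete its clang (.pcm) and gcc (.gcm) module files.
  # Each removal is a separate COMMAND; -f makes a missing file a non-error.
  # Note: cmake -E does not expand wildcards itself; this relies on the
  # shell that runs the command doing so.
  add_custom_command(
    TARGET ${my_target}
    POST_BUILD
    COMMAND ${CMAKE_COMMAND} -E remove -f ./CMakeFiles/${my_target}.dir/*.pcm
    COMMAND ${CMAKE_COMMAND} -E remove -f ./CMakeFiles/${my_target}.dir/*.gcm
    COMMENT "Clean pcm/gcm files for ${my_target}"
  )
endmacro()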

my_add_executable(range_adding range_adding.cpp)
add_test(range_adding range_adding)
target_sources_modules(range_adding MODULES ${SM_VEC_MODULES})

One last task is to find out what the equivalent to clang’s pcm and gcc’s gcm files are for Visual Studio.
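
A possible variant, as a sketch only: pick the file extension from CMAKE_CXX_COMPILER_ID and do the globbing inside a small CMake script run with cmake -P, so that nothing depends on the shell expanding wildcards. The names MY_BMI_EXT, MY_CLEAN_SCRIPT, clean_bmi.cmake and my_add_executable_bmi_clean are hypothetical. The ifc extension for MSVC is an assumption that I have not tested, and I have not checked where CMake places those files with the Visual Studio generators.

```cmake
# Choose the module file extension for the active compiler.
if(CMAKE_CXX_COMPILER_ID MATCHES "Clang")
  set(MY_BMI_EXT "pcm")
elseif(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
  set(MY_BMI_EXT "gcm")
elseif(CMAKE_CXX_COMPILER_ID STREQUAL "MSVC")
  set(MY_BMI_EXT "ifc")  # assumption, untested
endif()

# Write a tiny cleanup script at configure time. The bracket argument keeps
# ${BMI_DIR} and ${BMI_EXT} literal; they are supplied with -D at build time.
set(MY_CLEAN_SCRIPT "${CMAKE_BINARY_DIR}/clean_bmi.cmake")
file(WRITE "${MY_CLEAN_SCRIPT}" [=[
file(GLOB_RECURSE bmi_files "${BMI_DIR}/*.${BMI_EXT}")
if(bmi_files)
  file(REMOVE ${bmi_files})
endif()
]=])

function(my_add_executable_bmi_clean my_target my_sourcefile)
  add_executable(${my_target} ${my_sourcefile})
  if(MY_BMI_EXT)  # skip the cleanup for compilers not listed above
    add_custom_command(
      TARGET ${my_target}
      POST_BUILD
      COMMAND ${CMAKE_COMMAND}
              -DBMI_DIR=${CMAKE_CURRENT_BINARY_DIR}/CMakeFiles/${my_target}.dir
              -DBMI_EXT=${MY_BMI_EXT}
              -P ${MY_CLEAN_SCRIPT}
      COMMENT "Clean .${MY_BMI_EXT} files for ${my_target}"
      VERBATIM
    )
  endif()
endfunction()
```

It would be called in the same way as my_add_executable, for example my_add_executable_bmi_clean(range_adding range_adding.cpp). The directory layout CMakeFiles/<target>.dir is the one I see with the ninja generator; other generators may differ.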