Using cmake/ninja to batch cl invocations

I’m trying to port an MSBuild-based project to CMake/Ninja, and it seems like the ability to invoke the compiler with multiple source files is missing(?)

Is there a way to have cmake generate ninja build files which will wind up executing:
cl /MP source1.cpp source2.cpp … sourceN.cpp
instead of:
cl source1.cpp
cl source2.cpp
…
cl sourceN.cpp

https://cmake.org/cmake/help/v3.18/prop_tgt/UNITY_BUILD.html

A “unity” build creates a source file source_all.cpp which #includes all of source1…N.cpp, and then only the resulting source_all.cpp is passed to the compiler. That is not what I want (MSVC does not parallelize this case). As the original question states, I’d like the effective command line to list out all the source files. GCC (and others) should need this support too, no?

This would need support in Ninja, not CMake.

How does it compare to the build parallelism that Ninja already does by default?


This has a good overview, but it seems the ninja feature was never completed.

The general idea is that cl will create some number of persistent subprocesses (essentially a pool of worker processes), and the pool can gain speedup by eliminating process startup/teardown overhead and reducing duplicated work (such as parsing the same includes for every source file). Note there is a simple rule the build system must abide by: all source files given on the same command line to cl necessarily share the same command-line options. So it is advantageous for the build system to be aware of the parallelism cl is using, so that it can invoke multiple cl instances which each have some degree of internal parallelism. (E.g. your CPU has 64 threads, and you have 4 groups of source files with unique command lines consisting of <= 16 files each; to reach maximum performance you can schedule all of the groups simultaneously.)
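To make the grouping rule concrete, here is a rough Python sketch of what a batching build system might do: bucket source files by their exact option set, then emit one cl /MP command per bucket. The function name batch_cl_commands and the even split of the thread budget are my own assumptions for illustration, not something Ninja or CMake provides.

```python
from collections import defaultdict


def batch_cl_commands(sources, total_threads):
    """Group (path, flags) pairs by identical flags and build one cl command per group.

    Hypothetical helper: `sources` is an iterable of (path, flags) pairs, where
    `flags` is a sequence of cl options. Files may only share a command line
    when their options are identical.
    """
    groups = defaultdict(list)
    for path, flags in sources:
        groups[tuple(flags)].append(path)

    if not groups:
        return []

    # Naive policy (an assumption): split the thread budget evenly across groups
    # so that all groups can be scheduled at the same time.
    per_group = max(1, total_threads // len(groups))

    commands = []
    for flags, paths in groups.items():
        # No point in asking for more workers than there are files in the group.
        workers = min(per_group, len(paths))
        commands.append(["cl", "/c", f"/MP{workers}", *flags, *paths])
    return commands
```

With 64 threads and 4 groups of at most 16 files each, this policy gives every group /MP16 or fewer, which matches the example above. A real scheduler would also have to rebalance when groups are uneven.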

AFAIK the only mode of operation in Ninja is that it starts a cl process for each input source file, where the number of concurrent processes is controlled by the machine’s thread count or -j.

P.S. I’m not sure why that feature writeup mentions /showIncludes (or exactly what Ninja is doing internally with the info), but more recent versions of MSVC have the /sourceDependencies option, which is a more robust implementation. I’ve used it to implement something like ccache’s direct mode and it works quite well.
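For anyone curious what consuming /sourceDependencies could look like, below is a minimal Python sketch. It assumes the option writes a JSON document with a top-level "Data" object holding "Source" and "Includes" entries; please check that layout against the MSVC documentation for your compiler version. Both function names are hypothetical, and the digest is only a simplified stand-in for a direct-mode style manifest key, not how ccache itself does it.

```python
import hashlib
import json


def includes_from_source_dependencies(text):
    """Return (source, includes) from a /sourceDependencies JSON document.

    Assumption: the document looks like
    {"Version": ..., "Data": {"Source": ..., "Includes": [...]}}.
    """
    doc = json.loads(text)
    data = doc.get("Data", {})
    return data.get("Source"), list(data.get("Includes", []))


def manifest_digest(paths, read_bytes):
    """Hash file names and contents so that a change to any input changes the key.

    `read_bytes` is a caller-supplied function mapping a path to its bytes,
    which keeps this sketch independent of the real file system.
    """
    h = hashlib.sha256()
    for p in paths:
        h.update(p.encode("utf-8"))
        h.update(b"\0")
        h.update(hashlib.sha256(read_bytes(p)).digest())
    return h.hexdigest()
```

The idea is that a cached object file can be reused as long as the digest over the source and every recorded include is unchanged.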

AFAIK, MSBuild is the only build system that knows how to leverage the /MP flag of cl.

You can use MSBuild with CMake by using one of the Visual Studio generators.

FWIW, I build a large project both with MSBuild+cl /MP and Ninja. Ninja is always faster (for my case), even though it doesn’t use /MP. What I experience is:

  1. The MSBuild scheduler seems not to start compiling a project until all the projects it depends on have finished linking. Ninja is more intelligent in this regard and will happily start compiling dependent projects while the projects they depend on are still compiling.
  2. MSBuild’s own parallelism (the /m: switch) fights with cl.exe spawning worker processes internally via /MP. Without careful hand-tuning, it tends to badly oversubscribe the machine’s CPUs. Ninja keeps all the CPUs busy without oversubscribing them and without causing the general mouse stutter I’ve experienced with MSBuild.
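The oversubscription in point 2 is easy to see with a bit of arithmetic. The Python sketch below assumes the worst case, where every MSBuild node is running one cl /MP invocation at the same time and each invocation uses its full worker limit; the function names are made up for illustration.

```python
def worst_case_compilers(msbuild_nodes, mp_workers_per_cl):
    """Upper bound on concurrent compiler processes under the stated assumption."""
    return msbuild_nodes * mp_workers_per_cl


def oversubscription_factor(msbuild_nodes, mp_workers_per_cl, logical_cpus):
    """How many compiler processes compete for each logical CPU in the worst case."""
    return worst_case_compilers(msbuild_nodes, mp_workers_per_cl) / logical_cpus
```

For example, 8 MSBuild nodes with 8 /MP workers each on an 8-thread machine gives a factor of 8.0, whereas one process per file capped at the thread count stays at 1.0.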

HTH.


Yes, @Shaun-Cox has explained the problem.

However, in my case, the projects are already very well tuned for parallelism with MSBuild (i.e. it batches the entire compilation phase into only a few invocations of cl.exe, and this batching really does help a lot overall). So migrating to Ninja involves a tradeoff. While Ninja does a much better job at “project-level” parallelization (because of the linking stall pointed out by @Shaun-Cox), it apparently lacks the ability to batch invocations of cl.

Why can’t we have the best of both worlds? :) Have Ninja also batch cl invocations.

That’s not for CMake to decide ;)