Skip dependency checks in CMake + Ninja builds

gshipman · April 27, 2021, 7:36pm

Hello,

We have a very large code base that uses CMake + Ninja and can take quite some time to build. We sometimes find ourselves doing printf-style debugging, which can trigger rebuilding of large portions of the code. Our legacy Perl+Makefile based build system has an option for this use case: skip all dependency checks, build the single file that changed, and then do the link.

Is there a way of accomplishing this in CMake + Ninja?

Thanks,

Galen

ben.boeckel · April 27, 2021, 9:22pm

I don’t think so. There was talk somewhere about porting the /fast targets that the Makefiles generators make (which basically skip all dependency checking but are in the “keep both pieces” bucket when the build becomes inconsistent because of it), but I can’t find it right now.

craig.scott · April 27, 2021, 11:00pm

Are you just running ninja or are you telling ninja to build just the target you’re interested in? You want to do the latter to minimise the amount of rebuilds while doing your dev iterations.

If you have lots of libraries involved in the target you are looking at, using shared libraries may speed things up. You could consider setting the LINK_DEPENDS_NO_SHARED target property to true, or more likely set it globally with the CMAKE_LINK_DEPENDS_NO_SHARED variable. That may reduce the number of things that get rebuilt when you make changes.

If using static or object libraries, the OPTIMIZE_DEPENDENCIES target property might or might not help as well. Again, more likely you’d use the CMAKE_OPTIMIZE_DEPENDENCIES variable to enable this globally. This requires CMake 3.19 or later.

cferenba · April 28, 2021, 5:43pm

To add a few more details to @gshipman’s original question:

We have a large code base that is mostly Fortran 90 (and later) source files, that makes heavy use of modules. If a change is made to a Fortran source file that defines a module, then technically, any other source depending on that module file (directly or indirectly) needs to recompile. This can lead to a huge cascade of recompiles, especially if the original change is at a low level in the dependency tree. What we’d like to do is avoid all of those recompiles in the case of a trivial change that is known to not affect the generated module file, such as adding write statements for debugging.
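To make the cascade concrete, here is a minimal Python sketch that computes which sources would need to recompile after a module-defining file changes. The file names and the `USES` graph are hypothetical, made up purely for illustration; in a real build this information comes from the generator's Fortran dependency scan, not from a hand-written table.

```python
from collections import deque

# Hypothetical graph: each source maps to the sources whose modules it uses.
USES = {
    "constants.f90": [],
    "mesh.f90": ["constants.f90"],
    "physics.f90": ["mesh.f90"],
    "io.f90": ["constants.f90"],
    "main.f90": ["physics.f90", "io.f90"],
}


def recompile_set(changed, uses):
    """Return every source that depends on `changed`, directly or indirectly."""
    # Invert the graph: provider -> sources that use it directly.
    users = {src: [] for src in uses}
    for src, providers in uses.items():
        for provider in providers:
            users[provider].append(src)

    # Breadth-first walk over the inverted graph.
    stale = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for user in users[current]:
            if user not in stale:
                stale.add(user)
                queue.append(user)
    return stale


if __name__ == "__main__":
    # A low-level change drags in almost everything.
    print(sorted(recompile_set("constants.f90", USES)))
    # A high-level change touches far fewer files.
    print(sorted(recompile_set("physics.f90", USES)))
```

In this toy graph, touching the lowest-level file marks all four other sources as stale, while touching `physics.f90` only marks `main.f90`. That asymmetry is why a debug write statement in a low-level module is so expensive for us.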

We would still need to relink all of the binaries that depend on that source file, but we’re okay with the relinking steps since there generally aren’t nearly as many of those as there are of the recompiles. So I don’t think the LINK_DEPENDS_NO_SHARED or the other suggestions above will help with the problem we’re trying to solve. (If we were talking about C/C++ source file changes instead, those would be more relevant.)

ben.boeckel · April 28, 2021, 6:12pm

One fix here is to get the compiler to not rewrite a file when the new contents are the same as the old (in the vein of cmake -E copy_if_different). However, this requires restat = 1 and is generally not all that practical in the real world.
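As a rough illustration of the "do not rewrite identical output" idea, here is a minimal Python sketch. The function name `write_if_different` and the `mesh.mod` file name are hypothetical; this is not how any particular compiler or CMake itself implements it, just the general technique of leaving the file (and therefore its mtime) alone when the bytes match.

```python
import os
import tempfile


def write_if_different(path, new_contents: bytes) -> bool:
    """Write new_contents to path only if it differs from what is on disk.

    Returns True if the file was (re)written, False if it was left untouched.
    """
    try:
        with open(path, "rb") as f:
            if f.read() == new_contents:
                return False  # same bytes: keep the old file and its mtime
    except FileNotFoundError:
        pass  # no previous output, so we must write
    with open(path, "wb") as f:
        f.write(new_contents)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        mod = os.path.join(tmp, "mesh.mod")  # hypothetical module file
        print(write_if_different(mod, b"interface v1"))  # first write
        before = os.stat(mod).st_mtime_ns
        print(write_if_different(mod, b"interface v1"))  # identical contents
        print(os.stat(mod).st_mtime_ns == before)        # mtime preserved
```

An unchanged mtime only helps if the build tool looks at the output again after the command runs, which is what restat = 1 asks Ninja to do for a rule.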

I think it is better to instead enhance ninja here:

Getting CMake to help generate a build.ninja file that knows which edges need to be recompiled based on “trivial” changes is not trivial. We’re likely going to only be able to get “use the full dep tree” or “use no dep tree” (meaning manual relinking if needed) in practice.

cferenba · April 28, 2021, 7:19pm

@ben.boeckel Yes, this looks much more like what we need. I’ll go look more closely at the issues you linked.

And to clarify: we’re not looking for CMake to automatically decide whether a change is trivial, but to provide a setting that the user can set manually when (and only when) it’s appropriate.

gshipman · April 28, 2021, 7:24pm

@ben.boeckel as @cferenba indicates, we would be happy to have the ability to rebuild a single target and then force a relink of the application if that were an option (cmake --link)…

ben.boeckel · April 28, 2021, 7:43pm

Ninja provides no build-time configuration or logic for such things. This would require a cmake -Dsome_flag . reconfigure to regenerate the build.ninja for any specific dependency weakening.

CMake does not, in general, know how to link a target. It knows how to generate the project files, but the actual link line is not known in the Xcode and Visual Studio generators (the Makefiles and Ninja generators obviously do know the link line). I think the best CMake can do is provide /fast targets which ignore all non-object (or something; I forget the actual line that /fast cuts at) dependencies and then users can call ninja -j1 lib1/fast lib2/fast lib3/fast exe/fast (-j1 because otherwise those are going to run in arbitrary order and may not update properly).

Tyler_Reddy · August 22, 2021, 9:55pm

So, just to make sure I understand the full picture of the current state of Fortran tooling:

there is currently no generator that CMake can leverage (even Unix Makefiles) that is capable of “determining” that the interface presented by a Fortran module file has not changed (even if all you do is touch a source.f90 file without changing its contents)?

I was able to cut the rebuild time down by 9 minutes (!) by using a script to cache the mtime value of every single source and build artifact file, rebuilding a single target, restoring the mtimes of all source/build files from the “cache”, and then re-running CMake (ninja) which apparently does the final re-link.
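For anyone curious, the mtime caching part can be sketched roughly as below. This is a simplified Python illustration, not my actual script: the function names `snapshot_mtimes` and `restore_mtimes` are made up here, and the steps in between (editing the source and rebuilding the single target) are left out.

```python
import os
import tempfile


def snapshot_mtimes(root):
    """Record (atime_ns, mtime_ns) for every file under root."""
    saved = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            saved[path] = (st.st_atime_ns, st.st_mtime_ns)
    return saved


def restore_mtimes(saved):
    """Put the recorded timestamps back on every file that still exists."""
    for path, times in saved.items():
        if os.path.exists(path):
            os.utime(path, ns=times)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "source.f90")
        with open(src, "w") as f:
            f.write("! placeholder\n")
        saved = snapshot_mtimes(tmp)
        # Stand-in for "something touched the file" during the rebuild.
        os.utime(src, ns=(1_000_000_000, 1_000_000_000))
        restore_mtimes(saved)
        print(os.stat(src).st_mtime_ns == saved[src][1])
```

You would run the snapshot over both the source and build trees before the edit, and the restore after rebuilding the one target. This deliberately lies to the build tool about what changed, so it is only safe when you know the module interface really is unchanged.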

My general impression from looking at the comments and PRs related to the ninja issues linked above is that they are not particularly receptive to adding any kind of “decision-making” to their process–they seem quite happy to be a fast/nimble program and depending primarily on “mtime” rather than the contents of source files for example, instead delegating more complex decisions to the external programs that generate ninja files.

Either way, this seems like a hard problem. Some Fortran developers seem to find this to be (understandably) a pretty major issue: those are huge rebuild times. With C++ modules on the way with the upcoming standard adoption, I wonder if there isn’t a pretty big opportunity to make module handling a bit slicker across the board, but it still seems to me like you end up needing a more sophisticated Fortran parser. I’m not sure if, e.g., the compilers might be the place to generate that extra information, on the assumption that they contain the most sophisticated AST parsers, rather than duplicating more parsing logic in build systems.

ben.boeckel · August 23, 2021, 12:08am

For C++, the warnings generated from templates usually provide the line numbers from the source file for where to look. So even whitespace-only changes are going to meaningfully change module file contents (this is true for imported header units for sure, but even standalone modules are probably going to be sensitive). I wouldn’t hold my breath for there being any improvement even from hash computation on that front.