Help with recompiling on a cluster

Hi all,

I use cmake on a cluster of systems where for the most part I don’t control what node I get when submitting a job.

I find that when I build my cmake project, it does a full recompile every time instead of an incremental build. As far as I can tell,
this happens because the job lands on a different host than the last time.

Is there a way in cmake to sidestep this behavior and allow it to treat different nodes as the same for configuration
purposes? All the nodes have the same software configuration and filesystems are all shared.

Thanks
Jerry

I looked into it a little bit. Apologies, I’m not familiar with compiling on clusters.

Perhaps this blog might interest you:

It talks about using distcc:

“distcc is a program to distribute compilation of C or C++ code across several machines on a network. distcc should always generate the same results as a local compile, is simple to install and use, and is often two or more times faster than a local compile.”

It sounds basically perfect for your use case and integrates with CMake through the CMAKE_<LANG>_COMPILER_LAUNCHER variable.
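
For reference, the launcher is set per language (for example CMAKE_C_COMPILER_LAUNCHER and CMAKE_CXX_COMPILER_LAUNCHER). A rough sketch of what a configure step might look like, assuming distcc is already installed and on PATH; the helper name and the directory arguments are made up for illustration:

```shell
# Hypothetical helper: configure a build tree so that C and C++ compile
# commands are prefixed with distcc. Assumes distcc is on PATH and that
# the Makefile or Ninja generator is in use.
configure_with_distcc() {
    src_dir=$1
    build_dir=$2
    cmake -S "$src_dir" -B "$build_dir" \
        -DCMAKE_C_COMPILER_LAUNCHER=distcc \
        -DCMAKE_CXX_COMPILER_LAUNCHER=distcc
}

# Example: configure_with_distcc . build
```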

Hi and thanks for taking time to respond to my question.

It looks like distcc is a very static environment that needs servers set up for a compiler farm. I would have to launch these servers dynamically (and I don’t get to choose the machines the servers land on), check that they are up and running, then relaunch compilation that can use them. This would be quite messy and from my reading so far, distcc doesn’t seem to address this kind of more dynamic environment.

Perhaps I’m mistaken, but I don’t think it will work amazingly well.

In our env, it’s much simpler to launch a single job with N cores (40, say) to do the compile. As long as I don’t have to recompile the whole tree every time :)

I would start by asking the build tool why it thinks it needs to do this: make -d and ninja -d explain are the relevant options for the usual tools.
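
The debug output from make in particular is very long, so it may help to filter it. A minimal sketch, assuming GNU make; the helper names are hypothetical, and the grep patterns match the messages GNU make and ninja print when they decide something is out of date:

```shell
# Hypothetical helpers that keep only the lines explaining a rebuild.
# Note that both still run the build; pass -n to get a dry run instead.

why_make() {
    # GNU make reports the newer prerequisite and the target it will remake.
    make -d "$@" 2>&1 | grep -E 'Must remake target|is newer than target'
}

why_ninja() {
    # ninja prefixes its explanations with "ninja explain:".
    ninja -d explain "$@" 2>&1 | grep 'ninja explain:'
}

# Example: why_make -C build
```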

Good advice. So the first recompile happens because there is a file out of sync with flags.make. I’m running cmake before compile every time. This wasn’t intentional, but I have a simpler makefile that wraps the cmake call since it makes it easy for me to set up debug vs optimized etc. builds how I like. I have also run into strange situations that I remedy by removing CMakeCache.txt and rerunning cmake.

So what seems to happen is running cmake on a new host rewrites flags.make with -DBUILD_HOST=xxx and makes a bunch of targets out of date, even though nothing else about the configuration has changed.

I would diff the generated files before and after rerunning cmake. CMake should have had any “touches a build file without content changes” issues flushed out long ago, but maybe new ones have crept in. If you can make a small reproducer case, please file an issue.
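
One way to do that comparison is to copy the flags.make files aside before and after rerunning cmake. This is only a sketch; the helper name and snapshot directories are hypothetical:

```shell
# Hypothetical helper: copy every flags.make under a build tree into a
# snapshot directory, preserving relative paths.
snapshot_flags() {
    build_dir=$1
    snap_dir=$2
    ( cd "$build_dir" && find . -name flags.make -type f ) |
    while IFS= read -r f; do
        mkdir -p "$snap_dir/$(dirname "$f")"
        cp "$build_dir/$f" "$snap_dir/$f"
    done
}

# Example, run on two different nodes:
#   snapshot_flags build /tmp/flags-before
#   cmake build
#   snapshot_flags build /tmp/flags-after
#   diff -ru /tmp/flags-before /tmp/flags-after
```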

What generator are you using btw @jlquinn ?

Does this happen with Ninja and Unix Makefiles? What version of CMake?

@ben.boeckel the content of the flags.make file does change. It’s just that the BUILD_HOST change is irrelevant in my environment. I suspect what I’m looking for is either a workaround, or extending CMake support to better work in such a cluster environment.

@buildSystemPerson I’m using Unix Makefiles backend. The current version I’m using is 3.19.6.

Then the rebuild is correct (as neither make nor CMake has any idea that this flag is irrelevant). I would recommend turning off/commenting out whatever is giving you a BUILD_HOST setting in the first place.
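
A quick way to look for where that setting comes from is to search the project's CMake files for the name. A sketch, assuming a grep that supports --include (GNU and BSD grep do); the helper name is made up:

```shell
# Hypothetical helper: list every mention of BUILD_HOST in CMakeLists.txt
# and *.cmake files under a source tree (defaults to the current directory).
find_build_host() {
    src_dir=${1:-.}
    grep -rn --include=CMakeLists.txt --include='*.cmake' 'BUILD_HOST' "$src_dir"
}

# Example: find_build_host /path/to/source
```

If nothing turns up in the project itself, the definition may be coming from an included third-party module or from the wrapper that invokes cmake.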

OK, I just tested a completely trivial CMakeLists.txt that didn’t show this issue. I’ll have to work on creating a reduced test case.

Now that I look in flags.make in my trivial test I don’t see BUILD_HOST. I had assumed that cmake always records it.

Thanks, this gives me direction to debug what’s happening.

I would recommend using cmake --trace-expand to get a log of all the CMake code that is executed. The BUILD_HOST should show up in that log.
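
The trace is written to stderr and is long, so something along these lines might be convenient. It is only a sketch, and the helper name is hypothetical:

```shell
# Hypothetical helper: run a traced configure and keep only the lines
# that mention BUILD_HOST. Each trace line starts with the file and line
# number of the CMake command that was executed.
trace_build_host() {
    cmake --trace-expand "$@" 2>&1 | grep -n 'BUILD_HOST'
}

# Example: trace_build_host -S . -B build
```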