Good way of running custom commands serially

AnthonyD973 · June 27, 2023, 1:25am

TL;DR: I have add_custom_command()s that call git clang-format. They run as part of a target (e.g. library) build. The commands don’t create or modify any files, but they can’t run concurrently. How can I enforce that only one runs at a time?

I’m writing a CMake module which runs git clang-format on the sources of user-specified targets. One way that git clang-format is used is during the build of the target (it only checks the formatting; doesn’t create/modify anything).

The problem is that git, for very valid reasons, does not like to have multiple processes running on the same repo, and therefore errors out. So I need to run each command serially.

I thought about creating an add_custom_target() for each custom command, and making a chain of DEPENDS between them. But that would not work: not all custom commands should always be run, since the user-specified targets don’t always build (e.g. if lib1’s sources change but not lib2’s, only the checks of lib1 should run).

I also noticed the JOB_POOL option of add_custom_command(), but that is Ninja-only, whereas obviously I’d want the CMake module to work with multiple generators.

I’ve seen this thread, but the solution would not apply to this scenario.

So what good way is there of doing this? Or should I just run git clang-format differently?

FWIW, this is the CMake module in question.

jtxa · June 29, 2023, 11:28pm

CMake generates build files, the commands are executed by the build tool. If that tool is incapable of avoiding parallelization, I guess there’s nothing CMake can do.
Either the user has to call the build without parallelization, or you need to wrap your command with some semaphore. Like flock on Linux.

But the other question is: why can’t you execute it in parallel? For read-only actions the commands should not fail; otherwise I would expect some highlighted warning in the documentation.

AnthonyD973 · June 30, 2023, 11:34am

Yeah, running the command that way seems like an appropriate solution. I’ll give it a try; thanks!

As for why the commands can’t be run in parallel given they are read-only, I’m not sure, but since git clang-format is an extension of Git’s CLI (a Python script called git-clang-format), it’s possible that the script doesn’t indicate that it is read-only, or that Git just doesn’t support having multiple processes running on the same repo.

fdk17 · July 1, 2023, 8:25pm

When I use add_custom_command with multiple commands, they all run serially, as described in the CMake manual. Or you could have a script that runs all the desired commands, which is also described in the add_custom_command section.

jtxa · July 1, 2023, 8:42pm

That has the same drawback as adding dependencies. I guess you missed that sentence:

The execution shall be independent from each other, but if multiple of them are called, they shall not be in parallel. JOB_POOL would be the right solution, but it just works for Ninja.

ben.boeckel · July 1, 2023, 8:56pm

Job pools are the only reliable mechanism here, but that is ninja-only.

However, I would caution against such a target being on in non-developer builds as clang-format is not stable over time and someone with clang-format-6 may have problems if clang-format-10 has made a pass over it. These kinds of things are (IMO) best done in CI or some other central place prior to merging updates rather than as part of the build or as a commit hook.

AnthonyD973 · July 1, 2023, 9:44pm

Thanks for the help! From all the replies, I think running the command as part of a file-locking mechanism (like flock) would be the only viable solution. Platform compatibility is likely to be an issue (not sure if a readily-available equivalent would exist on Windows), but at least there is a solution.
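To make the lock-wrapper idea concrete, here is a minimal sketch of a portable alternative to flock, written in Python since git-clang-format already requires a Python interpreter. It is an illustration under assumptions rather than a tested solution: the function name run_locked and the helper names are hypothetical, the POSIX branch uses fcntl.flock, and the Windows branch uses msvcrt.locking, which I have not verified on a real Windows build machine.

```python
import os
import subprocess
import sys


def _acquire(lock_file):
    """Block until an exclusive lock on lock_file is held."""
    if os.name == "nt":
        import msvcrt
        while True:
            try:
                # Lock one byte at offset 0. LK_LOCK retries for roughly
                # 10 seconds and then raises OSError, so keep waiting.
                lock_file.seek(0)
                msvcrt.locking(lock_file.fileno(), msvcrt.LK_LOCK, 1)
                return
            except OSError:
                continue
    else:
        import fcntl
        # Blocks until no other process holds the lock.
        fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)


def _release(lock_file):
    """Release the lock taken by _acquire."""
    if os.name == "nt":
        import msvcrt
        lock_file.seek(0)
        msvcrt.locking(lock_file.fileno(), msvcrt.LK_UNLCK, 1)
    else:
        import fcntl
        fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)


def run_locked(lock_path, argv):
    """Run argv while holding an exclusive lock on lock_path.

    Returns the exit code of the command.
    """
    # The lock file's contents are irrelevant; it only needs to exist.
    with open(lock_path, "a+") as lock_file:
        _acquire(lock_file)
        try:
            return subprocess.call(argv)
        finally:
            _release(lock_file)


# Command-line use: <script> <lock-file> <command> [args...]
if __name__ == "__main__" and len(sys.argv) >= 3:
    sys.exit(run_locked(sys.argv[1], sys.argv[2:]))
```

Saved as, say, run_locked.py (a hypothetical name), each custom command's COMMAND could then invoke the Python interpreter on this script with a shared lock file in the build directory, followed by the git clang-format command line. Every check would still be triggered independently by its own target, but only one would hold the lock at any given time.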

Letting myself get sidetracked here: yes, for example, I’ve seen this post you’ve made about this specific point. In fact, that’s exactly why my CMake module adds support for git-clang-format, rather than only plain clang-format. I plan to introduce this to an existing codebase that isn’t formatted properly, and using normal clang-format would require destroying version control history, even if we put the format checks only on the CI. Although, when using git-clang-format in production, I’m not sure whether automatic fixes might introduce bugs in the code or not, with the whole “consider only the modified lines” thing. I would guess not, but git-clang-format doesn’t seem too widely used, so I wouldn’t trust it that easily.

The issue here is that, unlike normal clang-format, git-clang-format relies on git, which breaks when multiple processes are run simultaneously.

ben.boeckel · July 1, 2023, 9:48pm

We have new tools at our disposal:

git blame --ignore-revs-file=
git config blame.ignoreRevsFile to make it more permanent

Then you can --author="clang format <nobody@blameless>" the commit that does the mass reformatting.

Personally, I find mass reformatting and enforcement better than “fixed a typo (and reformatted the file)” style diffs over a longer period of time.