add_custom_command with target dependency doesn't rebuild

Here’s a small reproduction:

CMakeLists.txt:

cmake_minimum_required(VERSION 3.21)
project(cc)

add_custom_command(OUTPUT inter.dat
    COMMAND cp ${CMAKE_CURRENT_LIST_DIR}/num inter.dat
    DEPENDS ${CMAKE_CURRENT_LIST_DIR}/num
    )

add_custom_target(inter-target DEPENDS inter.dat)

add_custom_command(OUTPUT meow.cxx
    COMMAND python3 ${CMAKE_CURRENT_LIST_DIR}/gen.py --src inter.dat > meow.cxx
    DEPENDS inter-target
    )

add_executable(meow meow.cxx)

gen.py:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--src', type=argparse.FileType('r'))
args = parser.parse_args()

num = args.src.read().strip()
print(f"""#include <iostream>

int main() {{
std::cout << {num} << std::endl;
}}""")

And num just contains 1.

With that:

$ cmake ..
-- The C compiler identification is GNU 8.3.1
-- The CXX compiler identification is GNU 11.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/lib64/ccache/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/lib64/ccache/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/brevzin/sandbox/custom_command/build

$ make && ./meow
[ 25%] Generating inter.dat
[ 25%] Built target inter-target
[ 50%] Generating meow.cxx
[ 75%] Building CXX object CMakeFiles/meow.dir/meow.cxx.o
[100%] Linking CXX executable meow
[100%] Built target meow
1
$ echo 2 > ../num
$ make && ./meow
[ 25%] Generating inter.dat
[ 25%] Built target inter-target
Consolidate compiler generated dependencies of target meow
[100%] Built target meow
1
$ cat inter.dat
2

The rule for meow.cxx DEPENDS on inter-target, which DEPENDS on inter.dat. Here, inter.dat did get rebuilt, but meow.cxx did not: the program still prints 1.

The docs say that:

If the argument is the name of a target […] a target-level dependency is created to make sure the target is built before any target using this custom command. Additionally, if the target is an executable or library, a file-level dependency is created to cause the custom command to re-run whenever the target is recompiled.

But then the docs also say:

Do not list the output in more than one independent target that may build in parallel or the two instances of the rule may conflict (instead use the add_custom_target() command to drive the command and make the other targets depend on that one)

So if I want things to rebuild properly, I need to use the filename, because a target doesn’t trigger a rebuild. But if I want things to build in parallel at all, I need to use a custom target, because otherwise multiple rules conflict.

How… am I supposed to get this to work?

Is the inter-target actually used for something else? Typically I try to keep a chain of custom commands and only “cap” it at the end with a custom target. The docs are consistent with this; I suspect that an “order only” dependency is used in Make and Ninja generators (not sure what IDEs use) which means that rerunning the intermediate doesn’t necessarily trigger downstream bits as needing to rerun. I don’t know that there’s a mechanism for custom targets to work in this way right now.

Cc: @brad.king

From the docs quoted above:

if the target is an executable or library, a file-level dependency is created

DEPENDS inter-target will not add a file-level dependency because inter-target is a custom target, not an executable or library. Only the target-level ordering dependency is added, but that just means that if the custom command does run, inter-target will be up-to-date.

Try making the second custom command depend on both the target and the file: DEPENDS inter-target inter.dat. That will cause the meow target to get an ordering dependency on inter-target, so the first custom command producing inter.dat will only be attached to inter-target rather than duplicated. The file-level dependency will cause the latter custom command to rebuild when needed.
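To make the difference between an ordering dependency and a file-level dependency concrete, here is a rough sketch in Python. It is a toy model, not how CMake or make is implemented: files carry made-up logical timestamps, and the function names (outdated, build, simulate) are invented for illustration. It only assumes that a rule reruns when one of its file-level dependencies is newer than its output.

```python
# Toy model, not CMake or make itself: files carry logical timestamps,
# and a rule reruns only when a file-level dependency is newer than its output.

def outdated(output, file_deps, mtime):
    """Return True if the output is missing or older than any file-level dependency."""
    if output not in mtime:
        return True
    return any(mtime[dep] > mtime[output] for dep in file_deps)


def build(clock, mtime, meow_file_deps):
    """Run one build pass; return the new clock and the rules that actually ran."""
    ran = []
    # The ordering dependency on inter-target brings inter.dat up to date first.
    if outdated("inter.dat", ["num"], mtime):
        clock += 1
        mtime["inter.dat"] = clock
        ran.append("inter.dat")
    # meow.cxx is only compared against its file-level dependencies.
    if outdated("meow.cxx", meow_file_deps, mtime):
        clock += 1
        mtime["meow.cxx"] = clock
        ran.append("meow.cxx")
    return clock, ran


def simulate(meow_file_deps):
    """Do a first build, modify num, rebuild, and return what the rebuild ran."""
    mtime = {"num": 1}
    clock, _ = build(1, mtime, meow_file_deps)
    clock += 1
    mtime["num"] = clock  # corresponds to: echo 2 > ../num
    _, ran = build(clock, mtime, meow_file_deps)
    return ran


# DEPENDS inter-target: ordering only, no file-level dependency on inter.dat.
print(simulate([]))             # ['inter.dat']
# DEPENDS inter-target inter.dat: ordering plus a file-level dependency.
print(simulate(["inter.dat"]))  # ['inter.dat', 'meow.cxx']
```

In the first case nothing ever compares meow.cxx against inter.dat, which matches the stale output in the original reproduction.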

Confirmed that this does work: it runs the initial add_custom_command command only once (unlike DEPENDS inter.dat, which runs the command for every dependency) and actually reruns downstream targets when the upstream changes (unlike DEPENDS inter-target, which does not actually rebuild anything).

However, this really seems like a cmake bug. When I write something like:

add_custom_command(OUTPUT x ...)
add_custom_command(OUTPUT y DEPENDS x ...)
add_custom_command(OUTPUT z DEPENDS x ...)

That really looks like I’m expressing the idea correctly. I need to generate x, and use the output of x to generate y and z. But that doesn’t work. Instead I have to do this extra dance:

add_custom_command(OUTPUT x ...)
add_custom_target(x-target DEPENDS x)
add_custom_command(OUTPUT y DEPENDS x x-target ...)
add_custom_command(OUTPUT z DEPENDS x x-target ...)

which seems like the thing I would always want to do in this scenario, since anything else is simply wrong (it either is broken for parallel builds or doesn’t actually rebuild everything that needs to be rebuilt).

I would really like to stress that (a) the docs explicitly state to use a custom_target for this, without pointing out that this does not work and (b) the docs do not actually say how to solve this problem.

Moreover, the docs state that

In makefile terms this creates a new target in the following form:

OUTPUT: MAIN_DEPENDENCY DEPENDS
        COMMAND

Which sure makes it look like my example above with the three add_custom_commands and no add_custom_target would generate:

x : 
    x-rule

y : x
    y-rule

z : x
    z-rule

That’s precisely the set of rules I’m trying to generate. It’s just that something about the Makefile generator causes x-rule to be run twice. The Ninja generator actually works with just DEPENDS x; it does not need DEPENDS x x-target.

That is correct and does work, if all the outputs go in the same consuming target, and does build them in parallel.

Once you want to start assigning different outputs to different targets, it becomes your responsibility to add them to targets whose ordering dependencies match the build dependencies.

It does not work, at least not for the Makefile generator. Let me demonstrate by extending the above example slightly:

cmake_minimum_required(VERSION 3.21)
project(cc)

add_custom_command(OUTPUT inter.dat
    COMMAND python3 cp.py > ${CMAKE_BINARY_DIR}/inter.dat
    DEPENDS ${CMAKE_CURRENT_LIST_DIR}/num
    WORKING_DIRECTORY ${CMAKE_CURRENT_LIST_DIR}
    )

#add_custom_target(inter-target DEPENDS inter.dat)

foreach(i RANGE 1 10)
    add_custom_command(OUTPUT meow-${i}.cxx
        COMMAND
            python3 ${CMAKE_CURRENT_LIST_DIR}/gen.py --src inter.dat > meow-${i}.cxx
        DEPENDS inter.dat
        )

    add_executable(meow-${i} meow-${i}.cxx)
endforeach()

Here, I have 10 programs meow-N, each built from a corresponding generated meow-N.cxx, which is itself built from a single global generated inter.dat, which is generated from num. This is similar to what I pasted originally, except now I’m depending on inter.dat (the OUTPUT) instead of inter-target (the custom_target). Also, instead of cp I’m running a script cp.py, which does the same copy but also prints that it’s running; no surprises.
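The thread does not show cp.py, so the following is only a guess at its core. The function name copy_with_notice is made up. Writing the notice to stderr is an assumption, made because the command redirects stdout into inter.dat while "Running cp.py" still shows up in the terminal.

```python
def copy_with_notice(src_path, out, err):
    """Announce the run on err, then copy the file's contents to out."""
    # Assumption: the notice goes to err (stderr) so that it does not end up
    # in the output, which the CMake command redirects into inter.dat.
    print("Running cp.py", file=err)
    with open(src_path) as src:
        out.write(src.read())
```

As a script it would presumably end with something like copy_with_notice("num", sys.stdout, sys.stderr), relying on WORKING_DIRECTORY to resolve the relative path.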

When I build this, this happens:

$ cmake ..
-- The C compiler identification is GNU 8.3.1
-- The CXX compiler identification is GNU 11.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/lib64/ccache/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/lib64/ccache/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/brevzin/sandbox/custom_command/build

$ make -j30
[  2%] Generating inter.dat
[  5%] Generating inter.dat
[  7%] Generating inter.dat
[ 12%] Generating inter.dat
[ 12%] Generating inter.dat
[ 25%] Generating inter.dat
[ 25%] Generating inter.dat
[ 25%] Generating inter.dat
[ 25%] Generating inter.dat
[ 25%] Generating inter.dat
Running cp.py
Running cp.py
Running cp.py
Running cp.py
Running cp.py
Running cp.py
Running cp.py
Running cp.py
Running cp.py
Running cp.py
[ 32%] Generating meow-9.cxx
[ 32%] Generating meow-6.cxx
[ 32%] Generating meow-10.cxx
[ 35%] Generating meow-1.cxx
[ 37%] Generating meow-2.cxx
[ 45%] Generating meow-8.cxx
[ 50%] Generating meow-5.cxx
[ 50%] Generating meow-3.cxx
[ 50%] Generating meow-4.cxx
[ 50%] Generating meow-7.cxx
[ 57%] Building CXX object CMakeFiles/meow-9.dir/meow-9.cxx.o
[ 57%] Building CXX object CMakeFiles/meow-10.dir/meow-10.cxx.o
[ 60%] Building CXX object CMakeFiles/meow-6.dir/meow-6.cxx.o
[ 60%] Building CXX object CMakeFiles/meow-2.dir/meow-2.cxx.o
[ 62%] Building CXX object CMakeFiles/meow-1.dir/meow-1.cxx.o
[ 70%] Building CXX object CMakeFiles/meow-4.dir/meow-4.cxx.o
[ 75%] Building CXX object CMakeFiles/meow-5.dir/meow-5.cxx.o
[ 75%] Building CXX object CMakeFiles/meow-7.dir/meow-7.cxx.o
[ 75%] Building CXX object CMakeFiles/meow-8.dir/meow-8.cxx.o
[ 75%] Building CXX object CMakeFiles/meow-3.dir/meow-3.cxx.o
[ 87%] Linking CXX executable meow-10
[ 87%] Linking CXX executable meow-6
[ 87%] Linking CXX executable meow-2
[ 87%] Linking CXX executable meow-1
[ 87%] Linking CXX executable meow-9
[100%] Linking CXX executable meow-7
[100%] Linking CXX executable meow-4
[100%] Linking CXX executable meow-5
[100%] Linking CXX executable meow-8
[100%] Linking CXX executable meow-3
[100%] Built target meow-9
[100%] Built target meow-1
[100%] Built target meow-2
[100%] Built target meow-6
[100%] Built target meow-10
[100%] Built target meow-8
[100%] Built target meow-7
[100%] Built target meow-4
[100%] Built target meow-5
[100%] Built target meow-3

My file here, inter.dat, is generated 10 times, in parallel. In this particular example, because the file contains a single byte, this works out fine. But if the file was much longer and the generation step took longer, then I would likely have different parts of my builds reading incomplete files at different times.

That is very broken.

With Ninja, however, this does work. The file is only generated one time, not ten:

$ cmake .. -G Ninja
-- The C compiler identification is GNU 8.3.1
-- The CXX compiler identification is GNU 11.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/lib64/ccache/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/lib64/ccache/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/brevzin/sandbox/custom_command/build

$ ninja
[1/31] Generating inter.dat
Running cp.py
[31/31] Linking CXX executable meow-1

I am though. meow-N depends on meow-N.cxx depends on inter.dat depends on num. Those dependencies are very clearly spelled out in the CMake above, in a way that’s a fairly direct translation of this Makefile:

inter.dat : num
    python3 cp.py > inter.dat

meow-1.cxx : inter.dat
    python3 gen.py --src inter.dat > $@

meow-1 : meow-1.cxx
    g++ $^ -o $@

meow-2.cxx : inter.dat
    python3 gen.py --src inter.dat > $@

meow-2 : meow-2.cxx
    g++ $^ -o $@

# repeated 8 more times

With Make, the dependencies track right and the rule for generating inter.dat is run exactly once.

The problem is that, with the CMake version, the rule for generating inter.dat is run lots of times with the Makefile generator.

From my previous post:

That is correct and does work, if all the outputs go in the same consuming target

Your example has multiple separate consuming targets.

With Ninja, however, this does work

Ninja has a single monolithic build graph. The other generators don’t, and so custom commands need to each be assigned to the build systems for specific targets. No single make invocation sees the whole graph. Custom commands are assigned to targets based on what targets consume their output. Custom commands for transitive dependencies go into each least-dependent target that needs them. Without a common target dependency to contain the custom command for the common dependency, it gets duplicated. That’s why you need to manually create an extra target and explicitly add dependencies on it.
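As a rough illustration of that assignment rule, here is a toy model in Python. It is a simplification under stated assumptions, not CMake's actual algorithm: each target gets every custom command reachable from its own files, minus anything already provided by a target it depends on. The names needed_commands, assign, copies, separate and shared are all invented for this sketch, and three executables stand in for the ten above.

```python
def needed_commands(files, commands):
    """Collect every custom-command output reachable from the given files."""
    found, stack = set(), list(files)
    while stack:
        f = stack.pop()
        if f in commands and f not in found:
            found.add(f)
            stack.extend(commands[f])
    return found


def assign(targets, commands):
    """Map each target to the custom commands emitted into its own build rules."""
    def closure(name):
        # Everything this target and its target-level dependencies provide.
        own = needed_commands(targets[name]["files"], commands)
        for dep in targets[name]["deps"]:
            own |= closure(dep)
        return own

    assigned = {}
    for name, info in targets.items():
        inherited = set()
        for dep in info["deps"]:
            inherited |= closure(dep)
        assigned[name] = needed_commands(info["files"], commands) - inherited
    return assigned


def copies(assigned, output):
    """Count how many targets carry their own copy of the rule for output."""
    return sum(output in cmds for cmds in assigned.values())


# Custom commands: output -> file-level dependencies.
commands = {"inter.dat": ["num"]}
for i in (1, 2, 3):
    commands[f"meow-{i}.cxx"] = ["inter.dat"]

# No common target: every executable pulls in the inter.dat rule itself.
separate = {f"meow-{i}": {"files": [f"meow-{i}.cxx"], "deps": []} for i in (1, 2, 3)}

# A common custom target, with each executable depending on it.
shared = {"inter-target": {"files": ["inter.dat"], "deps": []}}
for i in (1, 2, 3):
    shared[f"meow-{i}"] = {"files": [f"meow-{i}.cxx"], "deps": ["inter-target"]}

print(copies(assign(separate, commands), "inter.dat"))  # 3
print(copies(assign(shared, commands), "inter.dat"))    # 1
```

In this model the extra target adds no new file-level information; it only gives the inter.dat rule one place to live, which is why the duplicate runs disappear.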

Everything is working as designed. CMake MR 8002 updates the documentation to add an example for this use case.

Can you just… fix the design?

CMake has enough information to simply do the right thing. I could even implement my own version of add_custom_command() that:

  1. Creates a new custom target that DEPENDS on all the outputs.
  2. Attaches the name of that target as a property on all those outputs.
  3. Goes through all the DEPENDS and, if any of them have this property attached to them, add that property as a dependency to the whole thing as well.
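The bookkeeping behind those three steps can be sketched in Python. This models the idea only; it is not a CMake implementation. The class name CommandRegistry and the "gen-" target naming scheme are hypothetical.

```python
class CommandRegistry:
    """Toy bookkeeping for the three steps; all names here are hypothetical."""

    def __init__(self):
        self.owner = {}    # output file -> name of the generated custom target
        self.targets = {}  # generated custom target -> outputs it depends on

    def add_custom_command(self, outputs, depends):
        """Register a command and return the DEPENDS list it would really use."""
        # Step 1: create a custom target that depends on all the outputs.
        target = "gen-" + outputs[0]
        self.targets[target] = list(outputs)
        # Step 3: for each dependency with an owning target, depend on that too.
        effective = list(depends)
        for dep in depends:
            owning = self.owner.get(dep)
            if owning is not None and owning not in effective:
                effective.append(owning)
        # Step 2: record the owning target on every output.
        for out in outputs:
            self.owner[out] = target
        return effective


registry = CommandRegistry()
print(registry.add_custom_command(["inter.dat"], ["num"]))
# ['num']
print(registry.add_custom_command(["meow-1.cxx"], ["inter.dat"]))
# ['inter.dat', 'gen-inter.dat']
```

The second call ends up with both the file and its owning target, which is the same shape as the DEPENDS x x-target workaround written by hand earlier.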

I see no reason why CMake couldn’t just do the right thing for users if the generator doesn’t support this natively.

As the linked request states:

This is a common use case that requires care to express correctly.

I am very much requesting that it simply not require care to express correctly. While I appreciate that this is at least documented now, it would be significantly better for users if the expected thing just worked.

Having now seen the example, I don’t even understand why that’s the solution to the problem. It just looks wrong (I’ll expand on this in a bit).

Extending my example:

  add_custom_command(OUTPUT inter.dat
      COMMAND python3 cp.py > ${CMAKE_BINARY_DIR}/inter.dat
      DEPENDS ${CMAKE_CURRENT_LIST_DIR}/num
      WORKING_DIRECTORY ${CMAKE_CURRENT_LIST_DIR}
      )

+ add_custom_target(inter-target DEPENDS inter.dat)

  foreach(i RANGE 1 10)
      add_custom_command(OUTPUT meow-${i}.cxx
          COMMAND python3 ${CMAKE_CURRENT_LIST_DIR}/gen.py --src inter.dat > meow-${i}.cxx
          DEPENDS inter.dat
          )

      add_executable(meow-${i} meow-${i}.cxx)
+     add_dependencies(meow-${i} inter-target)
  endforeach()

This now works. It (1) builds inter.dat exactly once and (2) if I update num, all the meow-N.cxxs get rebuilt and all the meow-Ns get rebuilt as desired. Perfect.

But what is the mental model for why this works? It looks like the graph here is (sorry, I don’t know how else to express this):

  meow-1 --> meow-1.cxx
  meow-1.cxx --> inter.dat
  inter.dat --> num
  inter-target --> inter.dat
+ meow-1 --> inter-target

The meow-1 --> inter.dat dependency is already expressed in the code (meow-1 --> meow-1.cxx --> inter.dat), so it’s unclear to me what this added extra dependency buys or how it solves the problem.

It also seems like the wrong layer of the graph to add a dependency, since really it’s meow-1.cxx that depends on that target, not meow-1: the target needs to have been built in order to generate this source file. This is a common mistake people make with Makefiles, where they express the rules a : b and a : c when they really also needed b : c.

I guess from your description it unifies the underlying make graphs somehow, but I wouldn’t know that this is necessary without having a deep understanding of cmake internals.

It really just gives it a single home. Without these explicit links, CMake prefers to “add the command where needed” so that make meow-N works. The Makefiles generator needs to be reworked to have a single unified view of the build graph at some point. Until then, this kind of difference is just a legacy of old CMake design decisions (here, the ability to make in any arbitrary directory preferred over a single build graph) made without full knowledge of the future.