configure-step dependencies

Here is a reduced example:

$ tree
.
├── cmake
│   └── foo.cmake
├── CMakeLists.txt
├── meow.py
└── woof.py

with the main CMakeLists.txt

cmake_minimum_required(VERSION 3.21)
project(execute)

set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/target/bin)

list(APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_LIST_DIR}/cmake)
include(foo)

add_program(meow meow.py)
add_program(woof woof.py)

and cmake/foo.cmake defines that add_program function:

function(add_program target file)
    set(directory ${CMAKE_CURRENT_BINARY_DIR}/${target}/generated)
    file(MAKE_DIRECTORY ${directory})

    execute_process(
        COMMAND python ${file}
        OUTPUT_FILE ${directory}/out.cxx
        WORKING_DIRECTORY ${CMAKE_CURRENT_LIST_DIR})

    set_property(
        DIRECTORY
        APPEND PROPERTY CMAKE_CONFIGURE_DEPENDS
        ${file})

    add_executable(${target} ${directory}/out.cxx)
endfunction()

where the two Python files are deliberately trivial. meow.py looks like this, and woof.py is identical except that it prints woof:

print("""
#include <iostream>

int main() {
    std::cout << "meow\\n";
}
""")

This works:

$ mkdir build && cd build
$ cmake ..
$ make -j30
$ ./target/bin/meow
meow
$ ./target/bin/woof
woof

So far so good. Now, when I change meow.py and rebuild, meow is recompiled, as expected. But woof is recompiled too:

$ touch ../meow.py
$ make -j30
-- Configuring done
-- Generating done
-- Build files have been written to: /home/brevzin/sandbox/execute_process/build
Consolidate compiler generated dependencies of target woof
Consolidate compiler generated dependencies of target meow
[ 25%] Building CXX object CMakeFiles/woof.dir/woof/generated/out.cxx.o
[ 50%] Building CXX object CMakeFiles/meow.dir/meow/generated/out.cxx.o
[ 75%] Linking CXX executable target/bin/woof
[100%] Linking CXX executable target/bin/meow
[100%] Built target woof
[100%] Built target meow

How do I make this only recompile meow?

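The execute_process call unconditionally overwrites out.cxx every time CMake configures. Touching meow.py triggers a reconfigure, which reruns add_program for both targets, so both out.cxx files get new timestamps and both targets rebuild. You want to instead do one of: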

Generate during the build

My preferred solution. Use add_custom_command to generate the file during the build. The command would list the Python script in its DEPENDS.
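For reference, a minimal sketch of what a build-time add_program could look like. add_custom_command has no OUTPUT_FILE option, so this assumes a small helper script run with cmake -P that wraps execute_process. The helper's name (run_to_file.cmake) and its SCRIPT and OUT variables are made up for illustration:

```cmake
# --- cmake/run_to_file.cmake (hypothetical helper, invoked with cmake -P) ---
# Runs the Python script and captures its stdout into OUT.
execute_process(
    COMMAND python ${SCRIPT}
    OUTPUT_FILE ${OUT}
    COMMAND_ERROR_IS_FATAL ANY)

# --- cmake/foo.cmake ---
function(add_program target file)
    set(directory ${CMAKE_CURRENT_BINARY_DIR}/${target}/generated)
    set(script ${CMAKE_CURRENT_SOURCE_DIR}/${file})
    file(MAKE_DIRECTORY ${directory})

    # Generate out.cxx at build time; the rule reruns only when the
    # Python script it depends on changes.
    add_custom_command(
        OUTPUT ${directory}/out.cxx
        COMMAND ${CMAKE_COMMAND}
                -DSCRIPT=${script}
                -DOUT=${directory}/out.cxx
                -P ${CMAKE_CURRENT_FUNCTION_LIST_DIR}/run_to_file.cmake
        DEPENDS ${script}
        VERBATIM)

    add_executable(${target} ${directory}/out.cxx)
endfunction()
```

With something like this, touching meow.py should only rerun meow's generation rule during the build, and the CMAKE_CONFIGURE_DEPENDS entry is no longer needed because nothing happens at configure time.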

Copy-if-different

Instead of writing directly to the source file during configure, write to a “temp” location, then use file(COPY_FILE ... ONLY_IF_DIFFERENT) to copy it to the actual source location.


Yeah changing:

execute_process(... OUTPUT_FILE out ...)

to

execute_process(... OUTPUT_FILE out-tmp ...)
file(COPY_FILE out-tmp out ONLY_IF_DIFFERENT)

does the trick.

This does still run all the execute_process commands though. Is there a way to only run the necessary one(s)?

(Note that I have over-reduced this example - the add_custom_command approach is not viable because in reality I’m generating a CMakeLists.txt that needs to be included as part of the build, which CMake does not allow me to do during the build phase).

If you have some way of knowing that the output is consistent with the input, sure. if (path1 IS_NEWER_THAN path2) exists if that helps.


Okay yep, between the two that’ll do it!

So I have a combination of:

if(${command_file} IS_NEWER_THAN ${output_file})
    execute_process(... OUTPUT_FILE ${output_file}.tmp ...)
    file(COPY_FILE ${output_file}.tmp ${output_file} ONLY_IF_DIFFERENT)
endif()

That seems to both (a) only execute the process if necessary and also then (b) only rebuild if necessary (so that touch meow.py does call meow.py again but won’t rebuild meow since its out.cxx hasn’t changed).
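One thing to watch in the combination above: when ONLY_IF_DIFFERENT skips the copy, ${output_file} keeps its old timestamp, so the script stays newer than it and the process reruns on every later configure. A possible variation, sketched here against the add_program function from the question, compares against the .tmp file instead, since that one is rewritten on every run. The variable names script, output and stamp are mine:

```cmake
function(add_program target file)
    set(directory ${CMAKE_CURRENT_BINARY_DIR}/${target}/generated)
    set(script ${CMAKE_CURRENT_SOURCE_DIR}/${file})
    set(output ${directory}/out.cxx)
    set(stamp ${output}.tmp)
    file(MAKE_DIRECTORY ${directory})

    # IS_NEWER_THAN is also true when either file is missing (or the
    # timestamps tie), so the first configure always runs the script.
    if("${script}" IS_NEWER_THAN "${stamp}")
        execute_process(
            COMMAND python ${script}
            OUTPUT_FILE ${stamp}
            RESULT_VARIABLE result)
        if(NOT result EQUAL 0)
            # Drop the partial output so the next configure retries.
            file(REMOVE ${stamp})
            message(FATAL_ERROR "${file} failed with: ${result}")
        endif()
        # Leaves ${output} (and its timestamp) alone if nothing changed.
        file(COPY_FILE ${stamp} ${output} ONLY_IF_DIFFERENT)
    endif()

    set_property(
        DIRECTORY
        APPEND PROPERTY CMAKE_CONFIGURE_DEPENDS
        ${file})

    add_executable(${target} ${output})
endfunction()
```

This only tracks the script itself. If the script reads other inputs, those would need their own IS_NEWER_THAN checks and CMAKE_CONFIGURE_DEPENDS entries.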

Seems like at least the only-re-execute-if-changed part should be a parameter on execute_process, something like:

execute_process(... DEPENDS ${command_file})

in the same way that DEPENDS works for add_custom_command.

execute_process is used to just run a command to query something and stuff it into a cache variable. It usually isn’t involved in the build graph at all. Other things to consider that aren’t handled:

  • the output file doesn’t exist yet
  • the output is a -o out flag, not a stdout capture

It seems to me that add_custom_command(IMMEDIATE) is probably a better foundation to build something like that from (but still seems very niche to me). I don’t know how many are generating configure code in such a complicated way.

Sure, I totally agree that add_custom_command would be a better solution for this (though probably like CONFIGURATION_TIME rather than IMMEDIATE).

The real problem I have is that I need to generate part of my build outside of cmake. Because cmake can only include other cmake files during configuration, this generation needs to happen during configuration. If cmake let me do that during build, then I would happily use add_custom_command but that’s not an option, so I’m left trying to… effectively implement a configuration-time add_custom_command.

I’m not sure how niche this is. Ultimately I’m asking cmake to handle dependencies properly during configuration time as well as during build time. In trying to figure out how to do this, I’ve run into quite a few other half-solutions to this problem that each have their own limitations (e.g. running ExternalProject_Add to emit that other cmake build then building that one appears to be a common solution, but then you can’t bring in cmake library dependencies from the rest of the project).

Well, CMake isn’t a build tool and really just knows how to construct DAGs, not execute them. Getting it Just Right™ will be tedious, but I suspect that this might be one time where committing generated files is better. The pattern there is usually to commit some version of the generated file(s), have add_custom_commands that generate new files into the build tree, another add_custom_command that detects whether the files are different, and an add_custom_target (to be run manually) that copies them back to the source tree.
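A rough sketch of that committed-files pattern, in case it helps. All names here are hypothetical (generate.py, generated/foo.cmake, the check_generated and update_generated targets), and it assumes the generator script takes the output path as its only argument:

```cmake
set(committed ${CMAKE_CURRENT_SOURCE_DIR}/generated/foo.cmake)
set(fresh ${CMAKE_CURRENT_BINARY_DIR}/generated/foo.cmake)

# Configure always uses the committed copy from the source tree.
include(${committed})

# Regenerate into the build tree whenever the generator changes.
add_custom_command(
    OUTPUT ${fresh}
    COMMAND ${CMAKE_COMMAND} -E make_directory ${CMAKE_CURRENT_BINARY_DIR}/generated
    COMMAND python ${CMAKE_CURRENT_SOURCE_DIR}/generate.py ${fresh}
    DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/generate.py
    VERBATIM)

# compare_files exits non-zero when the files differ, which fails the
# build and signals that the committed copy is stale.
add_custom_command(
    OUTPUT ${fresh}.checked
    COMMAND ${CMAKE_COMMAND} -E compare_files ${committed} ${fresh}
    COMMAND ${CMAKE_COMMAND} -E touch ${fresh}.checked
    DEPENDS ${committed} ${fresh}
    VERBATIM)
add_custom_target(check_generated ALL DEPENDS ${fresh}.checked)

# Run manually to copy the fresh file back into the source tree.
add_custom_target(update_generated
    COMMAND ${CMAKE_COMMAND} -E copy_if_different ${fresh} ${committed}
    DEPENDS ${fresh}
    VERBATIM)
```

After a failed check, building the update_generated target and committing the result would bring the source tree back in sync, and the next build reconfigures because the included file changed.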