FetchContent cache variables

Leon0402 · July 10, 2020, 4:21pm

Hi everyone,

I wondered what the best method is to pass values for cache variables to a project included with FetchContent.

Say I do want shared libs and no testing for that lib and use some project specific option, I might configure my project like:
cmake -S . -B build -DBUILD_SHARED_LIBS=ON -DBUILD_TESTING=OFF -DSOME_PROJECT_OPTION=ON

There are a couple of issues with that approach:

- It might share variables with my project, for example BUILD_TESTING. I probably still want my own tests, just not the dependency's.
- It's annoying to pass all these variables via the command line, and it only makes things more difficult for other projects. I might include 10 subprojects and then have 100 variables to set on the first run.
- It's especially annoying for options where the value is definitely known. For example, SOME_PROJECT_OPTION might always be ON because of how I use the lib.

I could probably resolve the second issue by specifying those variables in my cmake script with some force magic, although I'm not entirely sure how (I think it could also be problematic that the behaviour of cmake cache variables has changed in recent versions). It might also still be annoying to need a lot of different lines to configure that lib.
The first one could be solved as well by setting the variable in the script and doing some restore magic. But that makes the cmake code really complicated.

The best approach would be to just have a command like target_compile_options() and use the name of the fetched lib. There is something similar for FetchContent_Declare called CMAKE_ARGS, which looks promising, but doesn't work.

Is there a way to do that more conveniently? Or is FetchContent just not suitable for bigger projects that need configuration?

sakeans · July 10, 2020, 10:03pm

Hi Leon, I had the same problem and did not find a way to pass variables to FetchContent_Declare. I think it is possible with ExternalProject_Add, but then the script will be longer and won't have the features of FetchContent. However, I noticed it is possible to set variables with another function provided by CPM (Setup-free CMake dependency management), which is supposed to be just a wrapper around FetchContent. See the project and its single file: https://github.com/TheLartians/CPM.cmake.

CPMAddPackage(
    NAME            fmt
    GIT_TAG         5173a76ba49936d252a85ee49b7eb96e3dff4033 # tag 7.0.0
    GIT_REPOSITORY  https://github.com/fmtlib/fmt.git
    OPTIONS
        "FMT_TEST OFF"
    )

I haven't yet understood from the code how they manage to pass the flags, FMT_TEST in this case, on to the FetchContent functions. It often feels that CMake is similar to the C++ standard library in that it provides essential functions and a language, but not finished scripts for easy use or ready-made real-world solutions. It's good that projects like CPM are trying to help.

Leon0402 · July 11, 2020, 2:38pm

That does indeed look nice!

I also figured that a good way to avoid problem 1 is to make one folder for each dependency. In this you could then set all variables without polluting global namespace.

hsattler · July 11, 2020, 2:48pm

You can also use a function to achieve the same.
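A minimal sketch of the function approach, assuming fmt as the dependency (repository and tag taken from the examples elsewhere in this thread). The helper name myproj_add_fmt is hypothetical. A function body has its own variable scope, so the set() calls do not leak into the rest of the project, while the targets created by the dependency are still globally visible.

```cmake
include(FetchContent)  # FetchContent_MakeAvailable() needs CMake 3.14 or later

FetchContent_Declare(
    fmt
    GIT_REPOSITORY https://github.com/fmtlib/fmt.git
    GIT_TAG        7.1.3
)

# Hypothetical helper: variables set inside a function are local to it and
# are gone once the function returns.
function(myproj_add_fmt)
    set(BUILD_SHARED_LIBS ON)
    # A normal variable only overrides an option() in the dependency if
    # policy CMP0077 is NEW at the point where the dependency calls option().
    set(FMT_TEST OFF)
    FetchContent_MakeAvailable(fmt)
endfunction()

myproj_add_fmt()

# BUILD_SHARED_LIBS and FMT_TEST keep whatever values they had before the
# call; the fmt targets can be linked to from anywhere in the project.
```

One caveat of this sketch: variables such as fmt_SOURCE_DIR that FetchContent_MakeAvailable() sets are also local to the function, so FetchContent_GetProperties(fmt) would be needed to query them afterwards.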

Leon0402 · July 27, 2020, 3:26pm

I've got a few other questions regarding FetchContent and dependency management in general.

The approach with a different scope (a directory for each dependency in my case) works quite well for, say, variables declared with option(…). It won't work, though, if the variable is declared with a plain set command; I believe this is due to the inconsistency between set(CACHE) and option. What is the best approach to overwrite a cache variable that is not declared as an option, but rather with the set command?
My second question is regarding folder structure. In Craig's book there is a recommendation to put dependency stuff in a subfolder dependencies, if I understood it correctly. There are two problems / things I wondered about:
- Should all FetchContent dependencies go in there? For example, say I want Catch2 in my project, which is only for testing. If I put it in the dependencies folder, I have to check again whether BUILD_TESTING is true (compared to putting the code in the test folder, where I already do that check).
- What about other dependencies, such as those pulled into the project with find_package()? The imported targets are not global, so putting them in the dependencies folder doesn't work. It seems kind of strange to have almost all dependencies in a clean structure in dependencies and then just a few dependencies in the top level cmake file or in the source folder somewhere.

hex · July 31, 2020, 5:26pm

Dependencies pulled in the project using FetchContent can be consumed with find_package after appending their location to the CMAKE_MODULE_PATH.

If your target has not been globally exported you can still globally import it in your project.

Leon0402 · July 31, 2020, 6:50pm

I’m sorry, but I don’t get the answer. Why would I want to consume dependencies with find_package pulled in by FetchContent?

Dependencies pulled in via FetchContent are already in my project; a simple FetchContent_MakeAvailable will do.

The problem was that if I use FetchContent and FetchContent_MakeAvailable all targets exposed by that are global (which is good!). So therefore I can have such a structure

dependencies
- CMakeLists.txt
- dependency 1
  - CMakeLists.txt
- dependency 2
  - CMakeLists.txt
src
- CMakeLists.txt
CMakeLists.txt

And all dependencies pulled in via FetchContent in the dependency folder can be used in the CMakeLists.txt.

But if I use find_package instead of FetchContent, I can't do that in the dependency folder, because targets exposed by find_package are just local or visible to subdirectories (so not global). Perhaps you could do something like making a global alias, but that seems weird to me.

So you have some dependencies nice and clean in the dependency folder, while some have to be in the top level folder or in the src folder. That doesn’t seem consistent to me.

craig.scott · July 31, 2020, 11:09pm

There’s quite a few questions in this thread, but hopefully my response here will at least provide some clarity around the advice my book offers.

The book recommends pulling in dependencies from their own directory scope. The pattern it gives is a simple add_subdirectory(dependencies) from the top level CMakeLists.txt file. One of the advantages the book mentions is that this ensures no non-cache variables used to set up the dependencies can bleed out to other parts of the build accidentally. If you are pulling in your dependencies with FetchContent, you can set variables just before you call FetchContent_MakeAvailable() to influence the dependencies, since they will see these variables. But you might not want the rest of your project to see those variables, they should be considered an implementation detail of the dependencies, if possible. Furthermore, some other project could be consuming yours and it might be pulling the dependency in earlier with a different set of variables, so you can’t rely on a dependency being pulled in exactly how your project sets it up. This might sound like a problem, but if this is what is happening, also remember that the consuming project takes on the responsibility for ensuring its child dependencies still build correctly if they override how that dependency is brought into the build. To maximise the chances of a consuming project being able to do that, the rest of your project should try to avoid relying on these variables and use only the targets that are provided by the dependency. If your project avoids referencing any variables used to configure the dependency or that the dependency provides and only uses its targets, that will give you the greatest robustness and flexibility for consumers.

Another reason for the dedicated dependencies directory is to ensure that all your FetchContent_Declare() calls are made before any calls to FetchContent_MakeAvailable() or FetchContent_Populate(). This is a critical part of what makes FetchContent effective. Always declare all the dependency details before starting to pull in any of them. The first place that calls FetchContent_Declare() for a particular dependency wins. If you delay calling FetchContent_Declare() until after some dependencies have been pulled in, then those dependencies may declare details for further dependencies before you do and then you are not in control of those further dependencies. For cases where you only conditionally need some dependencies, you can put the logic that makes that decision in the top level of the project (either an option() or some non-cache variable that you compute based on whatever conditions you have). You query that in the dependencies folder and you also query that in other parts of the project where needed. For example, the top level CMakeLists.txt file could define a variable MYPROJ_ENABLE_TESTS and in your dependencies folder, you only pull in Catch2 if that variable is set to true. Elsewhere in your project, you only add the tests or build targets related to those tests if MYPROJ_ENABLE_TESTS is true.
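As a rough sketch of the "declare everything first" pattern, a dependencies/CMakeLists.txt could look something like the following. It assumes the top level CMakeLists.txt has already defined MYPROJ_ENABLE_TESTS before calling add_subdirectory(dependencies). The choice of fmt and Catch2 and the tags shown are only illustrative.

```cmake
# dependencies/CMakeLists.txt (sketch)
include(FetchContent)  # FetchContent_MakeAvailable() needs CMake 3.14 or later

# Step 1: declare the details of *all* dependencies before populating any,
# so these declarations win over any made by the dependencies themselves.
FetchContent_Declare(
    fmt
    GIT_REPOSITORY https://github.com/fmtlib/fmt.git
    GIT_TAG        7.1.3
)
if(MYPROJ_ENABLE_TESTS)
    FetchContent_Declare(
        Catch2
        GIT_REPOSITORY https://github.com/catchorg/Catch2.git
        GIT_TAG        v2.13.4  # illustrative tag, use the one you need
    )
endif()

# Step 2: only now pull the dependencies into the build.
FetchContent_MakeAvailable(fmt)
if(MYPROJ_ENABLE_TESTS)
    FetchContent_MakeAvailable(Catch2)
endif()
```

The tests directory would then check the same MYPROJ_ENABLE_TESTS variable before adding any test targets.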

As noted earlier in the discussion thread, find_package() creates imported targets that only have local scope. If you call find_package() from within the dependencies directory scope, the rest of the project won’t see them by default. If your project does need them, then you have a few options:

- Call find_package() in your dependencies folder with all components defined that any part of your project may need. Then in those relevant parts of the project, call find_package() there as well, potentially with a reduced set of components if relevant. The first time find_package() is called in the dependencies folder, it ensures the dependency can be found and saves its location in the CMake cache. Subsequent calls to find_package() for the same package in other parts of the project will re-use the same location. The package's imported targets still remain local, but this approach is robust and should still be pretty efficient. Where dependencies are noisy and output a lot of detail to the log (which they shouldn't, but many do), it can be annoying seeing the same info in the log more than once, but that's a relatively minor annoyance in most cases.
- Call find_package() in your dependencies folder with all components defined that any part of your project may need. For each of the package's imported targets that you want to make visible to the rest of your project, promote those targets to global visibility by setting their IMPORTED_GLOBAL target property to true. Personally, I'd only use this in very specific cases where you are in full control of the dependency package and that package has no further dependencies of its own. I'd generally avoid this method if you can.
- Call find_package() at the top level scope of your project instead of within the dependencies directory scope. This would be a pragmatic choice if you decide you don't want to call find_package() multiple times for the same dependency for some reason. It could be appropriate, for example, if the call to find_package() is complicated with many options (which should hopefully be rare).
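To illustrate the first two options, here is a minimal sketch. It assumes an installed fmt package that provides the imported target fmt::fmt; the version number is just the one used elsewhere in this thread.

```cmake
# dependencies/CMakeLists.txt (sketch)

# Finds the package once and records its location in the CMake cache.
find_package(fmt 7.1 REQUIRED)

# Option 1: nothing more to do here. Other directories simply repeat
#     find_package(fmt 7.1 REQUIRED)
# and re-use the cached location.

# Option 2: promote the imported target to global visibility instead.
# The property can only be switched from FALSE to TRUE, and only in the
# directory scope that created the target (CMake 3.11 or later).
set_target_properties(fmt::fmt PROPERTIES IMPORTED_GLOBAL TRUE)
```

With option 2, a target in src/CMakeLists.txt can link to fmt::fmt without calling find_package() again, subject to the caveats mentioned above.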

The discussion thread in this post also asks how to influence dependencies that use cache variables instead of non-cache variables for configuring themselves. As @Leon0402 mentioned, the behavior of option() and set() can be different for a boolean cache variable, which is unfortunate (I didn’t push hard enough for set() to be updated when the behavior of the option() command was modified in CMake 3.13 unfortunately). This means that for any variable you want to set before pulling in a dependency, you need to understand how the dependency defines or uses that variable. If it defines it with a set(someVar someValue CACHE type ....) form, then your hands are tied. To ensure your preference is honored, you have to use set() with either INTERNAL or FORCE. I usually go for INTERNAL because I’m forcing an option and therefore the user no longer has a say, so it should no longer show up in the CMake GUI as a user-configurable option. I might do this when my project cannot work unless that option has a certain value, or in commercial software where all uses of projects can be well-defined and enforcing a consistent way of doing something makes sense. If the dependency project uses option() instead and the dependency project requires CMake 3.13 or later (more accurately, if policy CMP0077 will be NEW at the point the dependency calls option()), you can set just a regular non-cache variable in your project and not force any cache value. Otherwise, you have to use set() with INTERNAL or FORCE as for the set() command. If you can’t be sure how the dependency project defines the option, use set() with INTERNAL or FORCE.
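As a sketch of the two cases, with SOMEDEP_ENABLE_FOO and SOMEDEP_BACKEND being hypothetical variable names of some dependency (they are not taken from any real project):

```cmake
# These lines go just before the FetchContent_MakeAvailable() call that
# pulls in the (hypothetical) dependency.

# Case 1: the dependency defines the variable with option() and policy
# CMP0077 is NEW when it does so. A normal variable is enough, and the
# cache is left untouched.
set(SOMEDEP_ENABLE_FOO ON)

# Case 2: the dependency uses set(SOMEDEP_BACKEND <default> CACHE STRING ...),
# or you cannot be sure how it defines the variable. Force the cache value.
# INTERNAL implies FORCE and also hides the entry from the CMake GUI.
set(SOMEDEP_BACKEND "bar" CACHE INTERNAL "Forced by MyProj")
```

If the entry should stay visible in the GUI, set(SOMEDEP_BACKEND "bar" CACHE STRING "..." FORCE) would be the alternative form for the second case.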

Hope that clarifies some of your questions.

Leon0402 · August 1, 2020, 8:50am

Hi Craig,

Indeed, this clarifies some questions. Let me summarize a few things to make sure I got them right.

Statement 1
You advise putting all dependencies pulled in via FetchContent in a subdirectory called "dependencies", no matter whether they depend on some variable being true (as in the case of testing).

if(build_with_dep)
    FetchContent_Declare(someDep)
endif()

# ...
# other declares
# ...

if(build_with_dep)
    add_subdirectory(someDep)
endif()

So in other words, the advantages of having every dependency in one place, of calling FetchContent_Declare for all dependencies first, and of having a directory scope for all dependencies justify repeating yourself slightly.

Statement 2
To overwrite variables of a dependency pulled in via FetchContent, use a normal set where possible (e.g. for non-cache variables and option() cache variables); otherwise (e.g. for cache variables defined with set(... CACHE ...)), use set() with INTERNAL or FORCE.

Statement 3
(implicit, as I understand it) If you are a library author you should prefer option() over set().

Statement 4
Use find_package multiple times (the first time in the dependencies folder), or once in the top level file if the output is too noisy / the call is complicated …
Promoting to global scope should rather be avoided, but would be another possibility.

So I actually wonder about two things now (which go more in the direction of a feature request):

Wouldn't it be worth rethinking the behaviour of set()? Obviously you do want to use cache variables for configuration (because it should be possible to set variables via the cmake gui, for instance, if you build the project normally); on the other hand, this doesn't play very nicely with FetchContent.
-> What are the drawbacks? Why not a policy to control the behaviour?
Wouldn't it be a good idea to actually make find_package() global? As FetchContent targets are global as well, it seems inconsistent to me that this isn't the case for find_package() … your solution is not bad, but you still have to repeat yourself, and if dependencies change, you always have to do it in two places.

Thank you very much, as always, for your very detailed answer!

ClausKlein · February 7, 2021, 11:21pm

One note about my experience: CPM is faster and easy to handle because of its caching strategy:

# ---- Add dependencies via CPM ----
# see https://github.com/TheLartians/CPM.cmake for more info
option(CPM_USE_LOCAL_PACKAGES "enable find_package for all CPM dependencies" ON)

include(../CPM.cmake)

# CPMAddPackage(NAME PackageProject.cmake GITHUB_REPOSITORY TheLartians/PackageProject.cmake VERSION
# 1.4)
include(../PackageProject.cmake)
# PackageProject.cmake will be used to make our target installable

# to prevent CMake Error: install(EXPORT "GreeterTargets" ...) includes target "Greeter" which
# requires target "fmt" that is not in any export set.
option(USE_FETCH_CONTENT "to show the problem is not caused by CPM" OFF)
if(USE_FETCH_CONTENT)
    find_package(fmt 7.1)
    if(NOT TARGET fmt::fmt-header-only)
        include(FetchContent)
        FetchContent_Declare(
            fmt
            GIT_REPOSITORY https://github.com/fmtlib/fmt.git
            GIT_TAG 7.1.3
        )
        # NOTE: If fmt is not imported, we need to install it! CK
        option(FMT_INSTALL "" ON)
        FetchContent_MakeAvailable(fmt)
    endif()
else()
    CPMAddPackage(
        NAME fmt
        GIT_TAG 7.1.3
        GITHUB_REPOSITORY fmtlib/fmt
        OPTIONS "FMT_INSTALL ON"
    )
endif()

# ---- Create library using fmt ----

craig.scott · February 8, 2021, 12:23am

I’ve recently been working on the performance of FetchContent (see this issue). Expect to see some pretty significant improvements in CMake 3.20.0 (more than 10x speedup in some cases)

ClausKlein · February 8, 2021, 8:15am

Nice to hear. CPM uses FetchContent under the hood, so I'm looking forward to it. Thanks