Using a fetched cross-compiler toolchain

Hi, I have inherited a system on short notice in the midst of being cmakeified. For $reasons not a lot of explanation came with it, so I’m left to puzzle it out and finish it.

My first issue is this: we have an embedded toolchain being vendorized with FetchContent. OK, but how do I specify that toolchain when it doesn’t exist at configure time (or at least at the start of configure time)? Previously I’ve only used CMake with an already installed toolchain that I can specify on the cmake command line.

The toolchain only needs to exist at the point something wants to use it. That will most likely be the first project() call, which tests the compilers for the enabled languages (C and CXX by default if no languages are explicitly given in the project() command). If you are retrieving the toolchain via FetchContent before the first project() call, it should work okay.

I’m assuming you are also using a toolchain file? Is that toolchain file also downloaded via FetchContent or does it exist already before you run CMake for the first time? Both can work; the toolchain file follows the same rules as the toolchain itself. As long as it exists at the point something needs it (the first project() call), it should be fine. The toolchain file should point to the toolchain at its downloaded location. That does mean you will need to know where FetchContent will put it, but that is documented and predictable.

If you need further clarification, then perhaps if you can post your toolchain file and the contents of your CMakeLists.txt file up to the first project() call, that may make it easier to provide more guidance.

It sounds like maybe the real question should be “what is best practice to do …”, since I can change anything I need to. I’ll describe what I want in more detail, but first let me answer the questions.

  1. Ah hah. No, project() is called right after cmake_minimum_required() in the top level CMakeLists.txt file. It sounds like the add_subdirectory(build) call, which pulls in the cmake file with the FetchContent calls, should come before the project() call?

While I’m at it, I should probably rename that directory because build/ implies something different in the cmake community.

  2. I take it that is the answer to how you specify the toolchain on the command line: set a variable to a toolchain file rather than just to the compiler name. Makes a lot of sense.

It’s confusing because this project was using bazel, so there are toolchain files lying around for bazel. I didn’t find any for CMake, but the vendorized toolchain (esp-idf) looks like it has one. That’s presumably correct, but it won’t exist until after the FetchContent call. Is it best practice to just copy their toolchain file into our repo, or is there a way to set it after FetchContent? If the latter, then do we just create a CMake variable to identify what build we want to do?

What I really want to do is be able to build code for multiple platforms (to start with, esp32 and soon ARM), plus preferably on OSX and Linux so we can test as much as possible locally and with Jenkins without flashing the boards.

It seems best to vendorize the toolchains because I can pin them to a specific version and make the builds as reproducible as possible. But if the overall approach I’m describing is misguided, tell me the better way. It’s in my lap, so I can change whatever is needed.

I’ve seen in the CMake tests where project() is used as project( test NONE ) and then the language is enabled via enable_language( C ). I suppose downloading and installing a toolchain between the two could also work.
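A minimal sketch of that ordering, assuming the fetched archive unpacks with a bin/ directory containing the compiler. The dependency name, the URL and the compiler path are placeholders made up for illustration, and this has not been checked against a real cross toolchain. Note that if a toolchain file is given, it is still read at the project() call even with NONE, so only the compiler itself can arrive late this way.

```cmake
cmake_minimum_required(VERSION 3.14)

# No languages are enabled here, so no compiler is tested yet.
project(test NONE)

include(FetchContent)

# Hypothetical dependency name and placeholder URL.
FetchContent_Declare(vendor_toolchain
    URL https://example.com/vendor-toolchain.tar.gz
)
FetchContent_MakeAvailable(vendor_toolchain)

# Assumed layout of the unpacked archive; adjust to the real one.
set(CMAKE_C_COMPILER "${vendor_toolchain_SOURCE_DIR}/bin/arm-none-eabi-gcc")

# The compiler is only tested now, after the download has completed.
enable_language(C)
```

A real cross-compile would also need the usual settings such as CMAKE_SYSTEM_NAME (and often CMAKE_TRY_COMPILE_TARGET_TYPE), which this sketch leaves out.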

I use FetchContent in my CMake toolchain file to download the SDK. The only drawback is that CMake searches for ninja after including the toolchain file but FetchContent needs it. Specifying the ninja tool manually solved it.

It’s hard to recommend a “best practice” for this, since it depends on what your goals are. There are different approaches, such as that mentioned by @hsattler, but also others.

Toolchain files get called any time a language is enabled which wasn’t enabled before. Most of the time, this is just the first project() call, but it can come later via explicit enable_language() calls or another project() call which adds more languages. UPDATED: The first project() call will (also) read the toolchain file before any languages are enabled - it may have always been that way, I don’t recall. This is a minor detail though, it doesn’t change the rest of the guidance in my comments here.

The toolchain file also gets read for try_compile() builds, which are separate sub-builds isolated from your main build. You probably don’t want the try_compile() calls downloading their own separate copy of fetched contents, so you’d need to consider how to prevent that if you took the approach mentioned by @hsattler.
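One possible way to avoid the repeated download is to check the read-only IN_TRY_COMPILE global property inside the toolchain file and to forward the already-fetched location to the sub-builds through CMAKE_TRY_COMPILE_PLATFORM_VARIABLES. The following is only a sketch under those assumptions: the file itself, the vendor_sdk name, the URL, the VENDOR_SDK_ROOT variable and the compiler path are all hypothetical.

```cmake
# Hypothetical toolchain file; names, URL and paths are placeholders.
get_property(_in_try_compile GLOBAL PROPERTY IN_TRY_COMPILE)

if(NOT _in_try_compile)
    # Main build only: fetch the SDK. Repeated reads of this file in the
    # same build reuse the already populated content.
    include(FetchContent)
    FetchContent_Declare(vendor_sdk
        URL https://example.com/vendor-sdk.tar.gz
    )
    FetchContent_MakeAvailable(vendor_sdk)
    set(VENDOR_SDK_ROOT "${vendor_sdk_SOURCE_DIR}")
endif()

# Ask try_compile() to pass the SDK location into its sub-builds, so the
# branch above can be skipped there and VENDOR_SDK_ROOT is still defined.
list(APPEND CMAKE_TRY_COMPILE_PLATFORM_VARIABLES VENDOR_SDK_ROOT)
list(REMOVE_DUPLICATES CMAKE_TRY_COMPILE_PLATFORM_VARIABLES)

set(CMAKE_SYSTEM_NAME Generic)
set(CMAKE_C_COMPILER "${VENDOR_SDK_ROOT}/bin/arm-none-eabi-gcc")
```

Whether FetchContent behaves well when called from a toolchain file can depend on the generator and CMake version (see the ninja remark above), so treat this as a starting point rather than a tested recipe.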

You can put toolchain files in a remote repo and have FetchContent download that remote repo before the first project() call. The user can then select one of the downloaded toolchain files if they want to by passing the path to the toolchain file where it will be after download. This requires that you know where the toolchain file you want will be located, but if your remote repo has a flat structure and only has toolchains, it can be fairly straightforward. Overriding the default download location for the repo can also make it simpler. Here’s a slightly simplified example from the FetchContent section of my book (Professional CMake: A Practical Guide):

cmake_minimum_required(VERSION 3.14)

include(FetchContent)
FetchContent_Declare(CompanyXToolchains
    GIT_REPOSITORY ...
    GIT_TAG ...
    SOURCE_DIR ${CMAKE_BINARY_DIR}/toolchains
)
FetchContent_MakeAvailable(CompanyXToolchains)

project(MyProj)

You can then invoke cmake like so:

cmake -DCMAKE_TOOLCHAIN_FILE=toolchains/beta_cxx.cmake ...

Note how we didn’t specify the full absolute path to the toolchain file. CMake will look in both the source directory and build directory when given a relative path for the toolchain file. The FetchContent_Declare() call overrides the SOURCE_DIR to place the downloaded toolchains in a toolchains subdirectory below the top of the build tree, so our relative path will find the toolchain files with the simple relative path shown. You would typically only want to do this for situations where you know the subdirectory name you choose will never clash with other parts of the project which may try to do the same thing (think about hierarchical projects), such as in a company environment where the set of projects that may be involved in a build is well known and under your control.

An advantage of putting your toolchain files in a dedicated repo like this is that the user is still free to choose whether they want to actually use the supplied toolchains or not. If they want to try out a different toolchain file, they can specify that on their cmake command line instead of one of the toolchain files from the downloaded repo. They are still free to have their own toolchain file refer to things inside the downloaded toolchains repo. This might be handy when experimenting with a new or updated toolchain file.

I hope it’s alright to respond to this older email thread.

We have been trying to follow the above suggestion, since we have multiple repositories which we would like to build independently using a cross-compiler toolchain, and we would like to eliminate duplication of the toolchain files among those various repositories.

So I tried putting the FetchContent_Declare and FetchContent_MakeAvailable of the repo that contains the toolchain files above the first project() statement. However, FetchContent_MakeAvailable first creates its repo-subbuild directory, from which it will eventually download the targeted repo, and that repo-subbuild directory contains a CMakeLists.txt with a “project(repo-populate NONE)” command. This project command initiates loading the toolchain file, which is not yet downloaded.

Is there a way around this? Theoretically, that project does not list any languages, so CMake could choose to not load the toolchain. But according to the documentation of “project”, the toolchain file is loaded at least once.

Could it be the case that this behaviour was recently changed? Or is this just happening when using Visual Studio’s CMake? Is there still another thing that I need to do in order to have a repository with toolchain files being made available with FetchContent?

Unless you want to be able to build your toolchains repo as a standalone project (more on that shortly), you don’t need to put a project() command there. The main purpose of the toolchains repo is to be added to a parent project, and in that scenario you want the parent project’s top level CMakeLists.txt file to make the first call to project().

It may, however, still be useful to build the toolchains repo as a standalone project. When I use this pattern with my consulting clients, I set the toolchains repo up with some basic smoke tests to verify that the toolchain files work. In order to do that, you have to conditionally call project() in the toolchains repo’s CMakeLists.txt file only if the toolchains repo is the top level. The pattern looks something like this:

cmake_minimum_required(VERSION 3.21)

# Do setup things here common to both standalone and added-to-parent scenarios.
# Normally, I put that logic in a separate file and include it here.
include(some_standard_setup.cmake)

# Only call project() for standalone builds
if(CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR)
    project(CompanyXToolchains)
    # Add some smoke tests that will verify that the toolchain can build some simple targets
    add_subdirectory(src)
endif()

I would also add a CMakePresets.json file to the toolchains repo. It would define presets that exercise each of the toolchain files the repo provides. With that, it becomes relatively straightforward to add CI builds for the toolchains repo and to test them out locally.

Yes, I updated my previous comment to more accurately reflect when the toolchain file is loaded.

I’m not 100% certain if it was always like this, but definitely since CMake 3.24 we have things that rely on the toolchain file being loaded once at the first project() call before any languages are enabled. The CMAKE_PROJECT_TOP_LEVEL_INCLUDES variable specifically injects files after the toolchain file has been read, but before any languages are enabled. While I haven’t checked, I suspect earlier CMake versions always read the toolchain file when hitting the first project() call, but I have some vague recollection that there was something different with the MSVC toolchain in the past where maybe something was delayed until the first language was enabled. That might not be the toolchain file though, I could be thinking of something else. You’d have to go digging back through the git history or gitlab merge requests to know for sure (might be tricky to track it down).

I am afraid I did not sufficiently explain this. It is not me who puts a “project()” command there, but the execution of FetchContent_MakeAvailable creates a file structure, where it wants to create a _deps/toolchain-repo-src and a _deps/toolchain-repo-build, and it populates _deps/toolchain-repo-src by creating a _deps/toolchain-repo-subbuild directory with a CMakeLists.txt that downloads the content that is to be fetched. That _deps/toolchain-repo-subbuild/CMakeLists.txt contains the problematic “project(toolchain-repo-populate NONE)” command.

When using FetchContent_MakeAvailable in a parent project that wants to download toolchain-repo in order to use the toolchain files from that repo, I therefore run into the problem that just before toolchain-repo is downloaded, the _deps/toolchain-repo-subbuild/CMakeLists.txt already puts down the first “project()” invocation.

So I am still wondering:

How is this possible, when FetchContent itself puts in a project() command?

The sub-build that FetchContent creates does not use any toolchain file by design. It also does not enable any languages (that’s the effect of the NONE argument to project()), so no toolchain should actually be needed to process that sub-build. The sub-build only has to have a build tool available (ninja, make, or whatever the generator you’re using expects).

My earlier comment about the first project() call was referring to the first project() call within the main project, not the sub-build.

This does not work for me when I use Visual Studio 2022. However, when I use cmake from the command line, then it works. So apparently, the cmake in Visual Studio does things a little bit differently.

My hope is vanishing that I can get this to work in Visual Studio, but can you have a look to see if there’s something that I can change, so that it works in Visual Studio too? I’ll try to show in detail what I have and what the error is.

This is the top part of the top level CMakeLists.txt of amp-hal-st, the project that tries to use a repository to obtain the toolchain from:

cmake_minimum_required(VERSION 3.24)

if (CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR)
    set(HALST_STANDALONE On)
endif()

if (HALST_STANDALONE)
    include(FetchContent)

    FetchContent_Declare(
        emil
        GIT_REPOSITORY https://github.com/philips-software/embeddedinfralib.git
        GIT_TAG        modern-cmake
    )

    message("Before FetchContent_MakeAvailable")

    FetchContent_MakeAvailable(emil)

    message("After FetchContent_MakeAvailable")
endif()

This is a configurePreset from CMakePresets.json, which I selected in Visual Studio 2022. Its toolchainFile points into the soon-to-be downloaded repository that holds the toolchain file:

    {
      "name": "stm32wb55cg",
      "displayName": "stm32wb55cg",
      "description": "Build for stm32wb55cg",
      "inherits": "stm32",
      "toolchainFile": "${sourceDir}/build/stm32wb55cg/_deps/emil-src/cmake/toolchain-arm-gcc-m4-fpv4-sp-d16.cmake",

When I try to configure with that preset I run into this error:

1> Working directory: C:/DEV/Metronome/amp-hal-st/build/stm32wb55cg
1> [CMake] Before FetchContent_MakeAvailable
1> [CMake] CMake Error at C:/Program Files/Microsoft Visual Studio/2022/Community/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.24/Modules/CMakeDetermineSystem.cmake:130 (message):
1> [CMake]   Could not find toolchain file:
1> [CMake]   C:/DEV/Metronome/amp-hal-st/build/stm32wb55cg/_deps/emil-src/cmake/toolchain-arm-gcc-m4-fpv4-sp-d16.cmake
1> [CMake] Call Stack (most recent call first):
1> [CMake]   CMakeLists.txt:10 (project)
1> [CMake] CMake Error: CMake was unable to find a build program corresponding to "Ninja Multi-Config".  CMAKE_MAKE_PROGRAM is not set.  You probably need to select a different build tool.
1> [CMake] -- Configuring incomplete, errors occurred!
1> [CMake] 
1> [CMake] CMake Error at C:/Program Files/Microsoft Visual Studio/2022/Community/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.24/Modules/FetchContent.cmake:1589 (message):
1> [CMake]   CMake step for emil failed: 1
1> [CMake] Call Stack (most recent call first):
1> [CMake]   C:/Program Files/Microsoft Visual Studio/2022/Community/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.24/Modules/FetchContent.cmake:1741:EVAL:2 (__FetchContent_directPopulate)
1> [CMake]   C:/Program Files/Microsoft Visual Studio/2022/Community/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.24/Modules/FetchContent.cmake:1741 (cmake_language)
1> [CMake]   C:/Program Files/Microsoft Visual Studio/2022/Community/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.24/Modules/FetchContent.cmake:1955 (FetchContent_Populate)
1> [CMake]   CMakeLists.txt:18 (FetchContent_MakeAvailable)
1> [CMake] -- Configuring incomplete, errors occurred!

There are two errors here. First, it says that it cannot find the toolchain file. Second, it says that it is unable to find a build program for Ninja. However, that seems to be just follow-up damage; if I remove the generator in the configurePreset, then that second error disappears but the first error about the missing toolchain file remains.

After producing these errors, the amp-hal-st/build/stm32wb55cg directory looks like this:

.cmake
    <more files>
_deps
    emil-subbuild
        CMakeFiles
            <more files>
        CMakeCache.txt
        CMakeLists.txt
CMakeFiles
    <more files>

_deps/emil-subbuild/CMakeLists.txt contains the offending “project(emil-populate NONE)” call:

# Distributed under the OSI-approved BSD 3-Clause License.  See accompanying
# file Copyright.txt or https://cmake.org/licensing for details.

cmake_minimum_required(VERSION 3.24.202208181-MSVC_2)

# We name the project and the target for the ExternalProject_Add() call
# to something that will highlight to the user what we are working on if
# something goes wrong and an error message is produced.

project(emil-populate NONE)

That is what the error points to: line 10 of that file contains a “project” call, the toolchain file is necessary, that toolchain file is not yet present at _deps/emil-src/cmake because it hasn’t been downloaded yet, and then the configuration fails. The display of the message “Before FetchContent_MakeAvailable” and the absence of the message “After FetchContent_MakeAvailable” show that this error occurs during the FetchContent_MakeAvailable call itself.

Is there anything that I can configure so that Visual Studio is able to FetchContent the repo with the toolchain?

Edit:

Apparently as a new user I’m limited to only three posts in a subject, and I’ve only got a small note to add:

Thank you very much for your response, I’ve got it now working!

The workaround is completely acceptable for me.

Thanks for the additional information. I was able to reproduce your problem. It looks like a combination of Visual Studio doing something it shouldn’t, but also FetchContent not considering the general mechanism that Visual Studio (in my view erroneously) uses.

What is happening is that Visual Studio sets not just the CMAKE_TOOLCHAIN_FILE cache variable from the preset (which is good), it also sets the CMAKE_TOOLCHAIN_FILE environment variable (which is the problem here). The latter is unnecessary because it is ignored by the main build when the cache variable of the same name is already set. The problem is that in the sub-build, while FetchContent doesn’t pass down the cache variable, it also doesn’t expect the environment variable to be set. The sub-build then sees the environment variable and uses it to initialise the CMake variable.
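To check whether a given setup is affected, one option is to print both values just before the FetchContent_MakeAvailable() call. This is a small diagnostic sketch using only standard CMake commands; the wording of the messages is arbitrary.

```cmake
# Place just before FetchContent_MakeAvailable() in the top level CMakeLists.txt.
message(STATUS "CMAKE_TOOLCHAIN_FILE (CMake variable): '${CMAKE_TOOLCHAIN_FILE}'")

if(DEFINED ENV{CMAKE_TOOLCHAIN_FILE})
    # If this prints a path, the FetchContent sub-build will inherit it.
    message(STATUS "CMAKE_TOOLCHAIN_FILE (environment): '$ENV{CMAKE_TOOLCHAIN_FILE}'")
else()
    message(STATUS "CMAKE_TOOLCHAIN_FILE is not set in the environment")
endif()
```

If the environment line shows the same path as the preset's toolchainFile when configuring from Visual Studio, but is absent when running cmake from a plain command line, that matches the behaviour described here.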

I’ll report upstream to Visual Studio about the questionable environment variable being set, but there will also need to be a change to FetchContent to make it force using no toolchain file regardless of the environment variable being set or not.

A workaround in the meantime is to explicitly set the CMAKE_TOOLCHAIN_FILE environment variable in your CMakePresets.json file to an empty string.
