(hear me out) FetchContent or ExternalProject_Add without a download?

Rich_von_Lehe · September 20, 2022, 9:23pm

I have to preface my question with a brief explanation of why I’d even ask this because I know what I’m asking about is far from idiomatic use of CMake.

I am limited to a single repository for a work project. This repository has multiple distinct projects within it but these are all built using a single CMake mega-project. The gymnastics involved in maintaining this setup have become unbearable and my proposal to break things into separate repos and manage dependencies with FetchContent was unfortunately not welcomed.

So my question is: Can I use FetchContent (or ExternalProject) without using external URLs? My goal would still be to have distinct top-level CMakeLists.txt files for each logically distinct project within the repo, but it would have to get our own library dependencies from the same repo. I think I’ve seen examples with ExternalProject where the URL is not given, as an example something like:

cmake_minimum_required(VERSION 3.5)

project(app2 LANGUAGES CXX)

include(ExternalProject)
ExternalProject_Add(lib2
    SOURCE_DIR ${ROOT_DIR_ENV}/demo/libs/lib2
)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

add_executable(app2 main.cpp)

target_link_libraries(app2 PRIVATE lib2)

where source code for lib2 is already available on the local file system. This attempt didn’t get far and I wanted to ask if something like this is even possible.

If I could use one of these two means of managing dependencies it would still potentially set things up in the future to do it the right way. Any suggestions are welcome.

craig.scott · September 20, 2022, 11:02pm

Can you clarify why you don’t want to simply add them directly to the build with add_subdirectory()? Do you need them to be separate builds to the main build?

FWIW, there’s no problem omitting a download method and using SOURCE_DIR in situations where you already have the content somewhere you’re happy to use it from (in your case, already in the repo). I’ve occasionally done that for some projects. I would typically only use ExternalProject rather than FetchContent in such cases if I needed to mix different toolchains, or if there were target name clashes that I needed to avoid.
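
For comparison, the add_subdirectory() route could look roughly like the sketch below. The directory layout and the target name lib2 are assumptions for illustration, not details from the original post. Because the library directory is not below the app's own source directory, add_subdirectory() needs an explicit binary directory as its second argument.

```cmake
cmake_minimum_required(VERSION 3.5)
project(app2 LANGUAGES CXX)

# Hypothetical layout: <repo>/demo/apps/app2 (this file) and <repo>/demo/libs/lib2.
# A source directory outside the current tree requires an explicit binary directory.
add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/../../libs/lib2
                 ${CMAKE_CURRENT_BINARY_DIR}/lib2)

add_executable(app2 main.cpp)
# Assumes lib2's CMakeLists.txt defines a library target named lib2.
target_link_libraries(app2 PRIVATE lib2)
```

With this approach lib2 is configured as part of the same build, so it shares the app's toolchain and target namespace.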

scivision · September 21, 2022, 2:23am

Yes, I use ExternalProject like this. You can also keep each external project in an archive file and point URL at it, like:

ExternalProject_Add(my1
    URL /path/to/my1.zip
)

ExternalProject_Add(my2
    URL /path/to/my2.zip
)

Rich_von_Lehe · September 21, 2022, 2:33am

Yes, good question. We were using add_subdirectory(), but as I alluded to, too many distinct projects (physically distinct devices) were being managed by this one project. The result was a large number of platform flags: flags to enable tests that worked only on certain platforms, and flags added by projects that did or didn’t want Qt-based libraries.

To give you an idea, the vast majority of the code was written for ARM Cortex-A class processors to run on Windows Embedded or QNX. Some of this was library code that should be shared but much of it was project-specific code that would never be shared.

Recently, though I recommended against it, projects based on ARM Cortex-M class processors began to be added to the same repo. They would create their own targets and then do a massive amount of:

if(${PLATFORM_DEFINITION} STREQUAL MY_STM32_PROJECT)

all over the place to keep the CMake project from trying to configure things it didn’t need or want. The resulting CMake files throughout the library code were getting cluttered with if/then/else statements and a bunch of flags.

The last STM32 project that got added took over a month just to add and the guy they asked to do it ended up quitting.

The only way to reduce the growing reliance on this unmanageable number of flags was to split things into the separate projects that they really are. My suggestion was separate repos and dependencies managed by FetchContent. I lost that fight and was told to do what I can with a single repo.

I don’t know if that describes the problem well enough. Hopefully it gives an idea. I don’t know if Discourse has private messaging. I could share one or two examples privately if you’re curious enough.

scivision · September 21, 2022, 2:33am

However, I see what’s missing in your example. You need to define in the top project where the ExternalProject libraries end up. You don’t have to use the install step, but I do, because then I don’t have to introspect the ExternalProject BINARY_DIR property.

In this example, all the path names are arbitrary.

include(GNUInstallDirs)
include(ExternalProject)

# build lib1 as a separate project and "install" it into this build directory
ExternalProject_Add(lib1
    SOURCE_DIR ${PROJECT_SOURCE_DIR}/libs/lib1
    CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${PROJECT_BINARY_DIR}
    CONFIGURE_HANDLED_BY_BUILD true  # requires CMake 3.20 or newer
)

# define ExternalProject targets so this project can use them
# (paths use PROJECT_BINARY_DIR to match the install prefix passed to lib1 above)
add_library(mylib1 INTERFACE)
target_link_libraries(mylib1 INTERFACE ${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_LIBDIR}/${CMAKE_STATIC_LIBRARY_PREFIX}mylib1${CMAKE_STATIC_LIBRARY_SUFFIX})
target_include_directories(mylib1 INTERFACE ${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_INCLUDEDIR})
# make sure lib1 is built and installed before anything that links mylib1
add_dependencies(mylib1 lib1)

add_executable(myapp main.c)
target_link_libraries(myapp PRIVATE mylib1)

I use this technique a lot with numerous users. I try to make it as programmatic as possible.
To avoid spilling files, I “install” to the build directory. You can make it a subdirectory or somewhere else if you want.

To handle BUILD_SHARED_LIBS or subprojects that have hard-coded shared, you can use if(BUILD_SHARED_LIBS) when defining INTERFACE libraries at the top level or also hard-code the suffix/prefix with static/shared similar to above.
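
A minimal sketch of the if(BUILD_SHARED_LIBS) idea, meant for the top-level CMakeLists.txt after project(). The variable names lib1_prefix, _lib_prefix and _lib_suffix are made up for illustration, and lib1_prefix is assumed to be wherever the subproject was installed. It also assumes the same BUILD_SHARED_LIBS value is passed to the ExternalProject (for example through CMAKE_ARGS) so the file actually built matches the name computed here. On Windows a shared build links against an import library rather than the .dll itself, so this sketch only covers platforms where the shared library file is linked directly.

```cmake
include(GNUInstallDirs)

# Assumption: the subproject was installed under this prefix.
set(lib1_prefix ${PROJECT_BINARY_DIR})

# Pick the file name pieces that match how the subproject was built.
if(BUILD_SHARED_LIBS)
    set(_lib_prefix ${CMAKE_SHARED_LIBRARY_PREFIX})
    set(_lib_suffix ${CMAKE_SHARED_LIBRARY_SUFFIX})
else()
    set(_lib_prefix ${CMAKE_STATIC_LIBRARY_PREFIX})
    set(_lib_suffix ${CMAKE_STATIC_LIBRARY_SUFFIX})
endif()

add_library(mylib1 INTERFACE)
target_link_libraries(mylib1 INTERFACE
    ${lib1_prefix}/${CMAKE_INSTALL_LIBDIR}/${_lib_prefix}mylib1${_lib_suffix})
```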

To get the subproject isolation you want, ExternalProject is probably the better choice than FetchContent, which mingles the top-level scope with the subproject’s.

scivision · September 21, 2022, 2:43am

Another technique if you want to build those subprojects less often, but still keep a monorepo, is to make a special subdirectory with a standalone CMakeLists.txt that is just a bunch of ExternalProject_Add(). Then in the top-level project you would use find_library, find_path as in a general CMake project, except all the find_* have NO_DEFAULT_PATH option and HINTS to the CMAKE_INSTALL_PREFIX.

I use this technique in projects where the external libraries might take several minutes to build but the top-level project consuming them takes only seconds, so the less frequently changing libraries don’t get rebuilt as often. However, the user has to be aware that if the external libraries (subprojects) change, they have to manually rebuild that special subdirectory’s CMakeLists.txt project.
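
On the consuming side, that could look something like the sketch below, placed in the top-level CMakeLists.txt after project(). The cache variable ext_prefix, the header name mylib1.h and the lib/lib64/include suffixes are assumptions for illustration. The point is that NO_DEFAULT_PATH restricts the search to the given HINTS, so a system-wide copy of the library is never picked up by accident.

```cmake
# Hypothetical prefix where the separate ExternalProject-only build installed things.
set(ext_prefix "${PROJECT_BINARY_DIR}/ext" CACHE PATH
    "Install prefix of the prebuilt subprojects")

# Search only under ext_prefix, never in system locations.
find_library(MYLIB1_LIBRARY
    NAMES mylib1
    HINTS ${ext_prefix}
    PATH_SUFFIXES lib lib64
    NO_DEFAULT_PATH
)
find_path(MYLIB1_INCLUDE_DIR
    NAMES mylib1.h            # hypothetical header name
    HINTS ${ext_prefix}
    PATH_SUFFIXES include
    NO_DEFAULT_PATH
)

if(NOT MYLIB1_LIBRARY OR NOT MYLIB1_INCLUDE_DIR)
    message(FATAL_ERROR
        "mylib1 not found under ${ext_prefix}; build the ExternalProject directory first")
endif()

# Wrap the found files in a target the rest of the project can link against.
add_library(mylib1 INTERFACE)
target_link_libraries(mylib1 INTERFACE ${MYLIB1_LIBRARY})
target_include_directories(mylib1 INTERFACE ${MYLIB1_INCLUDE_DIR})
```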

Rich_von_Lehe · September 21, 2022, 2:52am

Wow, thanks for the great suggestions. It sounds like either FetchContent or ExternalProject should work for me. I’ll be back working on this again tomorrow and I’ll see if I get farther. I have a slight preference for FetchContent since Craig suggested it should be possible. I’ll update if/when I get a small example to work.

scivision · September 21, 2022, 3:04am

Choosing FetchContent vs. ExternalProject:

maximum isolation between subprojects and top project: ExternalProject
source files available at configure time, use targets defined in subproject: FetchContent

FetchContent doesn’t require you to define interface libraries in the top-level project as ExternalProject does. With FetchContent you can just use the subprojects’ targets in the top-level project.
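
As a rough sketch of that difference, assuming a hypothetical layout and that the subproject defines a library target named lib1, the consuming project can link the subproject's target directly, with no hand-written INTERFACE library:

```cmake
cmake_minimum_required(VERSION 3.14)  # FetchContent_MakeAvailable needs 3.14+
project(app1 LANGUAGES CXX)

include(FetchContent)

# No download step: SOURCE_DIR points at code already in the repository.
# Hypothetical layout: <repo>/apps/app1 (this file) and <repo>/libs/lib1.
FetchContent_Declare(lib1
    SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/../../libs/lib1
)
# Adds lib1's CMakeLists.txt to this build, much as add_subdirectory() would.
FetchContent_MakeAvailable(lib1)

add_executable(app1 main.cpp)
# Assumes lib1's CMakeLists.txt defines a library target named lib1.
target_link_libraries(app1 PRIVATE lib1)
```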

Rich_von_Lehe · September 21, 2022, 2:29pm

This snippet worked for me. Now I feel like I didn’t try hard enough the first time around, but in my defense there aren’t really many examples for the ‘no-download’ approach.

include(FetchContent)
FetchContent_Declare(
    lib1
    SOURCE_DIR C:/Projects/monorepo-demo/libs/lib1
)

FetchContent_MakeAvailable(lib1)

FetchContent_GetProperties(lib1
    SOURCE_DIR lib1_dir
    POPULATED  lib1_populated
)

message("lib1_dir ${lib1_dir}")
message("lib1_populated ${lib1_populated}")

Ultimately now I can set up a bunch of distinct top-level CMakeLists.txt files and each one can FetchContent the libraries that it needs located in other areas of the same repository. This should decouple many things and allow us to greatly reduce the reliance on project-specific flags that existed in the libraries themselves.

The libraries can themselves FetchContent their own dependencies. I’ve already proven to myself that can be done with a personal project I’ve worked on.

I’ll try to use a variable to point to the head of the repo and not use absolute paths of course.
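
One possible way to do that is sketched below. The variable name REPO_ROOT and the directory layout are assumptions for illustration. The repo root is derived from the location of the current CMakeLists.txt, so nothing depends on where the repository is checked out.

```cmake
# Hypothetical layout: this file lives at <repo>/apps/app1/CMakeLists.txt,
# so the repository root is two directories up.
get_filename_component(REPO_ROOT "${CMAKE_CURRENT_LIST_DIR}/../.." ABSOLUTE)

include(FetchContent)
FetchContent_Declare(lib1
    SOURCE_DIR ${REPO_ROOT}/libs/lib1   # no absolute, machine-specific path
)
FetchContent_MakeAvailable(lib1)
```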

Thanks to both of you for your responses!