CMake, Ninja, Fortran, and inconsistent `Cannot open module file` errors

Hello! I’m running into an issue where, for some system configurations, I encounter a `Cannot open module file` error (for a Fortran module file) during the build. From what I can tell, this occurs because the way I’ve set up my CMake code, together with the implicit building of the Fortran module files, results in Ninja attempting to use the module files before they’ve been built. However, I’m not particularly knowledgeable about any of these components, so I may be wrong. This project is refurbishing a legacy Fortran codebase.

Notably, encountering the error seems somewhat inconsistent, and I’m having difficulties reducing the problem to a smaller minimal working example. As such, I apologize for this large MWE, but I’ve been unable to shrink it further while still representing the issue. I also cannot reproduce the error on my local machine. I’m using cibuildwheel, a tool to run the build on various CI runners (e.g., GitHub Actions) to produce precompiled binaries for various systems. I only encounter the issue on some CI runner configurations. I’ve found a few tangentially related issues where people have encountered difficulties with Fortran module files and the Ninja build system. Also, if I switch CMake from using Ninja to using Unix Makefiles, I don’t seem to encounter the issue.

For my actual setup, I’m using CMake to build both a shared library and an executable. For components used by both, I create OBJECT libraries. The problem appears with the mod files of one of these OBJECT libraries when it is used as a component in building the executable. Concretely, in the CMake code below, eesunhong_recipes_replacements is the OBJECT library whose mod file is not found when building the eesunhong_main executable.

I’m guessing that what I need to do is enforce that eesunhong_recipes_replacements produces its mod files before anything in eesunhong_main is built. However, I’ve tried various solutions, such as add_dependencies(eesunhong_main eesunhong_recipes_replacements), without success. Of course, the issue might be something else entirely, for example the build location of the Fortran module files. I’ve attempted various uses of Fortran_MODULE_DIRECTORY to specify the module file location and include that location in later build steps, but haven’t had any success there either. Does anyone have suggestions as to how I can resolve this? Does it seem I’m on the right track, that it’s happening because of the build order from Ninja and the implicit building of the Fortran module files? If so, is there a way to enforce the order for those module files? Any advice would be greatly appreciated. Thank you so much for your time.
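For reference, a minimal sketch of the kind of Fortran_MODULE_DIRECTORY setup described above. The `fortran_modules` directory name and the `module_directory` variable are hypothetical, and this is an illustration rather than the exact code that was tried. It would go after the targets in the CMake code below have been defined.

```cmake
# Hypothetical location for the generated .mod files (name chosen for illustration).
set(module_directory "${CMAKE_CURRENT_BINARY_DIR}/fortran_modules")

# Ask CMake to have the compiler write this target's module files there.
set_target_properties(eesunhong_recipes_replacements PROPERTIES
        Fortran_MODULE_DIRECTORY "${module_directory}")

# Let the executable's sources search that directory for .mod files.
target_include_directories(eesunhong_main PRIVATE "${module_directory}")
```

Note that this only controls where the module files are written and searched for. On its own it does not tell the build system to produce them before the executable’s sources are compiled.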

CMake

cmake_minimum_required(VERSION 3.17)

project(src/eesunhong LANGUAGES Fortran)

set(CMAKE_POSITION_INDEPENDENT_CODE ON)

# Build the Fortran standard library.
set(BUILD_TESTING OFF)
include(FetchContent)
FetchContent_Declare(
        fortran_stdlib
        GIT_REPOSITORY https://github.com/fortran-lang/stdlib.git
        GIT_TAG df1e2f0ed0cbe2fbd9c1f20dcb5bd1a4bba95bb2  # v3.0.0
)
FetchContent_MakeAvailable(fortran_stdlib)
set(BUILD_TESTING ON)

# Create the object libraries to be used in both static and shared libraries.
add_library(polyroots OBJECT third_party/polyroots-fortran/polyroots_cmplx_roots_gen.f90)
add_library(roots OBJECT third_party/roots-fortran/root_module.F90)
add_library(eesunhong_recipes_replacements OBJECT src/eesunhong_recipes_replacements.f90)
target_link_libraries(eesunhong_recipes_replacements PUBLIC fortran_stdlib)

# Create the shared library.
add_library(eesunhong_fortran_library SHARED $<TARGET_OBJECTS:eesunhong_recipes_replacements> $<TARGET_OBJECTS:roots>)
target_link_libraries(eesunhong_fortran_library PUBLIC fortran_stdlib)

# Create the static version of the object libraries.
add_library(eesunhong_complete_static STATIC $<TARGET_OBJECTS:eesunhong_recipes_replacements> $<TARGET_OBJECTS:polyroots> $<TARGET_OBJECTS:roots>)
target_link_libraries(eesunhong_complete_static PUBLIC fortran_stdlib)

# Create the main executable.
add_executable(eesunhong_main src/main.f third_party/minuit/minuit_94a_dblb.f src/fcnrvg4_Ctpar.f src/bilens.f src/critical.f
        src/microcurve_rvg4Ctpar.f src/hexadec_only.f src/geo_par.f src/eesunhong_real_complex_conversion.f90)
target_link_libraries(eesunhong_main PUBLIC eesunhong_complete_static)

# Install
if(SKBUILD)
    set(library_directory "${SKBUILD_PLATLIB_DIR}")
    set(binary_directory "${SKBUILD_PLATLIB_DIR}")
else()  # Calling CMake directly instead of through scikit-build means this is a developer build.
    set(library_directory ".")
    set(binary_directory ".")
    set(CMAKE_INSTALL_PREFIX .)
endif()

install(TARGETS eesunhong_fortran_library DESTINATION ${library_directory})
install(TARGETS eesunhong_main DESTINATION ${binary_directory})

Error

The full output of the build can be seen at this CI run (after clicking the dropdown for “Building wheel…” inside the output window): Example state for discourse question · golmschenk/eesunhong@7120cda · GitHub

The snippet of the main error is:

[179/241] Building Fortran object CMakeFiles/eesunhong_main.dir/src/microcurve_rvg4Ctpar.f.o
    FAILED: CMakeFiles/eesunhong_main.dir/src/microcurve_rvg4Ctpar.f.o
    /opt/rh/devtoolset-10/root/usr/bin/gfortran -I/project/src -I/project/build/_deps/fortran_stdlib-build/src/mod_files -O3 -DNDEBUG -O3 -fPIE -fpreprocessed -c CMakeFiles/eesunhong_main.dir/src/microcurve_rvg4Ctpar.f-pp.f -o CMakeFiles/eesunhong_main.dir/src/microcurve_rvg4Ctpar.f.o

...

    /project/src/microcurve_rvg4Ctpar.f:2537:11:
  
     2537 |        use eesunhong_recipes_replacements,
          |           1
    Fatal Error: Cannot open module file ‘eesunhong_recipes_replacements.mod’ for reading at (1): No such file or directory
    compilation terminated.

I would check which version of CMake is being used.

Ninja generation was broken for Fortran in CMake 3.27.0 through 3.27.4.

So if you’re using CMake 3.26 or older, or CMake 3.27.5+, this could be a different CMake bug or a project issue.

Thank you much. It appears the build is using CMake 3.27.6 (and Ninja 1.11.1.1), so I suppose the bug you’re referring to is not the same (unless the “about 3.27.5” is fuzzy, and 3.27.6 could still be in that range).

However, if I add the requirement of either CMake 3.27.6 or 3.27.7, I’m now able to reproduce the error on my local machine. So that helps. It seems my build works fine (at least on my local machine) on 3.26, but not 3.27 (including 3.27.6 and 3.27.7).

Hmmm. The error still only occurs through cibuildwheel, and not through a manual call to CMake, even with both using CMake 3.27.7. That said, I should stop giving every update here and wait until I’ve explored the configuration a bit further. Thanks again.

My mistake: Ninja Fortran generation was broken from 3.27.0 through 3.27.4 and fixed in 3.27.5. See https://gitlab.kitware.com/cmake/cmake/-/merge_requests/8772

As a temporary fix, I’ve forced the build order by adding:

add_custom_target(dummy_to_force_build_order)
add_dependencies(dummy_to_force_build_order eesunhong_recipes_replacements)

and

add_dependencies(eesunhong_complete_static dummy_to_force_build_order)

This forces the OBJECT library to be built before the STATIC library.
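Put together, the workaround might sit in the CMake code from the original post roughly as follows. This is a sketch of placement only. The `add_library` lines are copied from the original post, and the comments are added for illustration.

```cmake
add_library(eesunhong_recipes_replacements OBJECT src/eesunhong_recipes_replacements.f90)
target_link_libraries(eesunhong_recipes_replacements PUBLIC fortran_stdlib)

# Dummy target that depends on the OBJECT library, so building the dummy
# first requires the OBJECT library (and its .mod files) to be built.
add_custom_target(dummy_to_force_build_order)
add_dependencies(dummy_to_force_build_order eesunhong_recipes_replacements)

add_library(eesunhong_complete_static STATIC $<TARGET_OBJECTS:eesunhong_recipes_replacements> $<TARGET_OBJECTS:polyroots> $<TARGET_OBJECTS:roots>)
target_link_libraries(eesunhong_complete_static PUBLIC fortran_stdlib)

# The first argument of add_dependencies must name a target that already
# exists, so this call comes after the STATIC library is defined.
add_dependencies(eesunhong_complete_static dummy_to_force_build_order)
```

Because eesunhong_main links against eesunhong_complete_static, the executable is in turn ordered after the STATIC library.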

I’ll explore the underlying cause in the near future. For the moment, it appears that it is not cibuildwheel itself that causes the problematic CMake configuration, but scikit-build-core, a package that connects the Python build system to CMake. Calling the scikit-build-core build directly through pipx run build encounters the same issue. Either the environment scikit-build-core sets up or the configuration it uses for CMake is failing in this case. I don’t know yet whether this is a scikit-build-core issue or a CMake issue, but I’ll report back once I find out. Thanks again!

This bug and associated fix might help. I didn’t try it.

https://gitlab.kitware.com/cmake/cmake/-/issues/25365

Thank you! Looks promising. I will try it out when I get a chance.

Just wanted to follow up: the issue linked above appears to have had its fixes applied in version 3.27.9, and I no longer see the problem I described above on that version. Thank you very much to you and the CMake developers!
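For anyone who cannot pin the CMake version, a configure-time warning along these lines might help flag the affected releases. This is a sketch that assumes the affected range is 3.27.0 through 3.27.8, which is inferred only from the versions mentioned in this thread. The warning text is made up for illustration.

```cmake
# Warn when a Ninja generator is combined with a CMake 3.27.x release older
# than 3.27.9 (range assumed from this thread, not from official release notes).
if(CMAKE_GENERATOR MATCHES "Ninja"
        AND CMAKE_VERSION VERSION_GREATER_EQUAL "3.27.0"
        AND CMAKE_VERSION VERSION_LESS "3.27.9")
    message(WARNING
            "CMake ${CMAKE_VERSION} with ${CMAKE_GENERATOR} may fail to order "
            "Fortran module builds correctly; consider CMake 3.26 or 3.27.9+.")
endif()
```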

Basically, they reverted to the prior strategy for 3.27.9. I believe the 3.28 release series will include a fixed version of the new approach.
