`enable_language` option for Rust/Cargo

I’m starting to come across more projects that are blending Rust with C/C++, and unfortunately CMake integration leaves a bit to be desired. The language is shaping up to be one of the most common languages to link - would this make it a good next target for support with enable_language?

The current builtin solutions are pretty cumbersome - the best I have been able to do are some tricky setups with add_custom_command or ExternalProject_Add. corrosion is excellent and very full featured, but it seems like a builtin solution for basic use cases would be nice.

I’m not sure what this builtin solution would look like. Ideally there would be both an easy way to invoke rustc for building individual .a or .so/.dll files (similar to what is done for C; the Linux kernel uses this style for its Rust builds) and a way to invoke cargo to build packages (which handles parallel builds and caching on its own) - but I am unsure what would be most suitable.

While I personally love Rust, it has a model that doesn’t map well to CMake. I’m not sure how to model Cargo-level dependencies for example since consumers get to pick features for the consumed project (due to the build-it-all-at-once model). I suspect custom commands with cargo and INTERFACE targets is best today.

Additionally, Rust doesn’t have a stable ABI, so .a and .so libraries aren’t all that useful outside of extern "C" crates (which I believe the Linux kernel is doing at some level, probably behind magic macros of some kind).
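
To make the extern "C" point concrete, here is a minimal sketch of a function exported with the C ABI from a crate built as staticlib or cdylib. The name rs_add is hypothetical and chosen only for illustration; nothing here is taken from the kernel's setup.

```rust
// Hypothetical exported function. A C caller would declare it as:
//     int32_t rs_add(int32_t a, int32_t b);
// On toolchains older than Rust 1.82 the attribute is spelled `#[no_mangle]`.
#[unsafe(no_mangle)]
pub extern "C" fn rs_add(a: i32, b: i32) -> i32 {
    // wrapping_add never panics, so no unwinding can cross into C code.
    a.wrapping_add(b)
}
```

Only items declared like this have a predictable symbol name and calling convention; everything else in the crate stays behind Rust's unstable ABI.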

(I totally missed this reply, sorry for the late response)

New Rust projects don’t usually use CMake since cargo is a build system, but cargo isn’t typically involved for existing C+CMake projects that add Rust. Instead, rustc will get invoked directly, which more or less does the same job as cc.

Rust doesn’t use the C ABI by default, of course, but C-compatible interfaces just need to be declared explicitly with extern "C" (similar to C++). Usually this is automatic via bindgen. So typically, the mixed project stack looks like the following:

  1. C library exists
  2. CMake configures C
  3. “Rust calls C” interfaces are autogenerated via bindgen
  4. If needed, “C calls Rust” headers are generated via cbindgen
  5. Intermediate C libraries are compiled to .a/.so archives
  6. Intermediate Rust libraries are compiled to the same (--crate-type=staticlib to make a .a, --crate-type=cdylib to make a .so)
  7. Everything gets linked for the top-level binary
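
As a rough illustration of step 3, this is the kind of declaration bindgen would emit for a C function, written by hand here so the sketch stays self-contained. It assumes the C library's strlen is available at link time (it normally is, since std links the platform C library); the wrapper name c_strlen is made up for the example.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// Hand-written stand-in for a bindgen-generated declaration from <string.h>.
// On toolchains older than Rust 1.82, write plain `extern "C" { ... }`.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Safe wrapper (hypothetical name): `CStr` guarantees a valid,
/// NUL-terminated pointer, which is all that strlen requires.
pub fn c_strlen(s: &CStr) -> usize {
    unsafe { strlen(s.as_ptr()) }
}
```

Step 4 is the mirror image: cbindgen reads extern "C" functions defined in Rust and writes the matching C header.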

Out of these steps, the only one not supported by cmake is #6, which I was thinking enable_language may be able to do fairly easily. Would that be a possible add? I think this is about similar to how Fortran support works, correct?

In my original post I mentioned CMake calling Cargo since Cargo can also produce the linkable archives, in addition to handling dependencies (which rustc alone doesn’t do). But I think this is likely out of scope.

C and CXX as well, but they are the default languages if nothing else is specified. Note that Fortran is much closer to C and C++ because it has the same one source → one object mapping. Rust is different in that the compiler only gets invoked on the “entry” path (one of lib.rs or main.rs) and other source files are discovered from there. That probably needs some different way to list the sources so that CMake knows which one is that entry point. There are also the .rlib artifacts that need to be transferred between Rust targets. I think it’s possible to do it this way, but it will be very alien to those expecting cargo-like workflows (and I still don’t know how to model using crates.io dependencies in CMake beyond “tell cargo to do it” at which point…what are we gaining by having CMake build the Rust code directly?).

Rust is different in that the compiler only gets invoked on the “entry” path (one of lib.rs or main.rs) and other source files are discovered from there.

I don’t think this is necessarily true. Rust has 2 ways to organize code:

  1. Modules: You don’t need to compile individual modules. rustc takes care of compiling all the modules as long as they map properly on the filesystem
  2. Crates: You actually need to compile crates individually and link them (by using --extern or -L flag in rustc)

A single file can also be considered as a crate. Here is a small example:

  • hello.rs:
extern crate useme;

fn main() {
    println!("Hello World");
    useme::useme();
}
  • useme.rs:
pub fn useme() {
    println!("Use Me");
}

The above program can be compiled as follows:

rustc useme.rs --crate-type lib
rustc hello.rs -L ./

I don’t really think CMake support for Rust needs to do everything that cargo does. Rather, it should be limited to what you can do for C in standard CMake.

Personally, I am interested in using Rust with CMake in projects such as Zephyr, where you will probably maintain a local copy of any external dependencies you want rather than fetching them from crates.io.

Yes, this is what I meant when I said that only the “entry” path matters.

Right, the TU for Rust is the crate, not per-source.

The issue I see is “which file is what we should pass to the compiler?” in this case:

add_library(crate
  foo.rs
  bar.rs
  sys.c # low-level bindings or whatever
)

The issue I see is “which file is what we should pass to the compiler?” in this case:

I think you would only specify a single file in this case that is the entrypoint, and let rustc discover all the related files. Roughly:

# RLIB would be the default for Rust
# this gets invoked as `rustc --crate-type=lib --crate-name=foo foosrc/lib.rs`
add_library(foo RLIB foosrc/lib.rs)

# `rustc --crate-type=staticlib --crate-name=bar barsrc/entry.rs`
add_library(bar STATIC barsrc/entry.rs)

# `rustc --crate-type=cdylib --crate-name=baz bazsrc/entry.rs`
add_library(baz SHARED bazsrc/entry.rs)

# built as normal
add_library(syslib STATIC sys.c)

add_executable(main bin/main.rs)
# link the libraries into main (add_dependencies would only order the builds)
target_link_libraries(main PRIVATE foo syslib bar)

rustc would recurse the files per its rules; e.g. if lib.rs calls out mod util;, rustc will automatically find foosrc/util.rs. It’s a bit different than C since all modules within a crate get essentially concatenated together as one unit of compilation, as opposed to every file being standalone. I am not sure how cargo detects what files are included to determine when to recompile.
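
A small sketch of that "one crate, one unit of compilation" idea, assuming a hypothetical util module. It is written inline here so the example is a single file; in the layout above the crate root would just say mod util; and rustc would load foosrc/util.rs itself.

```rust
// Inline stand-in for a hypothetical foosrc/util.rs. With a file on disk,
// the crate root would contain only `mod util;`.
mod util {
    pub fn double(x: u32) -> u32 {
        x * 2
    }
}

// The crate root refers to the module by path. Only the root file is ever
// named on the rustc command line; the module is compiled as part of it.
pub fn double_plus_one(x: u32) -> u32 {
    util::double(x) + 1
}
```

Either way the build system sees one input (the crate root) and one output, which is why listing util.rs in add_library would not change what gets passed to rustc.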

I think the final rustc invocation would be about:

rustc --crate-type=bin \
    -L outdir \
    -l static=syslib \
    -l static=bar \
    --extern foo=libfoo.rlib \
    bin/main.rs \
    -o main

This does let rustc handle the linking rather than CMake invoking it, which is probably fine. Any linker args just get passed as -C link-args='...'. It would probably need new variables: CMAKE_RUST_FLAGS to set additional arguments and RUST_EDITION (like CXX_STANDARD) to emit --edition=2021 or similar. CMAKE_RUST_COMPILER_LAUNCHER would allow using sccache.
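
To show how such settings could turn into a command line, here is a hedged sketch of the mapping. LibKind and rustc_args are hypothetical names, RLIB is the proposed (not existing) library type, and the edition and extra_flags parameters stand in for what RUST_EDITION and CMAKE_RUST_FLAGS might carry. Only the rustc flags themselves (--crate-type, --crate-name, --edition) are real.

```rust
/// Library kinds from the add_library sketch (RLIB is the hypothetical one).
#[derive(Clone, Copy)]
pub enum LibKind {
    Rlib,
    Static,
    Shared,
}

impl LibKind {
    /// The rustc --crate-type value each kind would map to.
    fn crate_type(self) -> &'static str {
        match self {
            LibKind::Rlib => "lib",
            LibKind::Static => "staticlib",
            LibKind::Shared => "cdylib",
        }
    }
}

/// Build the argument list for one crate: type, name, optional edition,
/// any extra flags, and finally the single entry-point source file.
pub fn rustc_args(
    kind: LibKind,
    name: &str,
    entry: &str,
    edition: Option<&str>,
    extra_flags: &[&str],
) -> Vec<String> {
    let mut args = vec![
        format!("--crate-type={}", kind.crate_type()),
        format!("--crate-name={}", name),
    ];
    if let Some(e) = edition {
        args.push(format!("--edition={}", e));
    }
    args.extend(extra_flags.iter().map(|f| f.to_string()));
    args.push(entry.to_string());
    args
}
```

For example, rustc_args(LibKind::Static, "bar", "barsrc/entry.rs", Some("2021"), &["-O"]) yields the arguments for the bar target above with an edition and one extra flag added.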

The final binary could be C as well, it would just only be able to link STATIC and not RLIB.

(CLI docs for reference: Command-line Arguments - The rustc book)

As long as it generates a suitable -MF-like output, that’s fine.

Alas, this makes IDE users sad as files not listed in CMake don’t (naturally) show up in the project sidebar.

I guess it does - not exactly the same as it omits std, but I think it has the needed information

$ tree rs-test/
rs-test/
├── bar.rs
├── entry.rs
└── foo
    └── mod.rs

1 directory, 3 files

$ cat rs-test/entry.rs
mod bar;
mod foo;

fn export() {
    foo::foo();
    bar::bar();
}

$ rustc rs-test/entry.rs --emit=dep-info -o entry.d

$ cat entry.d
entry.d: rs-test/entry.rs rs-test/bar.rs rs-test/foo/mod.rs

rs-test/entry.rs:
rs-test/bar.rs:
rs-test/foo/mod.rs:
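
For illustration, a build tool could pull the file list out of that output with something like the following sketch. parse_dep_rule is a hypothetical helper, and it assumes paths contain no spaces (dep-info escapes those, which this does not handle).

```rust
/// Parse the first Makefile-style rule ("target: dep dep ...") from
/// dep-info text. Returns None if no rule line is found.
pub fn parse_dep_rule(text: &str) -> Option<(String, Vec<String>)> {
    // The first non-empty line carries the target and all of its inputs;
    // the later "path:" lines are empty rules and are ignored here.
    let line = text.lines().find(|l| !l.trim().is_empty())?;
    let (target, deps) = line.split_once(": ")?;
    let deps = deps.split_whitespace().map(String::from).collect();
    Some((target.to_string(), deps))
}
```

Fed the entry.d contents above, this returns entry.d as the target and the three .rs files as its inputs, which is the same information a -MF depfile gives for C.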

That’s an interesting point, I am not too familiar with this option. Does this usually go via cmake-file-api, and does that only report the files enumerated in add_x? If it is via the API, could the result potentially include files from the dep-info produced above? The command seems fairly robust and will parse all syntactically correct files to build the module tree, though it does stop recursing if it meets an incorrect file.

Requiring the user to specify all files is also possible of course, but seems a bit redundant since that information is well-defined. Users that need Rust support probably also have IDEs that can get the file list from rust-analyzer (used by vscode, vim, qtcreator, etc) or the intellij plugin (used by clion and jetbrains).

The file API is configure time while the dep info is build time. It’s not available early enough.

I feel that I would want even cfg-gated sources to be listed in the IDE.

I think letting rustc handle the linking is only required for rlib and proc-macro. All the other crate types and C files can be linked after rustc step just like in C.

Most IDE support still uses rust-analyzer, which can be configured for non-cargo projects. So maybe we can generate it?

I was thinking that rustc could handle the linking only if the top-level binary is Rust, but maybe it’s possible to do it the normal way. You can --emit=obj to get a .o, but figuring out how to link libstd.rlib is trickier. I’m curious how the new buck does it

Well, for linking with rlib, I think we will need to use rustc. There is a Pre-RFC: Stabilize a version of the rlib format, but I haven’t tried it out yet.

It is also possible to use standard linkers with rlib with some hacks but well, as I said, they are kind of hacks and can break.

Yeah, I just meant that most final Rust binaries need to link std which is a rlib, so your final link more or less has to go through rustc if you’re using Rust in the top-level binary. Everything else is more or less unstable.

But that will eventually change, it’s just not there yet. Maybe add_executable should be forbidden for Rust (at least for now) unless something like RUST_USE_RUSTC_AS_EXECUTABLE_LINKER is set. Then, whenever rust adds better support for native linking, support for add_executable without the flag can be added.

Shared libraries will also need to bring it in (e.g., a Python module). As for RUST_USE_RUSTC_AS_EXECUTABLE_LINKER, that is the LINKER_LANGUAGE property.

I have been experimenting with different possibilities of adding Rust support. I can build very simple projects with cmake. However, the same project fails when I try to build it using ctest. The error I am getting is as follows:

Start testing: Oct 11 23:50 IST
----------------------------------------------------------
265/627 Testing: SimpleRustOnly
265/627 Test: SimpleRustOnly
Command: "/var/home/ayush/Documents/Programming/Cmake/rust/build/bin/ctest" "--build-and-test" "/var/home/ayush/Documents/Programming/Cmake/rust/Tests/SimpleRustOnly" "/var/home/ayush/Documents/Programming/Cmake/rust/build/Tests/SimpleRustOnly_rustc" "--build-generator" "Unix Makefiles" "--build-makeprogram" "/usr/bin/gmake"
Directory: /var/home/ayush/Documents/Programming/Cmake/rust/build/Tests
"SimpleRustOnly" start time: Oct 11 23:50 IST
Output:
----------------------------------------------------------
Internal cmake changing into directory: /var/home/ayush/Documents/Programming/Cmake/rust/build/Tests/SimpleRustOnly_rustc
======== CMake output     ======
Configuring done (0.0s)
Generating done (0.0s)
Build files have been written to: /var/home/ayush/Documents/Programming/Cmake/rust/build/Tests/SimpleRustOnly_rustc
======== End CMake output ======
Change Dir: '/var/home/ayush/Documents/Programming/Cmake/rust/build/Tests/SimpleRustOnly_rustc'

Run Clean Command: /usr/bin/gmake -f Makefile clean

Run Build Command(s): /usr/bin/gmake -f Makefile
[ 50%] Building Rust object CMakeFiles/SimpleRust.dir/main.rs
[100%] Linking Rust executable SimpleRust
error: toolchain 'stable-x86_64-unknown-linux-gnu' is not installed
gmake[2]: *** [CMakeFiles/SimpleRust.dir/build.make:88: SimpleRust] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/SimpleRust.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2

<end of output>
Test time =   0.11 sec
----------------------------------------------------------
Test Failed.
"SimpleRustOnly" end time: Oct 11 23:50 IST
"SimpleRustOnly" time elapsed: 00:00:00
----------------------------------------------------------

End testing: Oct 11 23:50 IST

Maybe ctest uses some sort of isolation which causes the problem? I can go to the test directory and run make just fine. Maybe there is something wrong with my test config?

It seems like rustup’s environment is missing. CTest doesn’t isolate anything environment-wise at least. Is the testing run in the same environment as the build?

Well, I tried specifying target and sysroot manually as well, but it doesn’t seem to work. The environment is the same, and the command I am using is ctest -R SimpleRustOnly.
Could it be because of some CMake variable not being set? The Modules/CMakeRustCompiler.cmake.in is still incomplete (although it can build).

There isn’t a chance that you have the nightly toolchain but not stable or something like that, is there? That name shows up in rustup toolchain list / in ~/.rustup/toolchains/?

Maybe make sure it is looking in the right place, e.g. if the test is run under a different user I could see it getting confused.
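
If it helps with debugging, here is a tiny sketch of that check, assuming rustup's usual layout of one directory per toolchain under <rustup home>/toolchains/. toolchain_installed is a hypothetical helper; the caller decides which rustup home to look in (for example the value of RUSTUP_HOME, or ~/.rustup for the user actually running the test).

```rust
use std::path::Path;

/// Returns true if `<rustup_home>/toolchains/<toolchain>` exists as a
/// directory. Assumes rustup's standard on-disk layout.
pub fn toolchain_installed(rustup_home: &Path, toolchain: &str) -> bool {
    rustup_home.join("toolchains").join(toolchain).is_dir()
}
```

Running this against the home directory that the failing process sees, with the name from the error message (stable-x86_64-unknown-linux-gnu), would show whether it is looking somewhere other than where the toolchain lives.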