Need CMake M Plugin help: Single target, Single source, multiple objects, multiple libraries

shabiel · March 12, 2021, 4:40pm

Hello everybody,

Some of you may remember me. I work with YottaDB now on the M compiler.

We have a longstanding issue with our CMake toolchain located here: YottaDB / Tools / YDBCMake · GitLab. Let’s start with a quick introduction:

Our compiler needs to produce 1 or 2 objects. With 1 object, it is an ASCII mode object (for legacy applications); with 2 objects, one is ASCII and one is UTF-8. When we have two objects, the install destination for each object is different (the UTF-8 one goes into a utf8 folder). Whether 1 or 2 objects are produced depends on whether libicu is installed and whether YottaDB has a UTF-8 installation (it’s optional).

Currently our toolchain is designed so that you cannot compile both objects at the same time. Rather, you have to run cmake again with -DM_UTF8_MODE in order for it to produce a different object. That’s the theory anyways… not sure that it even works right now.

I don’t know much about the internals of CMake, and googling around for a tutorial on adding a new language is not yielding much. So I could sure use some help.

To help you: I have determined that ydbcmake/CMakeMInformation.cmake · master · YottaDB / Tools / YDBCMake · GitLab is the main file that sets the compiler. Take a look. I think what I want to do is:

Change CMAKE_M_COMPILE_OBJECT to produce two objects (problem right now: they overwrite each other!)

set(CMAKE_M_COMPILE_OBJECT
    "ydb_chset=\"M\" <CMAKE_M_COMPILER> -object=<OBJECT> ${COMPILE_FLAGS} <SOURCE>"
    "LC_ALL= ydb_chset=\"utf-8\" ydb_icu_version=\"${icu_version}\" <CMAKE_M_COMPILER> -object=utf8<OBJECT> <SOURCE>")

Change CMAKE_M_CREATE_SHARED_LIBRARY and CMAKE_M_CREATE_SHARED_MODULE to create two separate modules.
Somehow the install needs to happen twice, once into the target location, and once into the target/utf8 location.

TIA,

–Sam

kyle.edwards · March 29, 2021, 1:27pm

Does your compiler produce two objects in a single invocation, or are they coming from separate invocations? Are they from the same source file, or different source files? Is -DM_UTF8_MODE a CMake flag, or a compiler flag?

shabiel · March 29, 2021, 1:44pm

Hello Kyle.

Answers to your questions:

Does your compiler produce two objects in a single invocation, or are they coming from separate invocations?

That’s the core of our problem. We invoke it twice, not once. The invocations are with different environment variables. We want to invoke cmake/make once though, and that’s the core of my question.

Are they from the same source file, or different source files?

Same source file.

Is -DM_UTF8_MODE a CMake flag, or a compiler flag?

CMake flag. The compiler uses the $LC_ALL and $ydb_chset environment variables to figure out which object to output.

–Sam

kyle.edwards · March 29, 2021, 1:48pm

If the compiler is being invoked twice, then I would suggest creating two object libraries that both compile the same source file but with different flags. Would that work for your use case?

shabiel · March 29, 2021, 1:57pm

I tried that, but that doesn’t work. This line gets run again, and overrides the original flags, so in the end you either compile two M objects or two UTF-8 objects.

set(CMAKE_M_COMPILE_OBJECT "LC_ALL=\"${LC_ALL}\" ydb_chset=\"${ydb_chset}\" ydb_icu_version=\"${icu_version}\" <CMAKE_M_COMPILER> -object=<OBJECT>")

kyle.edwards · March 29, 2021, 2:04pm

Is there a way to make your compiler use ASCII or UTF-8 depending on a command line flag instead of an environment variable?

If not, I would suggest creating a “superbuild” CMake project that calls a smaller subproject twice, once with the ASCII flags and once with the UTF-8 flags.

shabiel · March 29, 2021, 2:26pm

Is there a way to make your compiler use ASCII or UTF-8 depending on a command line flag instead of an environment variable?

No. Long history and existing customers and upstream code bases… we can’t change that.

If not, I would suggest creating a “superbuild” CMake project that calls a smaller subproject twice, once with the ASCII flags and once with the UTF-8 flags.

That sounds like it may work. Can you reference an example where this is done?

kyle.edwards · March 29, 2021, 2:39pm

Unfortunately no, but I believe FetchContent is what you’re looking for. @craig.scott can help you get started with this.

craig.scott · April 1, 2021, 4:09am

I suspect ExternalProject is the better fit here. You want to control the environment seen by the compiler, but FetchContent can’t do that for you. If you use ExternalProject, you can specify the build command in a way that sets or modifies the environment at build time. Sketching out a skeleton of the essential bits:

set(src /path/to/subproject)   # See below

include(ExternalProject)
ExternalProject_Add(variant_M
    SOURCE_DIR ${src}
    BUILD_COMMAND ${CMAKE_COMMAND} -E env ydb_chset=M   # See below
                    ${CMAKE_COMMAND} --build <BINARY_DIR>
)
ExternalProject_Add(variant_utf8
    SOURCE_DIR ${src}
    BUILD_COMMAND ${CMAKE_COMMAND} -E env ydb_chset=utf-8   # See below
                    ${CMAKE_COMMAND} --build <BINARY_DIR>
)

The directory pointed to by /path/to/subproject must be buildable as a standalone CMake project. It could be a subdirectory within your source tree that holds the “meat” of the main project, with the main project effectively just being a wrapper around these two ExternalProject_Add() calls.

The BUILD_COMMAND lines can define whatever environment variables you need. I’ve just shown it setting ydb_chset as an example, but add whatever key=value items you require.

If you need to pass compiler definitions as well, you could use something like the CMAKE_ARGS keyword in the calls to ExternalProject_Add() to achieve that. Read up in the ExternalProject module documentation to see how to do that.
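As a rough sketch of the CMAKE_ARGS idea, assuming the subproject reads the M_UTF8_MODE cache variable mentioned earlier in this thread and installs relative to CMAKE_INSTALL_PREFIX (the subproject path below is hypothetical):

```cmake
include(ExternalProject)

# Hypothetical location of the standalone subproject.
set(src ${CMAKE_CURRENT_SOURCE_DIR}/subproject)

ExternalProject_Add(variant_utf8
    SOURCE_DIR ${src}
    # Forwarded to the subproject's configure step as -D definitions.
    # Whether the subproject honors M_UTF8_MODE is an assumption.
    CMAKE_ARGS
        -DM_UTF8_MODE=ON
        -DCMAKE_INSTALL_PREFIX=<INSTALL_DIR>
    # "cmake -E env" sets the variable for the build tool and,
    # through it, for every compiler process it launches.
    BUILD_COMMAND ${CMAKE_COMMAND} -E env ydb_chset=utf-8
                  ${CMAKE_COMMAND} --build <BINARY_DIR>
)
```

The ASCII variant would be a second ExternalProject_Add() call with a different target name, no M_UTF8_MODE definition, and ydb_chset=M in the environment.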

I haven’t addressed the question of how to combine the results of the two ExternalProject_Add() calls. That is in part because I feel like this is all heading down the wrong path for what you ultimately want to achieve. It feels overly complex, but I can’t offer an alternative solution.

Perhaps you might be able to use a wrapper script to hide the details of this from CMake? Your wrapper script could take care of adjusting the environment settings and adding some extra compiler flags to the compile line before passing it along to the real compiler. Take a look at the <LANG>_COMPILER_LAUNCHER target property and its associated CMAKE_<LANG>_COMPILER_LAUNCHER variable for doing that. I’ll have to leave you to experiment with whether you can make that work.

shabiel · April 1, 2021, 2:52pm

Thank you Craig. Will take a while for us to digest this.

–Sam

shabiel · April 9, 2021, 7:22pm

Kyle,

Because of the complexity of the other solutions, we are entertaining passing flags to the compiler (something to be developed in the future) instead of using environment variables.

What’s the best way for us to create the two objects from a single cmake pass? I am thinking of setting CMAKE_M_COMPILE_OBJECT to run two commands rather than one.

–Sam

shabiel · August 25, 2022, 2:46pm

Hello everybody,

More than a year later, I am happy to report that I have a resolution.

@craig.scott I tried the ExternalProject paradigm, and it works well, but it was too complex for my taste. PS: I have your CMake book; you are a very good writer!

I contacted Brad, and a couple of email exchanges later, he guided me to the solution:

Use target_compile_options() to pass <FLAGS> to CMAKE_M_COMPILE_OBJECT.
Use a macro or function for outside users to use this functionality which will create both objects and both install rules at the same time.

The new CMAKE_M_COMPILE_OBJECT looks like this:

set(CMAKE_M_COMPILE_OBJECT "LC_ALL=C.utf-8 <FLAGS> <CMAKE_M_COMPILER> -object=<OBJECT>")

Here’s the function:

function(add_ydb_library library_name)
        # Only one keyword is accepted: SOURCES, followed by one or more files.
        set(flags)
        set(args)
        set(listArgs SOURCES)
        cmake_parse_arguments(arg "${flags}" "${args}" "${listArgs}" ${ARGN})

        if (NOT arg_SOURCES)
                message(FATAL_ERROR "[add_ydb_library]: SOURCES is a required argument")
        endif()
        if (SOURCES IN_LIST arg_KEYWORDS_MISSING_VALUES)
                message(FATAL_ERROR "[add_ydb_library]: SOURCES requires at least one value")
        endif()
        # M mode library: the compile options land in <FLAGS>, which
        # CMAKE_M_COMPILE_OBJECT places in front of the compiler as
        # environment assignments.
        add_library(${library_name}M SHARED ${arg_SOURCES})
        target_compile_options(${library_name}M PRIVATE ydb_chset=M ydb_icu_version=)
        set_target_properties(${library_name}M PROPERTIES PREFIX "")
        set_target_properties(${library_name}M PROPERTIES LIBRARY_OUTPUT_NAME ${library_name})
        set_target_properties(${library_name}M PROPERTIES LIBRARY_OUTPUT_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})
        # UTF-8 mode library: same sources and output name, built only when
        # an ICU version is known, and written to a utf8 subdirectory.
        if(ydb_icu_version)
                add_library(${library_name}utf8 SHARED ${arg_SOURCES})
                target_compile_options(${library_name}utf8 PRIVATE ydb_chset=utf-8 ydb_icu_version=${ydb_icu_version})
                set_target_properties(${library_name}utf8 PROPERTIES PREFIX "")
                set_target_properties(${library_name}utf8 PROPERTIES LIBRARY_OUTPUT_NAME ${library_name})
                set_target_properties(${library_name}utf8 PROPERTIES LIBRARY_OUTPUT_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/utf8)
        endif()
endfunction()
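For readers who want to try this, a minimal usage sketch follows. The function as posted sets output directories but contains no install() calls, so the install rules here are an assumption about how one might add them. The library name, source files, install destinations, and the mylib_all helper target are all hypothetical, and the sketch assumes the M language is already enabled in the project through the YDBCMake modules.

```cmake
# Hypothetical library name and M source files.
add_ydb_library(mylib SOURCES foo.m bar.m)

# The function creates a target named <name>M for the M mode library.
# The destination is an assumption, not a YottaDB convention.
install(TARGETS mylibM LIBRARY DESTINATION lib/mylib)

# <name>utf8 exists only when ydb_icu_version was set, so guard on it.
if(TARGET mylibutf8)
    install(TARGETS mylibutf8 LIBRARY DESTINATION lib/mylib/utf8)
endif()

# Other targets must depend on the suffixed names, not on "mylib" itself.
add_custom_target(mylib_all)   # hypothetical umbrella target
add_dependencies(mylib_all mylibM)
if(TARGET mylibutf8)
    add_dependencies(mylib_all mylibutf8)
endif()
```

Guarding with if(TARGET ...) keeps the calling script valid on systems without a UTF-8 installation, where only the M mode target is created.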

craig.scott · August 25, 2022, 9:43pm

Ah, nice. That’s much cleaner than going down the ExternalProject route. So clients of the library then have to choose which of the two libraries they link against? I guess that’s clear enough and makes explicit whether they are using the old legacy library or the new utf-8 one. My initial reaction would be “I asked to create library <XXX>, but I got <XXX>M and <XXX>utf8”. But I think once you understand that two libraries are created, that would be easy enough to adjust to.

shabiel · August 26, 2022, 1:53pm

Hello Craig,

Actually, the .so files are the end products, and we don’t expect other people to link to them, but people writing CMake scripts may need to depend on <XXX>M and <XXX>utf8. This actually caught me out until I realized what I had done.

Thank you for everybody’s help.

–Sam