Partial linking

Hi,

gcc (and probably other compilers) supports the concept of “partial linking” (see the -r switch). While admittedly somewhat unusual, it is quite handy when you want to conceal internals from the users of a library.

Let’s say you have a library L with an official (public) interface. This library is made of several object files O1, …, On and uses additional code implemented in other libraries S1, …, Sm which are part of the project as well.

In the normal case, a user would have to get all these libraries (L and S1, …, Sm) together with the header files containing the public interface of L. However, all the symbols related to internal interfaces (for instance those of S1, …, Sm) would then still be visible to the user and might reveal details that, for various reasons, are not meant for the user’s eyes.

This conflict can be avoided by partially linking all the object files, together with all required (static) libraries, into one large object file that no longer contains unresolved symbols. In a later step, that file can be cleaned of all symbols not needed for relocation, and the remaining internal symbols can be obfuscated, so that only the official symbols stay in place.
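The steps described above could be sketched with a custom command. This is only a rough illustration under several assumptions: a GNU binutils toolchain targeting ELF, `CMAKE_LINKER` pointing at an ld-compatible linker, and hypothetical names throughout (`l_api.c`, `l_impl.c`, `s1.c`, `public_symbols.txt`, `L_combined.o`). It makes internal symbols local and strips what it can; it does not obfuscate anything.

```cmake
cmake_minimum_required(VERSION 3.15)
project(partial_link_sketch C)

# Hypothetical sources: l_api.c and l_impl.c form library L, s1.c is internal support code.
add_library(l_objs OBJECT l_api.c l_impl.c)
add_library(s1 STATIC s1.c)

set(combined "${CMAKE_CURRENT_BINARY_DIR}/L_combined.o")
# Hypothetical text file listing one public symbol name per line.
set(public_syms "${CMAKE_CURRENT_SOURCE_DIR}/public_symbols.txt")

add_custom_command(
  OUTPUT "${combined}"
  # Step 1: relocatable (partial) link of L's objects plus the needed members of s1.
  COMMAND "${CMAKE_LINKER}" -r -o "${combined}" $<TARGET_OBJECTS:l_objs> $<TARGET_FILE:s1>
  # Step 2: make every global symbol that is not listed in the file a local symbol.
  COMMAND "${CMAKE_OBJCOPY}" "--keep-global-symbols=${public_syms}" "${combined}"
  # Step 3: ask strip to drop symbols that are not needed for relocation processing.
  COMMAND "${CMAKE_STRIP}" --strip-unneeded "${combined}"
  DEPENDS $<TARGET_OBJECTS:l_objs> s1 "${public_syms}"
  COMMAND_EXPAND_LISTS  # expand the object list into separate arguments
  VERBATIM
)

# Build the combined object as part of the default build.
add_custom_target(L_combined ALL DEPENDS "${combined}")
```

Checking the result with nm after the build is advisable, since the exact set of symbols that survive depends on the toolchain version.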

As this process needs to pull in all libraries we depend on, one would have to use something like add_executable() (with appropriate link flags for partial linking etc.) rather than add_library() to create the large object file. However, my impression is that this is rather like driving a screw with a hammer. (Furthermore, the resulting object file would always get an .exe suffix when building for Windows.)

Is there some recommended way to do this more or less cleanly using CMake?

Kind regards
Ingolf

CMake’s model for this is OBJECT libraries. Basically, the “library” is just a collection of .o files that are added to the link lines of its (direct) consumers. I’m not sure of the best reference for it, but the docs should be a good start at least.
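As a minimal sketch of what that looks like in practice (the target and file names `internals`, `app`, `internal_a.c`, `internal_b.c` and `main.c` are hypothetical, and linking to an OBJECT library this way assumes CMake 3.12 or newer):

```cmake
cmake_minimum_required(VERSION 3.12)
project(object_library_sketch C)

# Compiled once into object files; no archive or shared library is produced.
add_library(internals OBJECT internal_a.c internal_b.c)

# A direct consumer receives the object files of "internals" on its own link line.
add_executable(app main.c)
target_link_libraries(app PRIVATE internals)
```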

Note that not every platform supports partial linking in that way (IIRC, Windows probably gets confused if you don’t synchronize _EXPORT symbols between will-be-colocated targets), so it’d be hard to model in CMake.

@ben.boeckel, thank you for this suggestion. However, I do not yet see how OBJECT libraries will solve the situation: my understanding of the OBJECT library concept is that those libraries will contain a bunch of individual object files. These object files will still reveal internal symbols which need to be present when finally linking an executable with this library.

My goal was to have the complete functionality provided in one large object file which can then be handed over to a customer for integration into their executable (and maybe the customer does not even use CMake). That object file would surely have to define the symbols related to its public interface, but all the internal symbols could be eliminated (or at least obfuscated if needed for relocation purposes).

Even if this concept cannot easily be modeled using CMake, is it possible to suppress the .exe suffix from the actual file name generated by add_executable() when CMAKE_SYSTEM_NAME is Windows? I tried setting CMAKE_EXECUTABLE_SUFFIX to the empty string, but apparently this variable is used for reporting purposes, and modifications of that variable have no effect.

Kind regards
Ingolf
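On the suffix question, the `SUFFIX` and `OUTPUT_NAME` target properties control the file name of a target's output, including executables. A minimal sketch, assuming a GCC-style driver that accepts `-r` and using hypothetical names (`L_combined`, `l_api.c`, `l_impl.c`); whether an executable target is a good fit for partial linking at all is a separate matter:

```cmake
cmake_minimum_required(VERSION 3.15)
project(suffix_sketch C)

# Hypothetical target standing in for the partially linked output.
add_executable(L_combined l_api.c l_impl.c)

# -r requests a relocatable (partial) link from GCC-style drivers.
target_link_options(L_combined PRIVATE -r)

set_target_properties(L_combined PROPERTIES
  OUTPUT_NAME "L_combined"  # base name of the produced file
  SUFFIX ".o"               # replaces the platform default such as .exe
)
```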

At least for C++, the visible symbols of a shared object / DLL can be controlled. For gcc, there is -fvisibility=hidden. The library developer then explicitly exports symbols.

Not sure if that is applicable to C, though.
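In CMake, hidden-by-default visibility is usually expressed through target properties rather than raw flags. The following is a small sketch for a C++ shared library with hypothetical names (`L`, `l_api.cpp`, `l_impl.cpp`); a corresponding `C_VISIBILITY_PRESET` property exists for C sources.

```cmake
cmake_minimum_required(VERSION 3.15)
project(visibility_sketch CXX)

include(GenerateExportHeader)

add_library(L SHARED l_api.cpp l_impl.cpp)

# Compile with hidden default visibility (-fvisibility=hidden on GCC/Clang).
set_target_properties(L PROPERTIES
  CXX_VISIBILITY_PRESET hidden
  VISIBILITY_INLINES_HIDDEN ON
)

# Generates l_export.h in the build directory; it defines the L_EXPORT macro
# that the public declarations are then marked with.
generate_export_header(L)
target_include_directories(L PUBLIC "${CMAKE_CURRENT_BINARY_DIR}")
```

Only declarations marked with `L_EXPORT` in the public headers are then exported from the shared library.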

@hsattler, I believe that “visibility” in the “shared object / DLL” context refers to whether a symbol is externally visible, i.e. whether it can be looked up for linking purposes.

In my case, I’m intentionally dealing with static linking. Furthermore, I’d like to make sure that the non-public symbols are indeed eliminated / obfuscated, i.e. no longer reported by nm or similar tools (or hex dumps).

Kind regards
Ingolf

Yes, anything that is publicly linkable will need to have a predictable name. Using -fvisibility=hidden will prevent direct usage of any symbols not explicitly exported. I think you’re also interested in some kind of obfuscation, for which I suspect there are tools that can post-process your library and apply it to the internal symbols. But just being hidden should prevent direct usage from outside the library as well.

This is a separate question, but it sounds like an X/Y problem. What are you trying to achieve with such a change?

Library visibility is a platform thing; any ELF-targeting language can (well, should be able to) do it.

CMake has nothing to do with this; find an ELF obfuscator tool to post-process your binary (probably with a POST_BUILD rule).
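A POST_BUILD rule of that kind might look like the sketch below. No particular obfuscator is assumed here; `strip --strip-unneeded` from binutils is used purely as a stand-in for whatever post-processing tool is chosen, and the names `L`, `l_api.c` and `l_impl.c` are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.15)
project(post_build_sketch C)

add_library(L SHARED l_api.c l_impl.c)

# Runs every time L is (re)linked and rewrites the freshly built binary in place.
# Replace the strip invocation with the post-processing tool of your choice.
add_custom_command(TARGET L POST_BUILD
  COMMAND "${CMAKE_STRIP}" --strip-unneeded "$<TARGET_FILE:L>"
  COMMENT "Post-processing library L"
  VERBATIM
)
```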

> This is a separate question, but it sounds like an X/Y problem. What are you trying to achieve with such a change?

It is related to the problem in that add_executable() with appropriate flags seems to do the initial step of creating one large object file from the collection of individual small object files and required (static) libraries. What I’d like to achieve is inhibiting the .exe suffix which is appended when building for Windows – after all, the result does not resemble an executable.

An alternative question: is it possible to access and use individual “building blocks” of the add_executable() implementation (like for example collecting libraries, evaluation of properties, invoking the compiler/linker, …) in a way which allows something like "do the very same as add_executable() does but use file name XYZ instead of <name>.exe"?

> CMake has nothing to do with this; find an ELF obfuscator tool to post-process your binary (probably with a POST_BUILD rule).

Seconded. The hard thing seems to be to get the one large object file. (Recall that I do not deliver a complete executable and would very much like to avoid having to deliver a bunch of individual libraries.)

Kind regards
Ingolf

Then don’t use add_executable maybe? That makes executables. You want…something else.

You’re calling it “one large object file” but I really doubt you want that; instead you want a static library (which is just a container of objects) that is amassed from other static libraries. What you want sounds like this, but it can be approximated with OBJECT libraries being linked into a final STATIC library that you then distribute.
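A rough sketch of that approximation, with hypothetical target and file names (`s1_objs`, `s2_objs`, `L`, `s1.c`, `s2.c`, `l_api.c`, `l_api.h`): the internal code is built as OBJECT libraries and their objects are archived into the single STATIC library that gets shipped.

```cmake
cmake_minimum_required(VERSION 3.15)
project(static_amalgam_sketch C)

# Internal support code, compiled to objects only.
add_library(s1_objs OBJECT s1.c)
add_library(s2_objs OBJECT s2.c)

# The one library that is distributed; it archives its own objects
# together with those of the internal OBJECT libraries.
add_library(L STATIC
  l_api.c
  $<TARGET_OBJECTS:s1_objs>
  $<TARGET_OBJECTS:s2_objs>
)

# Ship only the archive and the public header.
install(TARGETS L ARCHIVE DESTINATION lib)
install(FILES l_api.h DESTINATION include)
```

Note that this yields a single file to deliver but does not by itself hide internal symbols; the archive members still list them.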