Compiling Shared Library Using External Library, and then Using it on another system.

I’m compiling a project on system A using an SDK from a service. My shared library compiles fine with this SDK, but when I move the library to system B, it crashes the program that loads it.

For further context:
System A:

  • uses packages a & b from the SDK

System B:

  • uses another shared library with packages b & c from the SDK
  • I want to avoid having to include package a on this system

Right now I’m seeing 2 different crashes at consistent points.

  1. Crashes in code that never crashed before and is completely unrelated to the SDK. When I recompile the library without the SDK, that code path works 100% fine again.
  2. Crashes that show the stack related to package c which I have not touched at all (and again this package I do not include in system A during build as it is not needed in my library).

I’m trying to figure out why this new library crashes my program. When I isolate the library in another program, it works fine: no crashes, and it functions as expected. But when I load it into this existing program, it causes crashes both related and unrelated to the SDK. I think it may be related to how I am compiling things, or potentially a clash between the libraries? Any insight is very appreciated!

This is probably not related to CMake at all.
I’d suspect that somehow you use 2 different versions of some libraries (maybe the SDK itself) on those systems and the libraries are not properly versioned (SONAME), which leads to ABI incompatibility.

Your code may be working by coincidence, but it may already have symbols from the SDK, which may be incompatible with the version you have on system B. Or something more complicated.
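One way to test the overlapping-symbols theory is to compare what each shared library exports. The sketch below assumes GNU binutils `nm` is available; the helper names `defined_symbols` and `shared_symbols`, and the library file names in the usage comment, are made up for illustration.

```shell
# Hypothetical helpers for spotting symbols exported by two shared libraries.

# Read `nm -D --defined-only` output on stdin and print the symbol names,
# sorted and de-duplicated. Input lines look like: "0000000000001139 T sdk_init".
defined_symbols() {
    awk 'NF >= 3 { print $3 }' | sort -u
}

# Print the names that appear in both sorted symbol lists ($1 and $2).
shared_symbols() {
    comm -12 "$1" "$2"
}

# Usage (library names are placeholders):
#   nm -D --defined-only libmylib.so     | defined_symbols > mine.txt
#   nm -D --defined-only libother_b_c.so | defined_symbols > other.txt
#   shared_symbols mine.txt other.txt
```

Any SDK symbol that shows up in both lists is a candidate for the kind of clash described above.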

Check all the versions first. Also check the output of ldd for any libraries without version suffixes, because those can also be the culprit.
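The suffix check can be scripted. This is a minimal sketch that only looks at the name column of ldd output; `unversioned_libs` is a hypothetical helper name, and the library path in the usage comment is a placeholder.

```shell
# Hypothetical helper: read ldd output on stdin and print the names of
# libraries that end in plain ".so", i.e. carry no version suffix.
unversioned_libs() {
    awk '$1 ~ /\.so$/ { print $1 }'
}

# Usage:
#   ldd ./libmylib.so | unversioned_libs
```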

Given what you’ve posted, I’ve now aligned the version used from the SDKs and I’ve checked the library version suffixes, but I’m still seeing this issue.

From here, I want to compile my library with the shared SDK library b linked in, and then, when I move the compiled library to system B, have it link against the library b on that system. However, when I check ldd, I don’t see library b linked at all.

Strangely enough, the other libraries that I did specify I do see linked under ldd. For context my cmake file I’m using is something along the lines of:

find_package(sdk_lib_a CONFIG REQUIRED)
find_package(sdk_lib_b CONFIG REQUIRED)
add_compile_options(-g -fPIC -fno-strict-aliasing -DNDEBUG -DSQLUNIX -D_REENTRANT -DOPENSSL_1_1_0 -DCRYPT_OPENSSL_SUPPORTED)
add_library(lib_name SHARED source_files)
target_link_libraries(lib_name PRIVATE lib_z lib_x lib_y sdk_lib_a sdk_lib_b)

and then my output for ldd would be something along the lines

lib_z => ...
lib_x => ...
lib_y => ...
...
other_libs

with an absence of entries for `sdk_lib_a` and `sdk_lib_b`.

I’ve tried using PUBLIC as well, and with -DBUILD_SHARED_LIBS=OFF and ON. I also have -DCMAKE_BUILD_TYPE=Release, but I don’t think that would have any effect on things. Any input on how to achieve this would be greatly appreciated!
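A cross-check for the ldd output: the dependencies recorded directly in the library are the NEEDED entries printed by `readelf -d`. If the SDK libraries are missing there too, they were probably not linked as shared objects. A minimal sketch, assuming binutils is installed; `needed_libs` is a hypothetical helper name and the file name in the usage comment is a placeholder.

```shell
# Hypothetical helper: read `readelf -d` output on stdin and print only the
# library names from the NEEDED entries, e.g.
#   0x0000000000000001 (NEEDED)   Shared library: [libz.so.1]
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}

# Usage:
#   readelf -d ./liblib_name.so | needed_libs
```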

If you are compiling on one system and moving the binary to another system where ldd fails to find your libraries, you either compiled for a different architecture (ARM vs. x86?) or you do not have the libraries properly installed.

Look at this post.in

Ass-U-Me-ing you installed your libraries in standard system locations either you or the installer failed to run ldconfig to update known system libraries.

The other place people wizzy-puffle this up is installing libraries in non-standard locations. They Ass-U-Me it will “just work.” No, it won’t.

/etc/ld.so.conf.d/

You have to have a file there containing the path to the library prior to running ldconfig under sudo or as root.
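A minimal sketch of that step. The helper name `write_ld_conf`, the drop-in name `sdk`, and the path `/opt/sdk/lib` are placeholders, not anything from the actual SDK; the third argument exists only so the helper can be tried against a scratch directory first.

```shell
# Hypothetical helper: write a drop-in file naming a library directory.
#   $1 = directory that contains the .so files
#   $2 = base name for the drop-in file
#   $3 = conf directory (defaults to /etc/ld.so.conf.d)
write_ld_conf() {
    conf_dir=${3:-/etc/ld.so.conf.d}
    printf '%s\n' "$1" > "$conf_dir/$2.conf"
}

# Typical use, as root:
#   write_ld_conf /opt/sdk/lib sdk
#   ldconfig                    # rebuild the loader cache
#   ldconfig -p | grep sdk      # confirm the libraries are now known
```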

This is not a CMake problem. This is building on one system and trying to run a non-packaged binary on a different system.

Take a good look at that control.in file. The reason people endure the pain of creating a debian (or RPM) package for their own binaries is for exactly this reason. The package won’t install unless all of the dependencies are properly installed.

Don’t build an executable on one system and copy it to another expecting it to “just work.”

If you aren’t running on a Linux distro that forced out new Fuse libraries (glaring at you Ubuntu!) you can build an AppImage. I always build AppImages from my .deb packages though. Look at RedDiamond project for how to go about doing that.

Hi, it’s not that the other system can’t find the library when I run ldd; it’s that the library doesn’t appear in the ldd output at all.

i.e: I have library lib_z, lib_y and lib_a, and when I run ldd I see

lib_z => <path to lib_z>
lib_y => <path to lib_y>

- end of output -

And I don’t see any entries for lib_a.

What I want is during compilation, lib_a to be dynamically linked, so that I can move it onto the other system so it can find the library there.

However, I’m unsure if it’s because of my cmake settings or it has to do something with how I’ve installed the SDK (using vcpkg) which is not linking the package dynamically.

note: idk if my use of dynamic linking is 100% proper here as I’m new to this so sorry if it’s being used in the wrong context and is causing confusion.

Let’s go a step back maybe. Are you sure that find_package finds the dynamic libraries and not the static ones?
Can you share the generated compilation/link lines? Maybe the answer is visible there.
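To read the link line quickly, something like the following can sort its inputs into static archives and shared objects. It is a sketch under assumptions: `classify_link_inputs` is a hypothetical helper, the build directory name is a placeholder, and only full paths ending in `.a` or `.so` are classified (`-lfoo` style arguments are ignored). As far as I know, vcpkg’s default x64-linux triplet builds static libraries, so `.a` paths under the vcpkg tree would explain the missing ldd entries.

```shell
# Hypothetical helper: read a link command on stdin and label each library
# input as a static archive (.a) or a shared object (.so, .so.N, ...).
classify_link_inputs() {
    tr -s ' ' '\n' | awk '
        prev == "-o"        { prev = $0; next }      # skip the output file name
        /\.a$/              { print "static: " $0 }
        /\.so(\.[0-9]+)*$/  { print "shared: " $0 }
                            { prev = $0 }
    '
}

# Usage (build directory is a placeholder):
#   cmake --build build --verbose 2>&1 | grep -- '-shared' | classify_link_inputs
```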

Okay, I’m reading through this again. Your description has some fundamental problems.

Notably, if you don’t have sdk_lib_a on System B, it cannot actually be linked.

Secondary libraries do not generally show up with ldd. What I mean is if lib_a is used by lib_z and lib_y but not directly used by the main program, it may not show up.

So. Using these generic names, without seeing actual code, etc. will make this impossible to troubleshoot. Let’s start with the fundamentals.

What operating system AND VERSION are running on System A and System B? Don’t just say “linux.” Be specific.

Get yourself a copy of this book. It’s available from Barnes & Noble. Install and configure Emacs the way it says. Don’t just phi-slamma-jamma an install of your own. There are some configuration steps as well as some Elpa and Melpa packages that help with this. You need to read the debugging section twice. No, I’m not going to cut & paste it here for you.

Once you have a proper Emacs installation and configuration, read these two posts.

The exact linux distro must be known. I don’t cover every distro in the “finding post.” If you are running something off-the-wall, you are going to have to ask the keepers of the dark lore for that distro how to enable dump files. You also need to compile and install everything for Debug.

I run into this a lot with Noobs and people that didn’t go to college for Software Engineering. Not trying to put you down, just trying to explain. Random crashes require dump files and a debugger that will work with dump files. Period. There is no quick fix and nothing can be cut & pasted from StackOverflow to solve the problem.
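As a rough starting point for the dump-file part, the sketch below checks whether the current shell allows core files at all. It assumes a Linux system with GDB installed; where core files actually end up varies by distro, and `core_limit_status` and the program name are hypothetical.

```shell
# Hypothetical helper: turn the value printed by `ulimit -c` into a verdict.
core_limit_status() {
    case "$1" in
        0) echo "core dumps disabled" ;;
        *) echo "core dumps enabled (limit: $1)" ;;
    esac
}

# Usage, in the shell that will launch the crashing program:
#   core_limit_status "$(ulimit -c)"
#   ulimit -c unlimited                  # raise the limit for this session
#   cat /proc/sys/kernel/core_pattern    # where the kernel sends core files
#   gdb ./program ./core                 # load the dump, then `bt` for a backtrace
```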

I suspect the reason we are using generic names and that you don’t want to just install the exact same libraries on both systems is the fact it costs money for one or more of the libraries. Perhaps it is even Qt which has an “everybody must buy a license” definition of OpenSource? If so, this story will be even more relevant.

Story:

Many years ago, a pimp I liked working with presented me to a new client of theirs. Some SOC/SOM manufacturer wanting to basically create a serial terminal program for the toolkit of their shiny new dev board. I had done many Qt contracts, mostly medical devices, for said pimp so they knew my skill level.

During the phone interview I basically told the MBA running the group to go pound sand.

They wanted to use “one thing” from Qt. His “lead developer” had a serial port library he “had used successfully in other Windows programs” and there were a couple of other libraries they were trying to mesh together.

I told them they couldn’t “sprinkle in” a pinch of Qt like salt into soup. They told me I was wrong. I told them it was an application framework. The startup had global data and the entire library needed the Main Event Loop created by QApplication. Their Windows serial port library was using the Windows Main Event Loop not the Qt Main Event Loop. They would have to write a Qt application and use the QSerialPort library from the playground. (I don’t think Qt officially had a serial port class then.)

Keep in mind I hadn’t seen any of their code. No, I didn’t take the contract. You don’t ask for help then tell me I’m wrong when I tell you how to fix it. If you knew how to do it you wouldn’t be asking for help.

What happens with those new to software development or who didn’t get a Software Engineering degree is the C/C++ standard libraries lull them into a sense of “I can pick and choose what I want.” With the compiler provided libraries, this is true . . . mostly.

All third party application frameworks have global data that must be initialized by the library startup code. Most of them will have internal structures like their Main Event Loop or some communications area as well. When you pull a class/function/piece out of such a library you may be “lucky” enough for it to bring the global data structures along for the ride, but you have bypassed the initialization. Random crashes happen because there will be a pointer with a bizarre value in that global data.

I suspect when you compile everything in Debug, install it, the random crashes will be dramatically reduced if not disappear entirely. Why? Most debug compilations of C/C++ code initialize pointers to NULL. This means they won’t have a bizarre value pointing to the middle of the RAID firmware or past the end of physical memory. The code itself probably defends against NULL, but not uninitialized.

You can test that while you are waiting for your book to arrive and learning how to use GDB.

You aren’t blending in QML or some other JavaScript-like engine, are you? If so, read this

I pray you aren’t trying to use “Smart Pointers.”

I currently believe your focus on ldd output is a Red Herring. Your project has a much more fundamental problem. These steps should help you find it.