
Background

We have a C++ library for communicating with devices. One instance of the library can communicate with only one device. The library also creates a log file when it is initialized. We now have a new project that needs to communicate with 30 devices at once.

Problem

This new project is written in C++20 and will run on Linux. We want to load 30 instances of this library into the program, so I have made 30 copies of the .so file, which has no hard linking, and named them incrementally: mylib_0.so, mylib_1.so, etc.

There are other questions similar to this one, such as:

  1. Loading multiple copies of a shared library
  2. Load multiple copies of a shared library

Both suggest using

dlmopen(LM_ID_NEWLM, "/path/to/library.so", RTLD_NOW);

as the solution, but when I try this, it works for the first 10 loads and then stops. I know this because each library creates its own log file when it is initialized correctly, and I only see 10 log files. The dlmopen call also returns null after the 10th call.

I also tried using the RTLD_LOCAL flag in the dlopen call but that makes it always fail.

Details

The architecture of this new project is as follows: a lib loader class loads all the API calls of the library, and a master creates instances of this loader. There is no dedicated thread per loader, because the libraries already use threads internally and, with so many of them loaded, we want to conserve thread resources as much as possible.
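
For readers who want something concrete, the loader described above could look roughly like the sketch below. The class name LibLoader and the member symbol() are made up for illustration and are not the actual project code; only dlopen, dlsym, dlerror and dlclose are real calls.

```cpp
#include <dlfcn.h>
#include <stdexcept>
#include <string>

// Hypothetical loader: owns one dlopen handle and resolves API calls from it.
class LibLoader {
public:
    explicit LibLoader(const std::string& path)
        : handle_(dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL)) {
        if (!handle_) {
            const char* err = dlerror();
            throw std::runtime_error(err ? err : "dlopen failed");
        }
    }
    ~LibLoader() {
        if (handle_) dlclose(handle_);
    }
    LibLoader(const LibLoader&) = delete;
    LibLoader& operator=(const LibLoader&) = delete;

    // Looks up one exported function and casts it to the requested pointer type.
    template <typename Fn>
    Fn symbol(const char* name) const {
        dlerror();  // clear any stale error before the lookup
        void* sym = dlsym(handle_, name);
        if (const char* err = dlerror()) throw std::runtime_error(err);
        return reinterpret_cast<Fn>(sym);
    }

    void* handle() const { return handle_; }

private:
    void* handle_;
};
```

The master would then hold one LibLoader per device, e.g. LibLoader("/path/to/mylib_0.so"), and call the resolved function pointers directly, so no extra thread per loader is needed.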

Question

How can I load 30, and maybe more, instances of the same .so file?

Update 1

As suggested by Jakob Stark, I used dlerror and got this message from dlmopen on the 10th call: /lib/x86_64-linux-gnu/libc.so.6: cannot allocate memory in static TLS block

Also, it turns out I was using dlopen wrong. I now do this:
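
A minimal sketch of how the error text can be captured, assuming glibc on Linux. The function name open_in_new_namespace is made up for this example; dlmopen, LM_ID_NEWLM and dlerror are the real glibc names.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // dlmopen and LM_ID_NEWLM are GNU extensions
#endif
#include <dlfcn.h>
#include <string>

// Tries to load `path` into a fresh link-map namespace.
// On failure returns nullptr and copies the dlerror() text into `error`.
void* open_in_new_namespace(const std::string& path, std::string& error) {
    dlerror();  // clear any earlier error
    void* handle = dlmopen(LM_ID_NEWLM, path.c_str(), RTLD_NOW);
    if (!handle) {
        const char* msg = dlerror();
        error = msg ? msg : "unknown error";
    }
    return handle;
}
```

Printing `error` after each failed call is what surfaces a message like the one above, instead of just seeing a null handle.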

dlopen("/path/to/library.so", RTLD_NOW | RTLD_LOCAL);

If I do this, dlopen seems to be called 30 times, but I only see 1 log file with all 30 entries in it. I presume dlopen recognises that the .so is already loaded and returns the existing handle, so only one .so is actually loaded.

Is it possible to make the .so files different enough that dlopen treats them as different libraries?
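
The presumption about handle reuse can be checked directly instead of being inferred from log files. Below is a small sketch; the helper name reuses_handle is invented for illustration.

```cpp
#include <dlfcn.h>

// Returns true when two dlopen calls on the same path yield the same handle,
// i.e. the loader reused the object that was already loaded.
bool reuses_handle(const char* path) {
    void* first = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    void* second = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    bool same = first != nullptr && first == second;
    if (second) dlclose(second);
    if (first) dlclose(first);
    return same;
}
```

Calling reuses_handle("/path/to/library.so") with the same path twice is expected to return true, since dlopen only bumps a reference count for an object that is already loaded.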

pSquared
  • So you have global state in a shared library. This is generally a bad idea. Have you tried to refactor the library instead? – Jakob Stark Jul 21 '22 at 08:43
  • This is the idea long term, but for now we are on a short deadline and don't have time to refactor the lib to be able to use multiple devices. – pSquared Jul 21 '22 at 08:44
  • Can you try to print the reason why `dlopen` fails? There is a function named [`dlerror`](https://linux.die.net/man/3/dlopen) that gives you the most recent error of `dlopen` as a human-readable string. – Jakob Stark Jul 21 '22 at 08:45
  • Oh, good idea. For the dlmopen solution, the error is "/lib/x86_64-linux-gnu/libc.so.6: cannot allocate memory in static TLS block" – pSquared Jul 21 '22 at 08:52
  • I suspect, that using `dlmopen` instead of `dlopen` is not what you really want. According to [this answer](https://stackoverflow.com/a/33612509/17862371) `dlmopen(LM_ID_NEWLM,...)` is *completely* self contained. That means that each library instance also gets its own copy of `libc.so.6`. I could imagine that there is a limit in the number of glibc instances that you can load... – Jakob Stark Jul 21 '22 at 09:45
  • You already mentioned that you created 30 differently named copies of the library. You should be able to load each of them using `dlopen` (the version without the `m`) and get different instances of the library. You only need `dlmopen` if you want to load the exact same library file multiple times. Opening 30 different instances into the global namespace (using `dlopen`) is probably much more performant and does not give you the glibc error that you found when using `dlmopen(LM_ID_NEWLM,...)`. Could you try that? – Jakob Stark Jul 21 '22 at 12:40
  • By the way, you don't need real copies for that to work. Hardlinks will suffice, which saves some space on the disk and makes it easy to update the library... – Jakob Stark Jul 21 '22 at 13:12
  • To me it seems there must be some extra problem hidden here. Why do you need to load the same library multiple times? What is the reasoning? Does it use global state? Do you control this library (can you alter it and fix its code)? – Marek R Jul 21 '22 at 13:55
  • OK, replying to jakob-stark: I updated my question with some more info about what I tried. Using `dlopen` runs and doesn't complain, but the problem is that I see only 1 log file instead of 30, and that 1 log file contains the init entries of all 30. To me this seems like `dlopen` is not opening more libs but reusing the same one. – pSquared Jul 21 '22 at 14:02
  • Replying to Marek R: I need to load the same library 30 times because 1 library can communicate with 1 physical device and this project needs to communicate with 30 at the same time. I'm not sure what global state is, I will read up on it. Yes, we made this library, but it's highly complex, and making it work with 30 devices would take too long and make us miss our deadline for this project. – pSquared Jul 21 '22 at 14:05

2 Answers


The function dlmopen(LM_ID_NEWLM,...) loads the shared library into a new namespace every time. That means that each new instance of the library pulls in its own copies of standard libraries such as libc.so. I could not find out why exactly 10 was the limit in your case, but the manual says:

The glibc implementation supports a maximum of 16 namespaces.

Your error message suggests that something else went wrong first, namely with the thread-local storage, but we can already conclude that using dlmopen with new namespaces is not the way to go here.


That leaves us with the plain dlopen() function. If one tries to load the same library twice, one gets the same handle twice, which is generally a good thing, as a shared library is meant to be shared. It is however not what we want here.

The most promising solution I can come up with is loading the same library under different names. Symlinks won't work because dlopen() follows them and ends up at exactly the same file. The trick here is to use hardlinks. If your library is called mylib.so, you can make several hardlinks using ln:

for i in {0..29}; do ln mylib.so mylib$i.so ; done

Now in your program you open them all by using e.g.

dlopen("mylib0.so", RTLD_NOW);

You will get different instances in the same namespace, with almost no additional disk space used by the different hardlinks.
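
To confirm that the loads really are separate instances, the returned handles can be compared rather than relying on log files. The sketch below uses made-up helper names (load_copies, distinct_instances) and assumes the mylib0.so ... mylibN.so naming from above. One caveat that I have not verified here: as far as I know, glibc also compares device and inode numbers when deciding whether an object is already loaded, in which case hardlinks would resolve to one shared handle and real copies of the file would be needed. The handle count makes that visible either way.

```cpp
#include <cstddef>
#include <dlfcn.h>
#include <set>
#include <string>
#include <vector>

// Loads prefix + "0.so" ... prefix + "<count-1>.so".
// A nullptr entry marks a load that failed.
std::vector<void*> load_copies(const std::string& prefix, int count) {
    std::vector<void*> handles;
    for (int i = 0; i < count; ++i) {
        const std::string path = prefix + std::to_string(i) + ".so";
        handles.push_back(dlopen(path.c_str(), RTLD_NOW));
    }
    return handles;
}

// Number of distinct non-null handles. It equals the requested count
// only if every file was loaded as its own instance.
std::size_t distinct_instances(const std::vector<void*>& handles) {
    std::set<void*> unique(handles.begin(), handles.end());
    unique.erase(nullptr);
    return unique.size();
}
```

For 30 devices, distinct_instances(load_copies("/path/to/mylib", 30)) should come out as 30. A smaller number means some names were mapped to the same loaded object or failed to load.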

Jakob Stark
  • Thank you for your suggestion. I have tried running that code snippet and making the hard links, but alas I still get 1 log file generated with 30 init entries. I should be getting 30 log files with 1 init entry each if there are indeed 30 instances of this lib loaded. – pSquared Jul 22 '22 at 06:23
  • Looking at log files is not a reliable way of checking whether there is one or more instances of the library. What if all the library instances write to the same log file? Anyway, you can check the return values of `dlopen` (e.g. print them out) to see if the same handle is returned every time. – Jakob Stark Jul 22 '22 at 08:33
  • @pSquared do you actually load the different library copies with their different names? Like `dlopen("mylib0.so,...)` followed by `dlopen("mylib1.so,...)`, etc.? – Jakob Stark Jul 22 '22 at 08:37
  • That's a good point. I realized all the instances would use the same id of our logging library, so that's probably why they all log to the same log file, because the memory space is shared. I will test by making a few .so files with different ids. In the meantime, I have a question about this shared space of the libs: if they all open a TCP socket on the same address and port number, will there be conflicts between them, so that only one device can connect on that port? – pSquared Jul 22 '22 at 08:43
  • No, I have a map and a function that checks which lib index is not loaded yet and chooses that as the next one to be loaded, passing the suffix into a function that builds the full name of the lib and attempts to load it. – pSquared Jul 22 '22 at 08:44
  • @pSquared concerning the TCP sockets: Yes, you will probably only be able to open a single socket on an address+port combination at a time. Do you actually need the port to be hard coded? Usually a TCP client lets the OS choose an unused port... – Jakob Stark Jul 22 '22 at 08:51
  • OK, incrementing the logging id does indeed generate a unique log file for that lib. And `dlopen` does return a different handle for each one, so I think this is working. What's interesting is that I didn't need to make the hard links; this apparently just worked and I got confused and thought it didn't. Thank you for your continued help and input! Looks like I've got some more work to do on this lib to make it use less global state. – pSquared Jul 22 '22 at 09:21

One common solution to this type of problem is to wrap the library in a separate helper process, which communicates with the main process via IPC. On Linux you can probably use popen for that.

On the C++ side, you replace the C++ object representing the device with a stub class implementing the same interface. Since this stub just holds an IPC handle, there's no problem in having 30 copies.
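
A rough sketch of what such a stub might look like with popen. The class name DeviceStub and the idea that the helper prints one status line per event are assumptions made for this example, not part of any existing interface.

```cpp
#include <stdio.h>
#include <stdexcept>
#include <string>

// Hypothetical stub: starts one helper process per device and reads its
// output through a popen pipe. The real library lives only in the helper.
class DeviceStub {
public:
    explicit DeviceStub(const std::string& command)
        : pipe_(popen(command.c_str(), "r")) {
        if (!pipe_) throw std::runtime_error("popen failed");
    }
    ~DeviceStub() {
        if (pipe_) pclose(pipe_);
    }
    DeviceStub(const DeviceStub&) = delete;
    DeviceStub& operator=(const DeviceStub&) = delete;

    // Reads one line from the helper; returns an empty string at end of output.
    std::string read_line() {
        char buffer[256];
        if (!fgets(buffer, sizeof buffer, pipe_)) return {};
        std::string line(buffer);
        if (!line.empty() && line.back() == '\n') line.pop_back();
        return line;
    }

private:
    FILE* pipe_;
};
```

Note that a popen pipe is one-directional (read or write, chosen by the mode string), so a real stub that both sends commands and receives replies would more likely need a pair of pipes or a socket. The point of the sketch is only that each stub owns a process handle, and the library's global state stays inside the helper.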

MSalters
  • I am unfamiliar with processes. Is this basically threads? Would spawning threads in the helper to init the lib accomplish the same thing as processes? I will look more into this. – pSquared Jul 22 '22 at 06:31
  • Well, each process has its own thread(s), but also its own virtual memory, etc. That separation means the libraries don't see each other. – MSalters Jul 22 '22 at 07:09