(Note: This question had been closed, citing that this had an answer. However, my question is not generic, I am asking why this works in ubuntu/redhat, but not in macos/cygwin. So I have edited this question, by modifying the title, mentioning the words macos and ubuntu.)

I have the following c++ code:

// main.cpp
#include<iostream>
#include<cstdio>
#include "defs.h" // has the function headers only

int func0(int a0) {
    printf("func0-%d\n", a0);
    return a0+1;
}
int func1(int a1) {
    int x;
    x=func0(a1);
    printf("func1-%d\n", x);
    return a1+1;
}
int func2(int a2) {
    int x;
    x=func1(a2);
    printf("func2-%d\n", x);
    return x+5;
}
int main() {
    func1(5);
    func2(8);
}

I can compile and run this code as:

g++ main.cpp; ./a.out

Now I would like to move the functions to different files (func1 to f1.cpp, func0 and func2 to f2.cpp, and main to main.cpp), and create shared libraries like this:

g++ -c -pipe -std=c++11 -fPIC main.cpp
g++ -c -pipe -std=c++11 -fPIC f1.cpp
g++ -c -pipe -std=c++11 -fPIC f2.cpp
g++ -shared -o libx1.so f1.o
g++ -shared -o libx2.so f2.o
g++ main.o -L. -lx1 -lx2 -o exe
export LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH
./exe
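
The split files themselves are not shown in the post. A minimal sketch of what they presumably contain follows, written as one listing with comments marking the file boundaries. The contents of defs.h are an assumption (the post only says it holds the function headers). Note that f1.cpp calls a function defined in f2.cpp and f2.cpp calls a function defined in f1.cpp.

```cpp
// Sketch only: three files shown as one listing; split at the markers.
#include <cstdio>

// ---- defs.h (assumed contents: declarations only) ----
int func0(int a0);
int func1(int a1);
int func2(int a2);

// ---- f1.cpp (goes into libx1.so); the real file would #include "defs.h" ----
int func1(int a1) {
    int x = func0(a1);          // func0 is defined in f2.cpp, so libx1 needs libx2
    printf("func1-%d\n", x);
    return a1 + 1;
}

// ---- f2.cpp (goes into libx2.so); the real file would #include "defs.h" ----
int func0(int a0) {
    printf("func0-%d\n", a0);
    return a0 + 1;
}
int func2(int a2) {
    int x = func1(a2);          // func1 is defined in f1.cpp, so libx2 needs libx1
    printf("func2-%d\n", x);
    return x + 5;
}
```

main.cpp keeps only main() from the original listing and includes defs.h.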

The above commands work on Red Hat Linux and Ubuntu. But when I run the same commands in other environments (e.g. macOS or Cygwin), I get errors while creating the shared libraries, like this:

g++ -shared -o libx1.so f1.o
    undefined reference to func0(int)
g++ -shared -o libx2.so f2.o
    undefined reference to func1(int)

Why is this error happening only in some OS versions, and not happening in redhat/ubuntu? Is it due to the gcc versions, or something to do with the OS?

(The above instructions work with g++ in redhat(gcc version 8.3.1) and ubuntu (9.4.0). It does not work with g++ in cygwin(11.3.0) and in macos(11.2.0).)

R71
  • You need to link with the both .o – 273K Sep 07 '22 at 15:45
  • @273K: So why doesn't this error occur in redhat? Is it due to the gcc version, or a redhat specific modification? – R71 Sep 08 '22 at 03:36
  • OK, the answer is hidden deep inside the answer https://stackoverflow.com/a/43305704/1089355, search for "Back in the day" in that answer. Short answer - different distros follow different linking rules. – R71 Sep 09 '22 at 09:24
  • @R71 Not really, the correct answer is not there (yet). Look [here](https://github.com/ziglang/zig/issues/8180) instead. – n. m. could be an AI Dec 29 '22 at 08:30
  • @n.m. Yes, that's exactly the problem. But from your link, I could not quite figure out what the solution is. Can you suggest the steps that I should take to make the above code work on macos? (and also upvote this question for reopening, since the great ones may have closed this in a tearing hurry.) – R71 Dec 29 '22 at 08:53
  • You can use the `-undefined dynamic_lookup` linker flag to mimic the Linux behaviour, but in your case the arguably correct solution for both systems is to (1) restructure the code such that there are no circular dependencies between the libraries and (2) link shared libraries against their dependencies, e.g. `g++ -shared -o libx2.so f2.o -L . -lx1`. – n. m. could be an AI Dec 29 '22 at 09:18
  • It may not be possible to remove all circular dependencies completely. E.g., libx1 may be for graph traversal, which may call functions in libx2 for machine learning, and libx2 may be using some of the data structures defined in libx1. So I might have to use your first solution. Can you please elaborate on that one? – R71 Dec 29 '22 at 09:24
  • BTW I am not sure if this is ever possible on cygwin (or on Windows in general). Windows DLLs are very different. – n. m. could be an AI Dec 29 '22 at 09:26
  • It is always possible to restructure. You may want to ask a separate question about this. – n. m. could be an AI Dec 29 '22 at 09:28
  • OK. Let's take one step at a time. First, your solution for macos. Then get this question reopened. Then let's hope somebody will post an answer for windows. – R71 Dec 29 '22 at 09:28
  • Apparently it is possible on Windows, see [this answer](https://stackoverflow.com/a/6164220/775806) or [this one](https://stackoverflow.com/a/54848623/775806). – n. m. could be an AI Dec 29 '22 at 09:36
  • Not sure what exactly to elaborate. Try adding `-Wl,-undefined,dynamic_lookup` to your compilation commands that build dylibs. BTW I added an answer about Mac OS X to the giant dupe. – n. m. could be an AI Dec 29 '22 at 09:41

1 Answer

The problem is caused by cyclic dependencies between the two libraries. Before doing anything else, you should ask yourself whether it is acceptable to have cyclic dependencies for your project. It is never a good idea, but if the alternative involves massive refactoring, it could be the lesser of two evils. Still, refactoring should probably be the default answer in most cases. If you cannot refactor, the rest of this answer is for you.
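
To give a sense of what such a refactoring could look like for the code in question, here is a minimal sketch. It assumes func0 can be moved into a third library of its own; the names f0.cpp and libx0.so are hypothetical and do not appear in the question. The dependencies then run in one direction only (libx2 depends on libx1, which depends on libx0), so each library can be linked against what it needs.

```cpp
// Sketch only: three files shown as one listing; split at the markers.
// Possible link lines, in dependency order (file names follow the question;
// f0.cpp / libx0.so are hypothetical):
//   g++ -shared -o libx0.so f0.o
//   g++ -shared -o libx1.so f1.o -L. -lx0
//   g++ -shared -o libx2.so f2.o -L. -lx1
#include <cstdio>

// ---- defs.h (assumed contents: declarations only) ----
int func0(int a0);
int func1(int a1);
int func2(int a2);

// ---- f0.cpp (hypothetical new file): depends on nothing ----
int func0(int a0) {
    printf("func0-%d\n", a0);
    return a0 + 1;
}

// ---- f1.cpp: depends only on f0.cpp ----
int func1(int a1) {
    int x = func0(a1);
    printf("func1-%d\n", x);
    return a1 + 1;
}

// ---- f2.cpp: depends only on f1.cpp ----
int func2(int a2) {
    int x = func1(a2);
    printf("func2-%d\n", x);
    return x + 5;
}
```

With this layout no library refers to a symbol from a library built after it, so the undefined-reference errors from the question should not arise at the `-shared` step.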

How are cyclic dependencies handled on different OSes?

On both Linux and Mac OS X (and on FreeBSD and on most commercial Unixes of old), references are resolved at load time. The loader uses the first suitable symbol definition it encounters, be it in the main executable, in the shared object itself, or in a different shared object. It is not known until load time where that definition will be found.

So when the executable from the question is loaded, the dynamic loader finds the definition of func1 in libx1 and the definitions of func0 and func2 in libx2, and all is well.

The difference between Linux and Mac OS X lies in the behaviour of the linker (ld). Both GNU ld and LLVM ld by default allow unresolved references when building a shared library. Mac OS X ld appears to be of a different breed: unresolved references are not allowed by default. One can either list the dependencies on the link line, or explicitly allow unresolved references using the Mac-specific ld option -undefined dynamic_lookup. But of course when the dependencies are cyclic, the first option is problematic. For the code in question:

g++ -shared -o libx1.so f1.o -Wl,-undefined,dynamic_lookup
g++ -shared -o libx2.so f2.o -Wl,-undefined,dynamic_lookup

Windows DLLs work very differently. Each symbol must be resolved at link time. Unlike the Unix-y loaders, the loader must know exactly which DLL to search for each imported symbol. There is no option to allow unresolved references in DLLs because there is no mechanism to resolve them at load time from an unknown source.

Windows still allows cyclic dependencies between DLLs, but the mechanism is a bit different. The linker must use separate import libraries in this case (they are usually optional when using GNU or LLVM toolchains). The linking is done in two phases. First, a .lib file is generated for each future .dll, and then the .dll files themselves are produced using the .lib files from the first stage. For the code in question:

# first stage
g++ -shared -Wl,--out-implib=x1.lib -o x1.dll f1.o
g++ -shared -Wl,--out-implib=x2.lib -o x2.dll f2.o

# second stage
g++ -shared -o x1.dll f1.o x2.lib
g++ -shared -o x2.dll f2.o x1.lib

The first stage will report undefined symbols but will still produce the .lib file needed for the second stage.

n. m. could be an AI
  • I feel this question is better serviced by a separate answer because the giant dupe doesn't adequately address the circular dependency question, and adding this answer there would be a bit of a stretch. – n. m. could be an AI Dec 29 '22 at 10:50
  • Thanks anyway! At least you saw fit to answer this question! I will keep experimenting on this. – R71 Dec 30 '22 at 12:34