I'm using dlsym() in C and want to know whether its return value should be cast explicitly or whether the implicit conversion is correct. Here is the function:

double (*(compile)(void))(double x, double y)
{
    if (system("scan-build clang -fPIC -shared -g -Wall -Werror -pedantic "
           "-std=c11 -O0 -lm foo.c -o foo.so") != 0) {
        exit(EXIT_FAILURE);
    }

    void *handle;
    handle = dlopen("./foo.so", RTLD_LAZY);
    if (!handle) {
        printf("Failed to load foo.so: %s\n", dlerror());
        exit(EXIT_FAILURE);
    }

    foo f;
    f = (foo)dlsym(handle, "foo");
    if (!f) {
        printf("Failed to find symbol foo in foo.so: %s\n", dlerror());
        exit(EXIT_FAILURE);
    }
    return f;
}

The function compile() takes no arguments and returns a pointer to a function that takes two doubles and returns a double. It first uses system() to compile a shared object, foo.so, and then opens foo.so with dlopen(). Finally, dlsym() looks up the symbol foo in foo.so, and I cast the result to the type foo, which I defined in a header as:

typedef double (*foo)(double, double);

Do I have to cast dlsym()?

lord.garbage
  • I should point out that I found the man page for `dlsym()` a bit opaque in this regard but if someone can enlighten me on the manpage that would also help. – lord.garbage Jul 20 '15 at 21:46
  • "The 2013 Technical Corrigendum to POSIX.1-2008 (a.k.a. POSIX.1-2013) improved matters by requiring that conforming implementations support casting 'void *' to a function pointer. Nevertheless, some compilers (e.g., gcc with the '-pedantic' option) may complain about the cast used in this program." When I compile without a cast then currently `gcc` and `clang` complain. – lord.garbage Jul 20 '15 at 21:50
  • Unless the compiler complains, I would not cast it. If it is complaining, just cast it, as I see no harm in doing so. – Pradheep Jul 20 '15 at 21:54
  • @Pradheep: Casting an object pointer to a function pointer or vice versa is _undefined behaviour_. So, there is definitively potential for harm. – too honest for this site Jul 20 '15 at 22:40
  • The way I see it, `dlsym`, as I have used it, has always returned the requested function, since you know that function is present in your .so. The only thing that could go wrong is the function signature, and that could be checked in a unit test. – Pradheep Jul 20 '15 at 22:54
  • Besides, I did not downvote the question. – Pradheep Jul 20 '15 at 22:55
  • IIRC an old proposal to some POSIX standard defined some `dlfsym` for that exact purpose (but that was not accepted). – Basile Starynkevitch Jul 12 '17 at 19:18
  • BTW, I suggest instead to `typedef` *signatures*, like [here](https://stackoverflow.com/a/9143434/841108) – Basile Starynkevitch Jul 12 '17 at 19:20

2 Answers

The C standard is written to assume that pointers to different object types, and especially pointers to functions as opposed to pointers to objects, might have different representations. That's why, in general, you don't want to intermix pointers; if you do, a modern compiler will warn you, and if you want to silence the warnings you typically use an explicit cast.

dlsym, on the other hand, by its very existence assumes that all pointers are pretty much the same, because it's supposed to be able to return you any pointer to any datatype -- object or function -- in your object file.

In other words, code that uses dlsym is inherently not fully portable: it is portable "only" to those machines where all pointers are safely interconvertible. (Which is of course virtually all of the popular machines today.)

So, yes, you'll need to cast the pointers to silence the warnings, and in doing so you may be making your code less portable to machines where all pointers aren't the same (and where the warnings, if unsilenced, would correctly inform you that your code won't work), but dlsym is never going to work (or even exist in its current form) on those machines, anyway.

(And if gcc -pedantic warns you about even an explicit cast from void * to a function pointer type, there's not much you can do except switch to a different version of gcc, or of course not use -pedantic.)
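One further option, raised in a comment further down, is to switch the pedantic diagnostic off for just the line containing the cast. The following is a minimal sketch, assuming a gcc or clang version that accepts `#pragma GCC diagnostic` with `-Wpedantic`; the typedef `abs_fn` and the helper `call_abs_via_dlsym` are hypothetical names, and looking up `abs` through `dlopen(NULL, ...)` is only a stand-in for a symbol from your own shared object.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical typedef for this sketch: the signature of abs(). */
typedef int (*abs_fn)(int);

/* Looks up "abs" via dlsym() and calls it.
 * Returns -1 if the handle or the symbol cannot be obtained. */
int call_abs_via_dlsym(int x)
{
    /* A NULL filename asks for a handle to the main program, whose
     * global scope also includes the libraries it was linked against. */
    void *handle = dlopen(NULL, RTLD_LAZY);
    if (handle == NULL) {
        return -1;
    }

    /* Silence the -pedantic warning for this one conversion only. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpedantic"
    abs_fn f = (abs_fn)dlsym(handle, "abs");
#pragma GCC diagnostic pop

    int result = (f != NULL) ? f(x) : -1;
    dlclose(handle);
    return result;
}
```

The push/pop pair keeps `-pedantic` in force for the rest of the translation unit, so other object/function pointer mix-ups are still reported.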


Addendum: My answer made it sound like converting between pointers to different types of data might be an issue, but that's generally no problem. Type void * is well defined to be the generic data pointer: it's what malloc returns, and it's defined to be quietly convertible to any object pointer type -- that is, you're not supposed to even need a cast. So it's almost a fine choice for the return type of dlsym, except for the wee problem of function pointers. malloc never has this problem (you'd hardly ever try to malloc a pointer to a function), while dlsym always has this problem (the symbols you're generally trying to access in dynamically-loaded object files are code at least as often as they're data). But function pointers are what void * isn't guaranteed to convert to, so you're very likely to get warnings, which is why you need the casts, and you might get warnings under -pedantic even with the casts.

Steve Summit

dlsym() returns a void* value. This pointer value can refer either to an object or to a function.

If it points to an object, then no cast is necessary, since C defines an implicit conversion from void* to any pointer-to-object type:

int *ptr = dlsym(handle, "name");
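As a slightly fuller sketch of the object case, the helper below (a hypothetical name, `find_int_object`) relies on nothing but the implicit conversion from void*. In the usage shown afterwards, the symbol `optind` is an assumption chosen only because the C library normally exports it as an `int` object; any `int` variable exported by your own library would do.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Returns a pointer to the int object called `name`, or NULL if the
 * symbol is not found. No cast is needed: void * converts implicitly
 * to any pointer-to-object type. */
int *find_int_object(void *handle, const char *name)
{
    int *ptr = dlsym(handle, name);
    return ptr;
}
```

For instance, with `void *handle = dlopen(NULL, RTLD_LAZY);`, a call such as `find_int_object(handle, "optind")` would be expected to yield a non-null pointer on a typical dynamically linked system. It is still up to the caller to know that the symbol really is an `int`.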

If it points to a function (which is probably much more common), there is no implicit conversion from void* to any pointer-to-function type, so a cast is necessary.

In standard C, there's no guarantee that a void* value can meaningfully be converted to a function pointer. POSIX, which defines dlsym(), implicitly guarantees that the void* value returned by dlsym() can be meaningfully converted to a pointer-to-function type -- as long as the target is of the correct type for the corresponding function.

Assuming we're dealing with a void function with no parameters:

typedef void (*func_ptr_t)(void);
func_ptr_t fptr = (func_ptr_t)dlsym(handle, "name");
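A somewhat more complete sketch of the same cast, combined with the error-checking idiom that POSIX describes for dlsym(): clear any pending error with dlerror(), call dlsym(), then call dlerror() again. The typedef `abs_fn` and the function `find_abs` are hypothetical names, and `abs` merely stands in for whatever function your shared object exports.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical typedef for this sketch: the signature of abs(). */
typedef int (*abs_fn)(int);

/* Looks up "abs" in `handle`; returns NULL and prints a message on failure. */
abs_fn find_abs(void *handle)
{
    dlerror();  /* clear any stale error state */

    /* The explicit cast is required: there is no implicit conversion
     * from void * to a pointer-to-function type. */
    abs_fn f = (abs_fn)dlsym(handle, "abs");

    /* A symbol's value may legitimately be NULL, so failure is detected
     * through dlerror() rather than by testing f alone. */
    const char *err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err);
        return NULL;
    }
    return f;
}
```

The caller is still responsible for the typedef matching the real signature of the symbol; dlsym() cannot check that.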

As it happens, gcc (with -pedantic) warns about this:

warning: ISO C forbids conversion of object pointer to function pointer type

This warning is not strictly correct. ISO C does not actually forbid converting an object pointer to a function pointer type. The standard lists several constraints that may not be violated by a cast operator; converting void* to a function pointer type is not one of them. The C standard doesn't define the behavior of such a conversion, but since POSIX does in this particular case that's not a problem.

A comment on Steve Summit's answer suggests that this:

*(void **) (&f) = dlsym(handle, "foo");

will silence gcc's warning. It will, but it makes assumptions that are not guaranteed by either C or POSIX. POSIX guarantees that the result of dlsym may be converted to a function pointer; it doesn't guarantee that it has the same representation or alignment. It's likely to work, but I don't recommend it.
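If the aim is to confine the questionable conversion to a single place rather than to hide it, one alternative is a small wrapper that returns a generic function pointer. This is only a sketch: `generic_fn` and `load_fn` are hypothetical names, not part of POSIX. The wrapper still depends on the POSIX guarantee for the one void*-to-function cast it contains, and gcc -pedantic will still warn on that line, but every later conversion is between function pointer types, which ISO C does define (a function pointer may be converted to another function pointer type and back).

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical generic function pointer type for this sketch. */
typedef void (*generic_fn)(void);

/* The only place where a void * is converted to a function pointer. */
generic_fn load_fn(void *handle, const char *name)
{
    return (generic_fn)dlsym(handle, name);
}
```

A caller would then convert to the real signature before calling, for example `int (*my_abs)(int) = (int (*)(int))load_fn(handle, "abs");`. The function must not be called through `generic_fn` itself unless that really is its type.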

Keith Thompson
  • Ha, strange that the man page lists `*(void **) (&f)` as an acceptable POSIX workaround that "conforms with the ISO C standard and will avoid any compiler warnings". Thanks for the additional precision. – lord.garbage Jul 21 '15 at 01:20
  • @brauner: Interesting. It cites "the Rationale for the POSIX specification of dlsym()", but the [POSIX specification of `dlsym()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/dlsym.html) has an empty Rationale section. Perhaps POSIX removed that workaround, and the man page hasn't caught up. – Keith Thompson Jul 21 '15 at 01:44
  • It probably got removed because of this: "The 2013 Technical Corrigendum to POSIX.1-2008 (a.k.a. POSIX.1-2013) improved matters by requiring that conforming implementations support casting 'void *' to a function pointer." Also, the [POSIX specification of `dlsym()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/dlsym.html) lists an explicit cast to a function pointer in one of its examples: `/* find the address of the function my_function */ fptr = (int (*)(int))dlsym(handle, "my_function");`. – lord.garbage Jul 21 '15 at 08:01
  • I tested this line of code today, `func_ptr_t fptr = (func_ptr_t)dlsym(handle, "name");`, and the C++ function pointer cast causes a null pointer. – Frank Apr 11 '16 at 21:32
  • Until then it is probably the best idea to disable pedantic warnings for the lines in question, as was already proposed here: http://stackoverflow.com/a/36385690/1905491 – stefanct Jun 28 '16 at 11:42