I'm experimenting with dynamic libraries, following the instructions in the book CSAPP, and I get a segmentation fault when I run the program with my run-time stub library.

The library is as follows.

#include <dlfcn.h>
#include <stdlib.h>
#include <stdio.h>

// int data;

void *malloc(size_t size) {
    void *(*mallocp)(size_t size);
    char *error;

    mallocp = (void*(*)(size_t))dlsym(RTLD_NEXT, "malloc");
    if ((error = dlerror()) != NULL) {
        fputs(error, stderr);
        exit(1);
    }
    char *ptr = NULL;
    ptr = (char*)mallocp(size);

    // printf("malloc(%d) @ %p\n", (int)size, 0);
    return ptr;
}

I compile it with `gcc -fpic -shared dynamic.cpp -ldl -o dynamic.so -g`, then compile the main program, which simply calls malloc, with `gcc -g test.cpp`. After I export LD_PRELOAD and run the executable, I get a segmentation fault during startup.

After commenting out the printf, it works fine. Someone tells me it's because printf calls malloc. Then what happens? They call each other in a cycle? Then why the segmentation fault? The code is almost the same as in the book. I guess it has something to do with how printf is implemented in my toolchain. What should I do in order to use printf in my library?

Employed Russian
zkh
  • They call each other in a cycle, true. The segmentation fault occurs when the call stack overflows. – Armali Feb 24 '23 at 09:58
  • @Armali But the seg fault happens during startup, before entering main. I suppose it has something to do with the dynamic linking. – zkh Feb 24 '23 at 13:13
  • How do you know that _the seg fault happens during startup, before entering main_? – Armali Feb 24 '23 at 13:19

1 Answer

After commenting out the printf, it works fine. Someone tells me it's because printf calls malloc.

Yes.

Then what happens? They call each other in a cycle?

Yes.

Then why the segmentation fault?

Because you get infinite recursion, and that causes (drum roll!) stack overflow.

Your computer does not have unlimited stack.
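If you want to see how much stack you actually have, here is a minimal sketch that queries the soft limit. It assumes a POSIX system with `getrlimit`; the helper names `stack_soft_limit` and `print_stack_limit` are made up for illustration.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Returns the soft stack limit in bytes.
 * Returns 0 if the limit is unlimited or the query fails. */
unsigned long long stack_soft_limit(void) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY) {
        return 0;
    }
    return (unsigned long long)rl.rlim_cur;
}

/* Prints the limit; every recursive call eats into this budget. */
void print_stack_limit(void) {
    unsigned long long limit = stack_soft_limit();
    if (limit == 0) {
        printf("stack limit: unlimited or unknown\n");
    } else {
        printf("stack limit: %llu bytes\n", limit);
    }
}
```

Call `print_stack_limit()` from `main`. On many Linux systems the default is a few megabytes, which a malloc/printf cycle exhausts almost immediately.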

P.S. Running your binary under GDB should be illuminating.

P.P.S.

export LD_PRELOAD and ...

This is a relatively dangerous thing to do: setting LD_PRELOAD this way will affect (and crash) every command you subsequently run from the current shell.

It is better to set the variable only for the next command, like so:

$ env LD_PRELOAD=./dynamic.so ./a.out

If you are using bash or a similar shell, you can also use the equivalent:

$ LD_PRELOAD=./dynamic.so ./a.out

Update:

The fact is that I used gdb and it said the seg fault happened during startup without giving any other information.

When dealing with low-level routines like malloc, you need to be creative. Using tricks from this answer, you can figure out what's going on without too much trouble.

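You also need to attach GDB from the outside: by default the GDB run command starts your program through a shell, so if you set LD_PRELOAD before using run, the LD_PRELOAD affects that shell too, and the shell crashes before your program ever starts.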

Here is an example:

$ cat dynamic.c
#include <dlfcn.h>
#include <stdlib.h>
#include <stdio.h>

volatile int done = 0;

void *malloc(size_t size) {
    void *(*mallocp)(size_t size);
    char *error;

    while (!done) { }

    mallocp = (void*(*)(size_t))dlsym(RTLD_NEXT, "malloc");
    if ((error = dlerror()) != NULL) {
        fputs(error, stderr);
        exit(1);
    }
    char *ptr = NULL;
    ptr = (char*)mallocp(size);

    printf("malloc(%d) @ %p\n", (int)size, (void *)ptr);
    return ptr;
}

$ gcc -g -fPIC -shared -D_GNU_SOURCE -o dynamic.so dynamic.c

$ LD_PRELOAD=./dynamic.so ./a.out &
[1] 476197

$ gdb -p 476197

0x00007f52b5f5a156 in malloc (size=1) at dynamic.c:11
11          while (!done) { }
(gdb) bt
#0  0x00007f52b5f5a156 in malloc (size=1) at dynamic.c:11
#1  0x0000561590f6814b in main () at main.c:1
(gdb) set var done = 1
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x00007f52b5ea7e2f in __GI__dl_catch_exception (exception=exception@entry=0x7ffc04e7d070, operate=0x7f52b5dde3b0 <dlsym_doit>, args=0x7ffc04e7d110) at ./elf/dl-error-skeleton.c:175
175     ./elf/dl-error-skeleton.c: No such file or directory.
(gdb) bt 20
#0  0x00007f52b5ea7e2f in __GI__dl_catch_exception (exception=exception@entry=0x7ffc04e7d070, operate=0x7f52b5dde3b0 <dlsym_doit>, args=0x7ffc04e7d110) at ./elf/dl-error-skeleton.c:175
#1  0x00007f52b5ea7f4f in __GI__dl_catch_error (objname=0x7ffc04e7d0c8, errstring=0x7ffc04e7d0d0, mallocedp=0x7ffc04e7d0c7, operate=<optimized out>, args=<optimized out>) at ./elf/dl-error-skeleton.c:227
#2  0x00007f52b5ddddc7 in _dlerror_run (operate=operate@entry=0x7f52b5dde3b0 <dlsym_doit>, args=args@entry=0x7ffc04e7d110) at ./dlfcn/dlerror.c:138
#3  0x00007f52b5dde455 in dlsym_implementation (dl_caller=<optimized out>, name=<optimized out>, handle=<optimized out>) at ./dlfcn/dlsym.c:54
#4  ___dlsym (handle=<optimized out>, name=<optimized out>) at ./dlfcn/dlsym.c:68
#5  0x00007f52b5f5a179 in malloc (size=1024) at dynamic.c:13
#6  0x00007f52b5dce76c in __GI__IO_file_doallocate (fp=0x7f52b5f2c760 <_IO_2_1_stdout_>) at ./libio/filedoalloc.c:101
#7  0x00007f52b5ddbf50 in __GI__IO_doallocbuf (fp=0x7f52b5f2c760 <_IO_2_1_stdout_>) at ./libio/libioP.h:947
#8  __GI__IO_doallocbuf (fp=fp@entry=0x7f52b5f2c760 <_IO_2_1_stdout_>) at ./libio/genops.c:342
#9  0x00007f52b5ddb318 in _IO_new_file_overflow (f=0x7f52b5f2c760 <_IO_2_1_stdout_>, ch=-1) at ./libio/fileops.c:744
#10 0x00007f52b5dda4de in _IO_new_file_xsputn (n=7, data=<optimized out>, f=0x7f52b5f2c760 <_IO_2_1_stdout_>) at ./libio/libioP.h:947
#11 _IO_new_file_xsputn (f=0x7f52b5f2c760 <_IO_2_1_stdout_>, data=<optimized out>, n=7) at ./libio/fileops.c:1196
#12 0x00007f52b5db53ae in outstring_func (done=0, length=7, string=<error reading variable: Cannot access memory at address 0x4e7d2e0>, s=0x7f52b5f2c760 <_IO_2_1_stdout_>) at ../libio/libioP.h:947
#13 __vfprintf_internal (s=0x7f52b5f2c760 <_IO_2_1_stdout_>, format=<error reading variable: Cannot access memory at address 0x4e7d2e0>, ap=ap@entry=0x7ffc04e7d820, ) at ./stdio-common/vfprintf-internal.c:767
#14 0x00007f52b5dab4fb in __printf (format=<optimized out>) at ./stdio-common/printf.c:33
#15 0x00007f52b5f5a1e8 in malloc (size=1024) at dynamic.c:21
#16 0x00007f52b5dce76c in __GI__IO_file_doallocate (fp=0x7f52b5f2c760 <_IO_2_1_stdout_>) at ./libio/filedoalloc.c:101
#17 0x00007f52b5ddbf50 in __GI__IO_doallocbuf (fp=0x7f52b5f2c760 <_IO_2_1_stdout_>) at ./libio/libioP.h:947
#18 __GI__IO_doallocbuf (fp=fp@entry=0x7f52b5f2c760 <_IO_2_1_stdout_>) at ./libio/genops.c:342

Here you can clearly see malloc calling printf calling malloc ...

How deep is the stack at that point?

(gdb) bt -4
#42923 __vfprintf_internal (s=0x7f52b5f2c760 <_IO_2_1_stdout_>, format=<error reading variable: Cannot access memory at address 0x567a1c0>, ap=ap@entry=0x7ffc0567a700, ) at ./stdio-common/vfprintf-internal.c:767
#42924 0x00007f52b5dab4fb in __printf (format=<optimized out>) at ./stdio-common/printf.c:33
#42925 0x00007f52b5f5a1e8 in malloc (size=1) at dynamic.c:21
#42926 0x0000561590f6814b in main () at main.c:1

It's 42926 levels deep.
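As for using printf inside the wrapper: one common way to break the cycle is a re-entrancy guard, so that a malloc call made by printf itself goes straight to the real malloc without logging. The following is a minimal sketch, not a definitive implementation. It assumes GCC and glibc on Linux (for `RTLD_NEXT` and `__thread`), and the names `real_malloc` and `in_hook` are made up for illustration.

```c
#define _GNU_SOURCE          /* needed for RTLD_NEXT when compiling as C */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

/* Per-thread flag: nonzero while this thread is already inside the
 * logging part of the wrapper (hypothetical name). */
static __thread int in_hook = 0;

void *malloc(size_t size) {
    /* Look up the next malloc in search order (normally libc's) once. */
    static void *(*real_malloc)(size_t) = NULL;
    if (real_malloc == NULL) {
        real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
        if (real_malloc == NULL) {
            abort();         /* nothing sensible to do without a real malloc */
        }
    }

    void *ptr = real_malloc(size);

    /* Log only the outermost call. If printf allocates its buffer, that
     * nested malloc sees in_hook != 0 and skips the printf. */
    if (!in_hook) {
        in_hook = 1;
        printf("malloc(%zu) @ %p\n", size, ptr);
        in_hook = 0;
    }
    return ptr;
}
```

Build it the same way as the library above and preload it for a single command only. Allocations that printf makes internally are deliberately not logged; that is the price of the guard.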

Employed Russian
  • The fact is that I used gdb and it said the seg fault happened during startup without giving any other information. It said "Starting program: /home/hzk/a.out During startup program terminated with signal SIGSEGV, Segmentation fault." I used LD_DEBUG to see what happened and found that the dynamic linker, I suppose, kept binding malloc in an infinite loop: "symbol=malloc; lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0] 600: binding file ./dynamic.so [0] to /lib/x86_64-linux-gnu/libc.so.6 [0]: normal symbol `malloc'" – zkh Feb 25 '23 at 06:46
  • @zkh I've updated the answer. With `LD_DEBUG` you are likely looking at the wrong process (your shell). – Employed Russian Feb 25 '23 at 14:39