0

The C code has been running fine on Ubuntu 14.04.1 LTS on a 64-bit AMD C50x2. When I linked "-lm" statically and ran the same test in the same environment, it dumped core at run time. It also passed the ldd test. The only thing that changed was that "-lm" was statically linked:

gcc .... -static -lm

Later I tried with the full path to the "-lm" library, and it dumped core again.

I then ran it under strace:

execve("./mypro", ["./mypro"], [/* 61 vars */]) = 0
uname({sys="Linux", node="Acer", ...})  = 0
brk(0)                                  = 0x2668000
brk(0x26691c0)                          = 0x26691c0
arch_prctl(ARCH_SET_FS, 0x2668880)      = 0
readlink("/proc/self/exe", "/home/owner/wfiles/mypro", 4096) = 23
brk(0x268a1c0)                          = 0x268a1c0
brk(0x268b000)                          = 0x268b000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
write(2, "Expecting two argume"..., 35Expecting two argument
) = 35
exit_group(1)                           = ?
+++ exited with 1 +++ 

Update: 1) I only had one library. Also, this is the order in which I compiled:

gcc a.c b.c -o myprogramEXE -static -lm

2) I ran gdb and looked at the backtrace. The issue possibly has something to do with Linux and malloc. Part of the code was taken from Numerical Recipes in C (NRC), which used:

void    *malloc(int);

That declaration was incompatible with Linux, so I replaced it with another include file. The segmentation fault occurs in the function below from NRC, at the call to free():

void free_vector(v,nl,nh)
float *v;
int nl, nh;
/* Frees a float vector allocated by vector().    */
{
    free((char*) (v+nl)); 
}

The following function was used by NRC to create the vector:

float *vector (nl,nh)
int nl, nh;
{
    float *v;

    v=(float *)malloc((unsigned) (nh-nl+1)*sizeof(float));
    if (!v) nrerror("allocation failure in vector()");
    return v-nl;
}

How can I fix the issue, and why does it happen only with static linking in the same build environment?

Update 2: I found revised code on the NRC web site, but my problem is not resolved. http://www.nr.com/pubdom/nrutil.c.txt

void free_vector(float *v, long nl, long nh)
/* free a float vector allocated with vector() */
{
    free((char*) (v+nl-1));
}

float *vector(long nl, long nh)
/* allocate a float vector with subscript range v[nl..nh] */
{
    float *v;

    v=(float *)malloc((size_t) ((nh-nl+1+1)*sizeof(float)));
    if (!v) nrerror("allocation failure in vector()");
    return v-nl+1;
}
  • Did you debug the core dump, look at the backtrace, etc? – user253751 Apr 01 '15 at 04:12
  • Trying to access the file `/etc/ld.so.nohwcap` which does not exist. Use `core dump` along with `GDB` to find the exact location in the code causing this segmentation fault. – Santosh A Apr 01 '15 at 04:12
  • @SantoshA: apparently it's normal for `/etc/ld.so.nohwcap` to be missing: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=409018#10 – Michael Burr Apr 01 '15 at 05:04
  • Note that `-static` forces *all* libraries to be linked statically. If you only want one library to be linked statically, use `-Wl,-Bstatic -lm -Wl,-Bdynamic`. Statically linking libc is known to be problematic. – o11c Apr 01 '15 at 05:13
  • I only had one library. – user4687194 Apr 01 '15 at 05:23
  • I searched but could not find the core? – user4687194 Apr 01 '15 at 05:39
  • [generate a core dump in linux](http://stackoverflow.com/questions/17965/generate-a-core-dump-in-linux) – Anto Jurković Apr 01 '15 at 06:19
  • Did you get seg. fault on a build system or on a completely different system? – Anto Jurković Apr 01 '15 at 06:20
  • @Anto - on the build system. Only thing that I changed was static linking. – user4687194 Apr 01 '15 at 14:48
  • (1) Editions of _Numerical Recipes in C_ as old as the one you're using contain code which is known to be incorrect; you have quoted one of the incorrect functions. (2) `-static -lm` *does* direct the compiler to link the C library statically; there is (to oversimplify a bit) an implicit `-lc` at the end of the command line. To link only the math library statically, write `-Wl,-Bstatic,-lm,-Bdynamic`. (3) [`valgrind`](http://valgrind.org/). – zwol Apr 02 '15 at 00:12
  • But I'm pretty sure the problem is the buggy NRC code. – zwol Apr 02 '15 at 00:13
  • @zwol - why there are commas and also "-wl" - it was complaining of "unrecognized command line option ‘-Wl’". -static -lm would be sufficient right? I thought you need -wl only if u have multiple libraries? – user4687194 Apr 02 '15 at 22:43
  • This particular aspect of GCC's command line behavior is a little weird. You need to write `-Wl,-Bstatic,-lm,-Bdynamic` EXACTLY AS SHOWN, including all capitalization, commas, and absence of spaces. If you do that and it still gives you an error message, post a new question specifically about how to get the math library linked statically, in which you show your *complete, unedited* linker command line invocation. – zwol Apr 03 '15 at 02:30
  • The thing is that `-static` means link *everything* statically, *including the libraries that are implicitly included for you*, `-lc` and `-lgcc`; which is liable to misbehave because the usual C library on Linux-based systems isn't designed to be statically linked. The thing with `-Wl,...` bypasses all compiler driver wackiness and directs the linker proper to link *only* the math library statically. – zwol Apr 03 '15 at 02:33
  • Anyhow, I'm like 90% sure your *actual* problem is the buggy old version of Numerical Recipes in C you're using, as described in Matt McNabb's answer. – zwol Apr 03 '15 at 02:34
  • @zwol - Appreciate it; I linked with the way you suggested: gcc -g K33.c ThsC.c ccor.c foun.c -Wl,-Bstatic -lm -Wl,-Bdynamic -o K33 ; No more segmentation fault. But the ldd K33 gives the following : linux-vdso.so.1 => (0x00007fffe5ffc000) libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc101ff6000) /lib64/ld-linux-x86-64.so.2 (0x00007fc1023d4000) I expected "not a dynamic executable" Why? – user4687194 Apr 03 '15 at 06:18
  • @zwol - I am ok with the above ldd results as long as K33 will run on a server on which I have no control (I am not allowed to make configuration changes for the most part).. – user4687194 Apr 03 '15 at 06:54
  • ... What we have been trying to tell you all this time is that your program is necessarily linked against the C library as well as the math library, and that it *does not work* to link the GNU C library statically. The instructions I gave you were designed to link *only* the math library statically. If you genuinely need a static executable you should have said so. (You will need to get yourself a copy of uClibc or musl libc in that case.) – zwol Apr 03 '15 at 14:32
  • I have no way of knowing whether your program will run on this server over which you have no control. Your best bet is to try it and see. I'd recommend *not* linking the math library statically if that was the only reason you were doing that -- you are far less likely to run into a problem with the math library, which hasn't changed much in decades, than with the core C library. – zwol Apr 03 '15 at 14:34
  • And again I would like to reiterate that your _actual_ problem is >90% likely to be that you are using an old copy of _Numerical Recipes in C_ whose code is _buggy_. That the program doesn't crash when the C library is dynamically linked may only be an accident! Please go get yourself a current edition and fix the code properly. – zwol Apr 03 '15 at 14:35

2 Answers

3

return v-nl; causes undefined behaviour.

Pointers may only point to an element of an array (or one past the last element). Writing v - nl tries to form a pointer into the middle of nowhere.

It would be a good idea to redesign this code to not rely on undefined behaviour.
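
One possible redesign, as a minimal sketch rather than the book's method: keep the pointer exactly as malloc() returned it and apply the lower-bound offset only when an element is accessed. The names fvec, fvec_new, fvec_at and fvec_free are hypothetical and do not come from NRC.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical replacement for NRC's vector(): the struct keeps the
   pointer that malloc() returned, so no out-of-bounds pointer is formed. */
typedef struct {
    float *data;  /* exactly what malloc() returned */
    long lo, hi;  /* inclusive index range, as in vector(nl, nh) */
} fvec;

/* Allocate a vector with subscript range [lo..hi]; exits on failure. */
static fvec fvec_new(long lo, long hi)
{
    fvec v;
    assert(hi >= lo);
    v.data = malloc((size_t)(hi - lo + 1) * sizeof *v.data);
    if (!v.data) {
        fprintf(stderr, "allocation failure in fvec_new()\n");
        exit(EXIT_FAILURE);
    }
    v.lo = lo;
    v.hi = hi;
    return v;
}

/* Address of element i; the offset is applied at access time only. */
static float *fvec_at(fvec *v, long i)
{
    assert(i >= v->lo && i <= v->hi);
    return &v->data[i - v->lo];
}

/* Release the storage using the unmodified malloc() pointer. */
static void fvec_free(fvec *v)
{
    free(v->data);
    v->data = NULL;
}
```

With this layout, a call such as fvec_new(1, 5) followed by *fvec_at(&v, 1) = 3.5f stores into the first allocated float, and fvec_free(&v) hands free() the same address that malloc() produced.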


You mention void *malloc(int);, however that would be a bug. The proper signature is void *malloc(size_t);.

In any case you should write #include <stdlib.h> instead, to avoid any possibility of error.
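
To illustrate the point about the declaration: on 64-bit Linux, int is 32 bits wide while size_t is 64 bits, so a hand-written void *malloc(int); disagrees with the real function. The sketch below relies only on the declaration from stdlib.h; the helper name alloc_floats is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>   /* declares void *malloc(size_t) */

/* Hypothetical helper: allocate `count` floats, or return NULL if the
   byte count would overflow size_t or malloc() itself fails. */
static float *alloc_floats(size_t count)
{
    assert(count > 0);
    if (count > SIZE_MAX / sizeof(float))
        return NULL;
    return malloc(count * sizeof(float));
}
```

The result is assigned without a cast, which lets the compiler complain if the stdlib.h include is ever dropped.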

M.M
  • Why do you say v-nl "causes undefined behaviour" - it is same as &v[-nl]? – user4687194 Apr 06 '15 at 22:45
  • Yes, `&x[y]` is the same as `x+y` when one of them is a pointer. It's not permitted to have pointers point out of bounds of an object (except for a one-past-the-end pointer) – M.M Apr 06 '15 at 23:33
  • Could you elaborate why you think that it points to out of bounds of an object - is this why you are saying undefined behaviour? Is there a quick fix for out of bounds? I am wondering how Num Rec in C get away with it? – user4687194 Apr 07 '15 at 05:37
  • The object `v` starts at `&v[0]` ; trying to point to `&v[-1]` or any other negative index causes UB. There's no quick fix. I haven't read the book but it sounds like the code is just bad and relied on non-standard behaviour of particular compilers. – M.M Apr 07 '15 at 07:46
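
The equivalence discussed in these comments can be checked for the indices where it is defined. This is only a sketch with a hypothetical helper name, same_address; it walks from index 0 up to the one-past-the-end position and never forms a negative index.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if &x[y] and x + y agree for every valid position y in an
   array of n floats, including the one-past-the-end position y == n. */
static int same_address(float *x, size_t n)
{
    size_t y;
    for (y = 0; y <= n; y++) {
        if (&x[y] != x + y)   /* &x[n] is formed but never dereferenced */
            return 0;
    }
    return 1;
}
```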
0

The reason it might dump core in one circumstance (linked with -static -lm) and not in the other is that the linkage has changed the memory layout. The program is still misbehaving either way, just in a way that's not obvious (yet).

I don't have "Numerical Recipes in C" handy, but if you've retyped the code examples into your own program, I suspect a typographical error.

In your "free_vector(float *v, int nl, int nh)" example, 'v' is a 1 dimensional array of "floats". The expression "v+nl" in your call to "free()" is the same as "&(v[nl])" - i.e.: you're trying to free the 'nl'th element of the array which doesn't make a lot of sense. (If 'v' was an array of POINTERS to floats it would make sense, but then 'v' would be declared as "float *v[]" or "float **v".)

I'm presuming that 'v' (allocated by "vector()") is supposed to be a 1 dimensional array (vector) of floats OR pointers to floats (that's uncertain yet.) The 'nl' and 'nh' seem to be array (vector) index bounds so that you can have arrays bounded from 1 to n instead of 0 to n-1 (or some other range, like 5 to 9) and the "vector()" function takes care of allocating 'v'. Beyond that, I can't guess.

  • This is actually a notorious error in old editions of _Numerical Recipes in C_. The authors thought they could reproduce FORTRAN's 1-based arrays in C by passing around pointers offset one element before the beginning of each array; this works with legacy K&R compilers but gets utterly mangled by modern ones. – zwol Apr 03 '15 at 02:36
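
If FORTRAN-style 1-based indexing is all that is needed, one way to get it without an offset pointer is to allocate one spare element and leave slot 0 unused. This is a sketch under that assumption; the name vector1 is hypothetical and is not part of NRC.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate zeroed storage usable as v[1..n].
   Slot v[0] is allocated but unused, so the returned pointer is exactly
   what calloc() gave back and can be passed straight to free(). */
static float *vector1(size_t n)
{
    assert(n > 0 && n < SIZE_MAX);   /* n + 1 must not wrap around */
    return calloc(n + 1, sizeof(float));  /* NULL on failure */
}
```

The cost is one wasted float per vector, and the caller still has to check the result for NULL before use.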