I'm getting a segmentation fault and can't work out why. I left a lot of code out (hopefully nothing important). Oddly, the program runs with 0 errors under valgrind with a full leak check. Here are the details (I was working on a random number generator based on code I found online):

file1.h

#include <stdint.h>  /* uint32_t, uint64_t */

#ifdef __cplusplus
extern "C" {
#endif

#define STATE_N (19937 / 128 + 1)

union W128_T {
   uint32_t u[4];
   uint64_t u64[2];
};

typedef union W128_T w128_t;

struct STATE_T {
    w128_t state[STATE_N];
    int index;
};

typedef struct STATE_T state_t;

#ifdef __cplusplus
}
#endif

This is compiled into a static lib using the following command:

gcc -c -O3 -finline-functions -fomit-frame-pointer -DNDEBUG -fno-strict-aliasing --param max-inline-insns-single=1800 -fPIC -Wmissing-prototypes -Wall -std=c99 src/file1.c -o obj/file1.o
ar rc lib/librand.a obj/file1.o

file2.h

#include "../rand/include/file1.h"

#ifdef __cplusplus
extern "C" {
#endif

state_t * rstate(void);

#ifdef __cplusplus
};
#endif

file2.cpp

#include "file2.h"

static state_t rngState;

state_t * rstate(void) {
   return &rngState;
}

File 2 is compiled into a static library with the following command (some parts omitted; taken from CMake by running make VERBOSE=1):

/usr/bin/c++ -I/home/random/File1include -I/home/file2include -o CMakeFiles/file2Lib.dir/src/file2.o -c /home/src/file2.cpp

Then I test it all in this small test program test.cpp:

#include "file2.h"
#include <cstring>

int main(void) 
{
   state_t * state = rstate();
   state_t save;

   memcpy(&save, state, sizeof(save)); //segmentation fault
}

Which I build with the following command (stuff omitted):

g++ -I/home/random/File1include -I/home/file2include -L/home/file2Lib.dir -Wall -g -c test.cpp -o test.o
g++ -I/home/random/File1include -I/home/file2include -L/home/file2Lib.dir -Wall -g test.o -lfile2Lib -o randomTest

If I change test.cpp to this it works fine:

#include "file2.h"
#include <cstring>

int main(void) 
{
   state_t * state = new state_t();
   state = rstate();
   state_t save;

   memcpy(&save, state, sizeof(save));
}

OR if I leave test.cpp alone and change file2.h to this:

#include "../rand/include/file1.h"

#ifdef __cplusplus
extern "C" {
#endif

state_t * rstate(void);
state_t rngState;

#ifdef __cplusplus
};
#endif

And change file2.cpp to this:

#include "file2.h"

state_t * rstate(void) {
   return &rngState;
}

The program also runs correctly. Finally, if I change file2.h to this:

#include "../rand/include/file1.h"

#ifdef __cplusplus
extern "C" {
#endif

state_t * rstate(void);
extern state_t rngState;

#ifdef __cplusplus
};
#endif

and file2.cpp to this:

#include "file2.h"

state_t rngState;

state_t * rstate(void) {
   return &rngState;
}

it also has a seg fault in the test program.

Also, the segfault occurs when accessing state->state[34], for example when I try to print state->state[34].u[0].

Any ideas what is happening here?

SSB
  • Some people I've already asked about it thought there may be something with the compiler flags used in the file1.h but after testing that didn't seem to be the case. Also I tried changing the `state_t rngState` to instead be a `static unsigned long rngState[] = {1, 2,3,4}` and didn't have the segfault problem. Perhaps it's related to the extern "C" stuff. – SSB Jan 15 '14 at 00:23
  • Is the segmentation fault on a write or read? In other words, have you overrun the source or the destination of the copy? – Adrian McCarthy Jan 15 '14 at 00:40
  • Can you make a self-contained example that still exhibits the problem and that we can try? – cnicutar Jan 15 '14 at 00:43
  • I suspect that the sizes are being computed differently in different modules, possibly because of different packing assumptions. For example, `W128_T` has a 4-byte value followed by an 8-byte value. That may or may not have some padding, depending on the packing. And maybe the packing rules used by the compiler vary depending on whether you're in an `extern "C"` block or not. – Adrian McCarthy Jan 15 '14 at 00:45
  • @AdrianMcCarthy Packing was another thing someone suggested could be causing the problem. I suppose I can try to specify the packing I want in the header. – SSB Jan 15 '14 at 01:22
  • I had checked the size of the state_t structures in the test.cpp file and they were exactly the same. I didn't check the inner type sizes though. – SSB Jan 15 '14 at 01:25
  • I don't know if this has any meaning, but it seems odd to me that your declaration for `rstate()` has `extern "C"`, but the definition does not. – user694733 Jan 15 '14 at 07:11
  • @user694733 see [This question](http://stackoverflow.com/questions/1380829/is-extern-c-only-required-on-the-function-declaration) – SSB Jan 15 '14 at 14:28
  • After more testing I am led to believe that it is something with the build/link to my test program. The sizes of all the types in the `state_t` are equivalent. I have linked the code with a separate application and do not have this issue. Although I don't fully understand the linkage to that program at the moment as it is very complex. – SSB Jan 15 '14 at 17:00
  • I'm running into exactly the same situation. It's been a long time, but did you ever find out the cause of the segmentation fault? – HQW.ang Feb 23 '22 at 10:42

1 Answer

TL;DR

Think about what this website is called... It's Stack Overflow!


I spent almost a full day working out this problem. My code follows the same pattern as the OP's: a static variable is copied into a local variable.

The type of the problematic variable is defined in C but used in a C++ routine, so an incompatibility between C and C++ could plausibly cause such a problem. That was the first thing I examined. I gave up on that theory, though, because a C structure should be guaranteed to be trivial and standard-layout, so memcpy() can operate on it.

My next step was a brute-force one: I checked every member of the structure to find the culprit. Using a binary-search approach, I quickly narrowed it down to a few members, most of which were arrays with a large number of elements. That reminded me of stack overflow: a local copy of such a structure lives on the stack and can exceed the stack size limit.

Compare the value reported by ulimit -s (in kilobytes under bash) with sizeof() of your structure (in bytes). Also, if you have valgrind installed, try it. Its output may contain something like the following:

  • client switching stack SP xxx -> xxx
  • Access not within mapped region at address xxx

You can also run dmesg and look for a line like segfault at xxx ip xxx sp xxx error 6. The meaning of error 6 is explained at https://utcc.utoronto.ca/~cks/space/blog/linux/KernelSegfaultErrorCodes.

Back to valgrind: it actually suggests using --main-stacksize= to temporarily raise the stack size limit for your program. Setting it to a large enough value suppresses the stack problem. For me, everything then ran fine.

HQW.ang