I have Bus Error in such code:

char* mem_original;
int int_var = 987411;
mem_original = new char [250];
memcpy(&mem_original[250-sizeof(int)], &int_var, sizeof(int));
...
const unsigned char* mem_u_const = (unsigned char*)mem_original;
...
const unsigned char *location = mem_u_const + 250 - sizeof(int);

std::cout << "sizeof(int) = " << sizeof(int) << std::endl;//it's printed out as 4
std::cout << "byte 0 = " << int(*location) << std::endl;
std::cout << "byte 1 = " << int(*(location+1)) << std::endl;
std::cout << "byte 2 = " << int(*(location+2)) << std::endl;
std::cout << "byte 3 = " << int(*(location+3)) << std::endl;
int original_var = *((const int*)location);
std::cout << "original_var = " << original_var << std::endl;

That works well a few times, printing out:

sizeof(int) = 4
byte 0 = 0
byte 1 = 15
byte 2 = 17
byte 3 = 19
original_var = 987411

And then it fails with:

sizeof(int) = 4
byte 0 = 0
byte 1 = 15
byte 2 = 17
byte 3 = 19
Bus Error

It's built and run on Solaris (C++ 5.12). The same code works fine on Linux (gcc 4.12) and Windows (msvc-9.0).

We can see:

  1. memory was allocated on the heap by new[].
  2. memory is accessible (we can read it byte by byte)
  3. memory contains exactly what there should be, not corrupted.

So what may be the reason for the Bus Error? Where should I look?

UPD: If I memcpy(...) from location into original_var at the end, it works. But what is the problem with *((const int*)location)?

Arkady
  • In addition to my answer, the right place to look is to run the program in a debugger and see which line of code crashes the program and what the pointer that caused the bus error contains. You can then run again and single-step through to see where it got set to that. – Davislor Mar 06 '17 at 18:08
  • @Davislor, unfortunately that doesn't happen every time. It may run for many iterations before the Bus Error, and it shows up only on Solaris, so it's hard to debug. – Arkady Mar 07 '17 at 08:00
  • Another debugging technique is to add runtime checks. Define an inline function like: `template <typename T> inline bool is_aligned( const T* p ) { return (uintptr_t)(void*)(p) % alignof(T) == 0; }`. (If you don’t have C++11, just skip the template and use the constant 4 instead of `alignof(T)`.) Then, whenever you do pointer math and a conversion, you can `assert` that the pointer `is_aligned`. If not, you will crash the program immediately with a diagnostic where the bug occurred. – Davislor Mar 07 '17 at 09:45
  • Whoops, wrote `type` where I meant `class` or `typename`. – Davislor Mar 07 '17 at 23:16
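
The runtime check suggested in the comments above could look roughly like the sketch below. It assumes C++11 (`alignof`, `std::uintptr_t`); `checked_int_ptr` is a hypothetical helper name that is not part of the original code, added only to show where the `assert` would sit.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when p satisfies the alignment requirement of T.
template <typename T>
inline bool is_aligned(const T* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Hypothetical helper (assumption, not from the question): converts a byte
// pointer to const int* and asserts the alignment first, so a bad offset
// fails here with a diagnostic instead of a later SIGBUS.
inline const int* checked_int_ptr(const unsigned char* bytes) {
    const int* p = reinterpret_cast<const int*>(bytes);
    assert(is_aligned(p) && "misaligned int pointer");
    return p;
}
```

With a check like this, the conversion in the question would trip the assertion on every run where `location` is misaligned. Note that it only diagnoses misalignment; it does not make the cast itself well-defined.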

3 Answers

This is a common issue for developers with no experience on hardware that has alignment restrictions - such as SPARC. x86 hardware is very forgiving of misaligned access, albeit with performance impacts. Other types of hardware? SIGBUS.

This line of code:

int original_var = *((const int*)location);

invokes undefined behavior. You're taking an unsigned char * and interpreting what it points to as an int. You can't do that safely. Period. It's undefined behavior - for the very reason you're experiencing.

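You're violating the strict aliasing rule. See "What is the strict aliasing rule?" (http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule). Put simply, you can't access an object of one type through a pointer to another, unrelated type. The bytes in your char array are not an int object, and they are not guaranteed to be aligned like one.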

Oracle's Solaris Studio compilers actually provide a command-line argument that will let you get away with that on SPARC hardware - -xmemalign=1i (see https://docs.oracle.com/cd/E19205-01/819-5265/bjavc/index.html). Although to be fair to GCC, without that option, the forcing you do in your code will still SIGBUS under the Studio compiler.

Or, as you've already noted, you can use memcpy() to copy bytes around no matter what they are - as long as you know the source object is safe to copy into the target object - yes, there are cases when that's not true.
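
As a minimal sketch of the memcpy() approach, assuming the buffer layout from the question: `read_int_at` is a hypothetical helper name, and the bounds check is an addition for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (assumption): reads an int stored at any byte offset
// of a buffer. memcpy has no alignment requirement on its arguments, so the
// source address does not need to be a multiple of alignof(int).
inline int read_int_at(const unsigned char* buffer, std::size_t size,
                       std::size_t offset) {
    // Make sure all sizeof(int) bytes lie inside the buffer.
    assert(offset <= size && sizeof(int) <= size - offset);
    int value;
    std::memcpy(&value, buffer + offset, sizeof value);
    return value;
}
```

For the code in the question, the call would be something like `read_int_at(mem_u_const, 250, 250 - sizeof(int))`. Compilers commonly turn a small fixed-size memcpy like this into a few load instructions, so the cost is usually modest.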

Andrew Henle
  • @BenVoigt Are you sure? They're immune to a `SIGBUS`? Only one of those can properly be the address of an `int` on SPARC hardware. My experience (admittedly a while ago...) with GCC on SPARC is that they are subject to getting a `SIGBUS`, but that might be an old GCC bug in what *should* be safe code that doesn't violate strict aliasing. – Andrew Henle Mar 06 '17 at 21:58
  • @BenVoigt Yep, got it. I was misreading it - probably colored by my years-ago experiences. I'll fix it up. – Andrew Henle Mar 06 '17 at 22:02
  • Thank you! But let's say I want to work with an existing POD structure that is already in memory (char*). Now I can't just create a pointer to this structure by casting `char*` to a pointer to the structure. Can I create a union of pointers? Like `union { mytype* myptr_; char* charptr_; };`, then, without casting, just assign the memory to `charptr_` and then use `myptr_`? – Arkady Mar 07 '17 at 09:12
  • @Arkady *can I create union of pointers? Like `union { mytype* myptr_; char* charptr_; };`* No, you can't do that. It's not the *pointer* value that you can't alias, it's the object the pointer references that has a fixed type. An arbitrary offset into an `[unsigned] char` array cannot be treated as an `int`. You can copy the data *at that address* into an `int` via `memcpy()` (unless it contains a [trap representation](http://stackoverflow.com/questions/6725809/trap-representation)), but you can not treat it as an `int` directly no matter what you do to the pointers that refer to the object. – Andrew Henle Mar 07 '17 at 10:55
  • @Arkady If you *really* want to do this on Solaris/SPARC, you can use Oracle's compiler (http://www.oracle.com/technetwork/server-storage/developerstudio/downloads/index.html) instead of GCC, and try using the `-xmemalign=1i` command-line option to the compiler. You might incur a *significant* performance degradation. In my experience, it will work. Just be aware that any GCC-compiled code that you pass a misaligned `int` value to will still fail with a `SIGBUS`. – Andrew Henle Mar 07 '17 at 11:03
  • Thank you for the explanation. But suppose I have a memory heap of ~500 MB, full of different POD structures, and I know the offsets of all of them. It is very convenient to just write `mytype* ptr = (mytype*)&memory_block[offset]` and then use the pointed-to structure. I wonder whether there is a workaround that avoids copying all those ~500 MB and doesn't need `-xmemalign=1i`. – Arkady Mar 07 '17 at 13:16

I get the following warning when I compile your code:

main.cpp:19:26: warning: cast from 'const unsigned char *' to 'const int *' increases required alignment from 1 to 4 [-Wcast-align]
    int original_var = *((const int*)location);
                         ^~~~~~~~~~~~~~~~~~~~

This seems to be the cause of the bus error, because improperly aligned access can cause a bus error.

Emil Laine
  • Even if I know there are 4 bytes in a row? Are you saying that code such as `*((const int*)location)` is not correct from the language's perspective, even if the developer knows that `location` points to enough bytes? – Arkady Mar 06 '17 at 16:45
  • In addition to having 4 bytes in a row, the memory access must be properly aligned for `int`. The machine instruction that performs the access requires that, as pointed out in the linked answer. C++ also requires that. – Emil Laine Mar 06 '17 at 16:50
  • Do you know how I can get an `int*` from a `char*` the proper way, if I know there is an array of `int*`? – Arkady Mar 06 '17 at 16:56
  • If there's an `int` at `mem_u_const + 250 - sizeof(int)` and you can't change its location to something that's suitably aligned for `int`, then your only choice is to read the bytes of the `int` separately, something like `int original_var = (int(location[0]) << 24) | (int(location[1]) << 16) | (int(location[2]) << 8) | int(location[3]);`. – Emil Laine Mar 06 '17 at 18:20
  • @Arkady See http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule When you coerce a `char` address to refer to an `int`, you're invoking undefined behavior. – Andrew Henle Mar 06 '17 at 21:38
  • @Arkady: See http://stackoverflow.com/a/3903577/103167 for an explanation (mine) of why having 4 bytes in a row isn't good enough. – Ben Voigt Mar 06 '17 at 22:03
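
The byte-by-byte read suggested in the comments above can be sketched as follows. `read_be32` is a hypothetical helper name. The sketch assumes the value was stored most-significant byte first, which matches the byte dump in the question (0, 15, 17, 19), and it uses unsigned arithmetic so that shifting a byte of 128 or more into the top bits cannot overflow a signed int.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (assumption): assembles a 32-bit value from four bytes
// stored most-significant first. It only reads single bytes, so the address
// can have any alignment.
inline std::uint32_t read_be32(const unsigned char* p) {
    assert(p != nullptr);
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
            static_cast<std::uint32_t>(p[3]);
}
```

Unlike memcpy(), this fixes the byte order in the code, so it gives the same result on little-endian machines, but only if the data was written big-endian in the first place.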

Although I don’t have access to a SPARC right now to test this, I’m pretty sure from my experiences on that platform that this line is your problem:

const unsigned char *location = mem_u_const + 250 - sizeof(int);

The mem_u_const block was originally allocated by new for an array of characters. Since sizeof(unsigned char) is 1 and sizeof(int) is 4, you are adding 246 bytes. This is not a multiple of 4.

On SPARC, the CPU can only read 4-byte words if they are aligned to 4-byte boundaries. Your attempt to read a misaligned word is what causes the bus error.

I recommend allocating a struct with an array of unsigned char followed by an int, rather than a bunch of pointer math and casts like the one that caused this bug.
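
A minimal sketch of that idea, assuming C++11: the names `Block`, `payload`, `trailer`, and `make_block` are hypothetical, and the 246-byte payload size is taken from the arithmetic above. The compiler places the `int` at a suitably aligned offset, so the overall layout is not byte-for-byte identical to the original 250-byte block.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout (assumption): the byte payload followed by the int
// that the question stored in the last four bytes of its block.
struct Block {
    unsigned char payload[246];
    int trailer;  // the compiler pads before this member as needed
};

// Compile-time check that the int member sits on an int-aligned offset.
static_assert(offsetof(Block, trailer) % alignof(int) == 0,
              "trailer must be int-aligned");

// Allocates a zero-initialized Block and stores value in its trailer.
// The caller owns the result and must delete it.
inline Block* make_block(int value) {
    Block* b = new Block();  // value-initialization zeroes the payload
    b->trailer = value;
    return b;
}
```

Reading `b->trailer` then needs no casts or pointer math, and the access is aligned on every platform.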

Davislor
  • Are you saying that mem_u_const + 240 would be read without a problem, because 240 is a multiple of 4? – Arkady Mar 06 '17 at 18:36
  • I don’t think `new` is guaranteed to return a 4-byte-aligned address when you allocate `char` objects, and if portability is a concern at all, you don’t want your code to break when you recompile it for 64-bit code or on another architecture. Instead of allocating a block of bytes, you should allocate a `struct`. If the memory block needs to be interpreted in several different ways, allocate a `union` of `struct`s. More readable, elegant, portable and safer. – Davislor Mar 06 '17 at 20:34
  • However, if you really don’t want to allocate a struct, you could instead allocate an array of `max_align_t`. The storage for one of those is guaranteed to be properly-aligned for use as a pointer to any scalar type. Or you could use `alignas`. – Davislor Mar 06 '17 at 20:41
  • Thank you for the help and the answer. If I could accept two answers, the second would be yours. – Arkady Mar 07 '17 at 15:13