
Take this contrived example:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main (int argc, char const *argv[])
{
    char *buf = malloc(8);
    strcpy(buf,"Hello W");

    char *last = &buf[4];

    size_t u = *(size_t *)(last);

    printf("0x%lx",u); // prints "0x57206f" on little endian

    return 0;
}

As per my (rather basic) understanding of C's memory management, this would result in the following memory read (assuming a 64-bit platform):

+--+--+--+--+--+--+--+--+--+--+--+--+--+
|H |e |l |l |o |  |W |\0|? |? |? |? |… |
+--+--+--+--+--+--+--+--+--+--+--+--+--+
             ^^^^^^^^^^^^^^^^^^^^^^^

That read would touch memory that might not have been allocated to the program, which could cause a crash. However, this seems to work fine in practice. Is this defined behavior?

Addition:

I made this example from the code shown here: http://www.daemonology.net/blog/2008-06-05-faster-utf8-strlen.html

dog
  • What is the size of `size_t`? Is it 4? – Codor Jan 14 '15 at 12:02
  • Note that `*(size_t *)(last);` also causes undefined behaviour. The address stored in `last` does not point to an object of type `size_t`, and it is very likely not correctly aligned for `size_t` either. – user694733 Jan 14 '15 at 12:18
  • `printf("0x%lx",u);` also has undefined behaviour. The sizes of `long` and `size_t` might not be the same. Either use `"0x%zx",u` or `"0x%lx", (unsigned long)u` – user694733 Jan 14 '15 at 12:20
  • @Codor As mentioned above I'm assuming a standard 64 bit platform with sizeof(size_t)==8 – dog Jan 14 '15 at 12:41
  • [Here](http://stackoverflow.com/questions/20312340/why-does-this-implementation-of-strlen-work) and maybe [here](http://stackoverflow.com/questions/3246008/c-strings-strlen-and-valgrind), basically the same is discussed for common `strlen` implementations. Probably there are even better duplicates here, searching for `strlen` may help finding them. – mafso Jan 14 '15 at 14:39
  • @mafso Thanks, this is what I've been looking for. – dog Jan 14 '15 at 15:06

1 Answer

Your picture of the memory read is right: it extends past the end of the 8 bytes you allocated.

What happens is that you get undefined behavior. The exact visible result when that happens is, of course, undefined. You cannot claim "it works because it didn't crash". Not crashing is certainly one thing that can happen, but so is crashing. The behavior is, after all, undefined.

unwind
  • Yes, but this is only a shortened example. The code I took this from is heavily used, so I'm assuming if there actually is potential to crash I'd have seen a case by now. – dog Jan 14 '15 at 12:37
  • @dog Of course there is a potential for a crash, since you're doing invalid things and getting undefined behavior. – unwind Jan 14 '15 at 12:50