
Consider this code:

#include <stdio.h>

static char a[2][2] = {
    { 1, 2 },
    { 3, 4 },
};

int main()
{
    char **p = (char**)a; // needs cast, or compiler complains (which makes sense)

    printf("%p\n", p);

    printf("%p\n", &a[1][0]);
    printf("%d\n",  a[1][0]);
    printf("%p\n", &p[1][0]); // why null?  why doesn't compiler complain about this?
    printf("%d\n",  p[1][0]); // segfault, of course

    return 0;
}

which yields this output:

0x804a018
0x804a01a
3
(nil)
Segmentation fault

I understand that an array can decay to a pointer. What I don't understand is why the compiler (g++) will let me try to do the reverse. If p is a char**, why does it let me use p[x][x] without so much as a warning? It obviously doesn't even come close to working, as the resulting pointer is null.

Incidentally, I am asking this about code from a 3rd party, which evidently works for them. (compiled in Windows, not with g++). So, I'm not looking for suggestions on how to fix this code, I already know how to do that. I just want to understand why the compiler doesn't complain, and why the result is a null pointer.

Thanks.

3 Answers


You simply cannot start treating a 2D array of char like a char**. In memory, the array looks something like this:

| 1 | 2 | 3 | 4 |

Each element follows the previous element. The array name will be implicitly converted to a pointer to its first element:

| 1 | 2 | 3 | 4 |
  ^
  |

Now if you convert this pointer to a char**, you're saying "If you dereference this pointer, you will find a char*" which is an outright lie. If you dereference the pointer you will get a char with value 1, not a pointer at all.

Then when you do p[1][0], you are treating the value at p[1] (which essentially moves the pointer p along by sizeof(char*)) as a pointer and trying to dereference it. Of course, this leads you straight to undefined behaviour.

The compiler didn't let you do that conversion without a cast because it makes no sense. Don't do it. Just because a C-style cast allows it, that doesn't mean it's an okay operation. A C-style cast will fall back to a reinterpret_cast if no other cast works, in which case you're almost certainly going to hit undefined behaviour.
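
For contrast, here is a minimal sketch of the pointer type that a does convert to without any cast. The function names row_stride_in_bytes and second_row_first_element are made up for this illustration; the array is the one from the question.

```cpp
#include <cassert>
#include <cstddef>

static char a[2][2] = {
    { 1, 2 },
    { 3, 4 },
};

// Byte distance between consecutive rows of `a`: each row is a char[2].
std::ptrdiff_t row_stride_in_bytes()
{
    return reinterpret_cast<char*>(&a[1]) - reinterpret_cast<char*>(&a[0]);
}

// Index the second row through a pointer to an array of 2 char.
char second_row_first_element()
{
    char (*p)[2] = a;      // no cast required: this is what `a` decays to
    assert(p[1] == a[1]);  // both expressions name the same row
    return p[1][0];
}
```

With this type, p[1] steps forward by one row (2 bytes) rather than by sizeof(char*), so p[1][0] lands on the 3.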

Joseph Mansfield
  • I understand that this code doesn't make sense. As I said, this is essentially what happens inside some 3rd party code, and I'm trying to understand how it might have ever worked, possibly with the Visual Studio compiler. And I'm also trying to understand why the compiler allows it. BTW, you said p[1] is a char (of value 3), but if p is a char**, then why wouldn't p[1] be a char*? – user2100564 Feb 22 '13 at 19:14
  • @user2100564: it didn't allow it, until you forced it to cast even when it was telling you that the cast made no sense. – Mooing Duck Feb 22 '13 at 19:15
  • @user2100564 Apologies, I made a slight mistake. `p[1]` would be the location of a second `char*` if it were an array of `char*`. But it's not. I can't think of any reason why this would have worked at any point. – Joseph Mansfield Feb 22 '13 at 19:29
  • I meant why does the compiler allow using double indexing (p[1][0]) on a type that isn't a 2D array. But I guess I'm starting to understand why it allows it...it seems the [x] notation means different things depending on the type, which makes sense. – user2100564 Feb 22 '13 at 19:38
  • @user2100564 Actually it means the same thing. When you use `[]` on an array, the array is first converted into a pointer to its first element. Then `pointer[X]` is equivalent to `*(pointer + X)`. – Joseph Mansfield Feb 22 '13 at 19:45
  • @user2100564 The reason the 2D array indexing works on a `char**` is because first `p[1]` is equivalent to `*(p + 1)` which gives you a `char*`, and then `p[1][0]` is equivalent to `*(*(p + 1) + 0)`. – Joseph Mansfield Feb 22 '13 at 19:47
  • By 'means different things', I mean it indexes differently. So a[1] is a "A2_c" type, whereas p[1] is a "Pc" type (using typeid notation), thus &a[1] is &a[0]+2 and is not the same as &p[1], which is &p[0]+4. This makes sense, and I see how the compiler (not knowing what the author intended) doesn't complain. The only remaining mystery is how it ever worked.... – user2100564 Feb 22 '13 at 20:05
  • Hmm, unless the size of the element type matches the size of the pointer type. Hmm, I'll have to think about that... The 3rd party code actually uses doubles, whereas my example just used char for simplicity. On a 64bit machine, a double and a ptr are both 8 bytes... – user2100564 Feb 22 '13 at 20:05
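
The `[]` equivalence discussed in these comments can be sketched as follows. All function names here are hypothetical, and same_size_as_pointer only checks the size coincidence speculated about above; it says nothing about whether the cast would be valid.

```cpp
#include <cassert>

static char a[2][2] = {
    { 1, 2 },
    { 3, 4 },
};

// The same element, spelled with [] and with explicit pointer arithmetic.
char via_brackets()   { return a[1][0]; }
char via_arithmetic() { return *(*(a + 1) + 0); }

// Both spellings must agree, because a[i] is defined as *(a + i).
void check_spellings_agree()
{
    assert(via_brackets() == via_arithmetic());
}

// True when an element of type T is exactly as wide as a data pointer.
template <typename T>
constexpr bool same_size_as_pointer() { return sizeof(T) == sizeof(void*); }
```

Even where same_size_as_pointer<double>() is true, reading a double's bytes as a pointer is still undefined behaviour; matching sizes would only make the index arithmetic land on the same addresses.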

To actually give an answer: a is an array. In an expression, a automatically decays to a pointer to its first element, the row { 1, 2 }, so what you get is a char (*)[2], a pointer to an array of two chars. The decay happens only once, so the result is neither a char* nor a char**. With char **p = (char**)a, you are casting that pointer to an array into a pointer to pointers to char, so the array's bytes get read as if they were pointers (on a 32-bit little-endian machine like yours, something like {0x04030201, 0x????????}), which makes no sense at all. It's a pointer to an array, not a pointer to pointers. That's why you needed the cast: the conversion doesn't make sense.

Second, when you write p[1], it treats that data (and a few bytes after it) as if it were an array of char pointers, {0x04030201, 0x00000000}, and returns the second element. (We can see it's all zeros in this particular case, because that's what was printed later.) Then the [0] dereferences that second, unknown pointer, which happens to be NULL, giving you a segfault.
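
The byte reinterpretation can be reproduced without undefined behaviour by copying the bytes with memcpy. This is only a sketch: the bytes buffer is a hypothetical stand-in for the memory at a, and the zero padding after the four array bytes is an assumption chosen to match the (nil) in the question. The name slot_as_integer is likewise made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(std::uintptr_t) >= sizeof(char*),
              "uintptr_t must be able to hold the copied bytes");

// Assumed memory image: the four array bytes, then zeros.
static const unsigned char bytes[2 * sizeof(char*)] = { 1, 2, 3, 4 };

// Reads the k-th pointer-sized slot as an integer, the way p[k] would
// interpret it, but through memcpy so the read itself is well defined.
std::uintptr_t slot_as_integer(std::size_t k)
{
    assert(k < 2);
    std::uintptr_t value = 0;
    std::memcpy(&value, bytes + k * sizeof(char*), sizeof(char*));
    return value;
}
```

Under that assumption, slot 0 holds the array bytes packed into a meaningless non-zero "address", and slot 1 is all zeros, which is the null pointer the question printed.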

Mooing Duck
  • So, does Visual Studio possibly treat this differently? This comes from a batch of 3rd party code that is functional... – user2100564 Feb 22 '13 at 19:22
  • Does Visual Studio treat this differently than what? The cast from `a` to a `char**` _is wrong in every sense_. You say it's functional, but I doubt it if it contains this code. – Mooing Duck Feb 22 '13 at 19:30
  • @MooingDuck Accessing `p[1]` is undefined behaviour, so it _could_ be that the code works as intended (but then the author should buy a lot of lottery tickets ;) – Daniel Fischer Feb 22 '13 at 19:34

When you say this:

char a[2][2];
char **p = (char**)a;

that is a mistake. a is not an array of pointers to characters. It is an array of storage blocks, each of which is an array of characters.
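
For comparison, a char** does work once there really is an array of pointers for it to point at. A minimal sketch, where rows and through_pointer_table are names invented for this example:

```cpp
#include <cassert>

static char a[2][2] = {
    { 1, 2 },
    { 3, 4 },
};

// Build a real table of row pointers, then index it through a char**.
char through_pointer_table()
{
    char* rows[2] = { a[0], a[1] };  // each entry points at one row of `a`
    char** p = rows;                 // valid: rows decays to char**
    assert(p[1] == a[1]);            // p[1] is a genuine pointer to row 1
    return p[1][0];
}
```

Here p[1] loads an actual char* from rows, so p[1][0] reads the 3 instead of interpreting array bytes as an address.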

Mike Dunlavey