What does getting the address of an array variable mean?

Question

Today I read a C snippet which really confused me:

#include <stdio.h>

int
main(void)
{
    int a[] = {0, 1, 2, 3};

    printf("%d\n", *(*(&a + 1) - 1));
    return 0;
}

In my opinion, &a + 1 makes no sense, but it runs without error.

What does it mean? And does the K&R C bible cover this?

UPDATE 0: After reading the answers, I realize that these two expressions are what mainly confuse me:

&a + 1, which has already been asked about on Stack Overflow: about the expression “&anArray” in c
*(&a + 1) - 1, which is related to array decay.

Which version of C are you targeting, and which version of K&R are you reading? The early versions are of historical interest but certainly don't qualify as Bibles in terms of relevance to modern C. — underscore_d, Jul 05 '16 at 11:28
Please go through this : http://stackoverflow.com/questions/11552960/about-the-expression-anarray-in-c — Avantika Saini, Jul 05 '16 at 11:33
@underscore_d I just compile it using gcc 5.3.1 with `-Werror -Wall` flags. And the K&R version which I read is 2nd edition. — whatacold, Jul 05 '16 at 12:34
You might have taken the trouble to tell us what the output was... — TonyK, Jul 05 '16 at 16:15

Some programmer dude · Answer 1 · 2016-07-06T06:11:38.477

43

First a little reminder (or something new if you didn't know this before): For any array or pointer p and index i the expression p[i] is exactly the same as *(p + i).

Now to hopefully help you understand what's going on...

The array a in your program is stored somewhere in memory; exactly where doesn't really matter. To get the location where a is stored, i.e. to get a pointer to a, you use the address-of operator &, as in &a. The important thing to learn here is that a pointer by itself doesn't mean anything special; what matters is the type it points to. The type of a is int[4], i.e. a is an array of four int elements. The type of the expression &a is pointer to an array of four int, or int (*)[4]. The parentheses are important, because the type int *[4] is an array of four pointers to int, which is quite a different thing.

Now to get back to the initial point, that p[i] is the same as *(p + i). Instead of p we have &a, so our expression *(&a + 1) is the same as (&a)[1].

Now that explains what *(&a + 1) means and what it does. Now let us think for a while about the memory layout of the array a. In memory it looks something like

+---+---+---+---+
| 0 | 1 | 2 | 3 |
+---+---+---+---+
^
|
&a

The expression (&a)[1] treats &a as if it pointed into an array of arrays, which it definitely doesn't, and accesses the second element of that array, which is out of bounds. This of course technically is undefined behavior. Let us run with it for a moment though, and consider what that would look like in memory:

+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | . | . | . | . |
+---+---+---+---+---+---+---+---+
^               ^
|               |
(&a)[0]         (&a)[1]

Now remember that the type of a (which is the same as (&a)[0], and therefore (&a)[1] must have this type too) is array of four int. Since an array naturally decays to a pointer to its first element, the expression (&a)[1] is the same as &(&a)[1][0], and its type is pointer to int. So when we use (&a)[1] in an expression, what the compiler gives us is a pointer to the first element of the second (non-existent) array after &a. And once again we come to the p[i] equals *(p + i) equation: (&a)[1] is a pointer to int, so it plays the role of p in *(p + i) with i equal to -1, and the full expression is *((&a)[1] - 1). Looking at the memory layout above, subtracting one int from the pointer given by (&a)[1] gives us the element just before (&a)[1], which is the last element of (&a)[0]; i.e. it gives us (&a)[0][3], which is the same as a[3].

So the expression *(*(&a + 1) - 1) is the same as a[3].

It's long-winded, and it passes through dangerous territory (what with the out-of-bounds indexing), but due to the power of pointer arithmetic it all works out in the end. I don't recommend you ever write code like this though; readers need to really know how these transformations work to be able to decipher it.

edited Jul 06 '16 at 06:11

answered Jul 05 '16 at 11:37

Some programmer dude


4

+infinity for "This of course technically is _undefined behaviour_." An out-of-bounds access shouldn't be used as a 'clever' hack. It's not. This might seem to make sense conceptually but isn't required to work in the slightest. It's difficult to believe you're the only one who mentioned this! – underscore_d Jul 05 '16 at 13:15
1

@underscore_d `*(&a+1)` dereferences the pointer past the end of the (quasi) array, here we're in UB-land. (Joachim, the type of `a` is `int(*)[4]`, you consistently wrote 3 [or three]. Doesn't affect the meat, but it's still irritating to me.) – Daniel Fischer Jul 05 '16 at 21:03
1

@DanielFischer: Would writing the code as `*((int*)(&a + 1)-1)` make the behavior defined? Computation of `&a+1` is legal, and in the language Dennis Ritchie designed, the result of casting a "just past" pointer of an array type to the element type would be a "just past" pointer for the last element of the nested array. I'm not sure such logic holds in the subset of Ritchie's language that gcc processes, however. – supercat Jul 05 '16 at 23:56
3

@supercat there's no point holding onto 40-year-old happenstance – M.M Jul 06 '16 at 04:21
@DanielFischer Regarding the type, yeah I thought about it as I was falling asleep last night. :) Updated. – Some programmer dude Jul 06 '16 at 06:12
1

@M.M: Why do you regard such things as happenstance? If you don't like the language Dennis Ritchie invented, invent something else. Don't try to name-drop off the language that became popular precisely because it supported the kinds of abilities you now hold in disdain (and which incidentally remains useful on embedded systems because of those abilities). – supercat Jul 06 '16 at 09:05
1

@supercat Ridiculous argument, almost every named product develops over time, changing the feature set. Just one example out of millions: a "refrigerator" used to always come with a butter conditioner, which was quite useful. These days most don't; should we ban companies from using the name "refrigerator" for appliances without butter conditioners ? – M.M Jul 06 '16 at 12:47
1

@M.M: The language Dennis Ritchie designed is still the best language that exists for many kinds of embedded programming, and I cling to what you call "40-year-old happenstance" because it has far more practical utility, at least in the embedded systems world, than more modern notions. I'm somewhat puzzled--perhaps you can explain--why people who are writing programs for high-end PCs that can only achieve good performance through instruction reordering and vectorization, would want to program in a language which is poorly suited to such things, rather than one like FORTRAN which was... – supercat Jul 06 '16 at 13:59
1

...designed from the beginning to favor them. C was designed to allow someone with a simple compiler to generate code which would perform reasonably efficiently on machines with fairly straightforward general-purpose byte-addressable architectures, by letting the programmer fashion the code to exploit whatever particular talents the target architectures happened to possess. For some reason, people seem to think the Standard was intended to describe everything a quality implementation must do, even though such a view would be nonsensical and the authors acknowledge it as such. – supercat Jul 06 '16 at 14:12
It would be possible to write a fully-compliant C89 implementation which never even bothered to examine the source code but always output "This is a diagnostic" and exited, provided it came with a C source text which nominally tested all the implementation limits, output "This is a diagnostic", and exited. Feeding that particular source code would cause the "implementation" to behave in required fashion, as would feeding it an ill-formed source program. If an implementation can run at least one program that taxes implementation limits, there's no requirement that the implementation... – supercat Jul 06 '16 at 14:17
...be able to run any others without a stack overflow, nor is there any requirement as to what would happen if a program couldn't be run without a stack overflow. The only way the C89 Standard can be viewed at all meaningfully is if one recognizes that the authors intended that any usable quality implementation would do things in excess of what the Standard requires, with abilities likely being tailored to the target platform; I must say it's ironic that embedded processors offer better semantics than full-fledged PCs. – supercat Jul 06 '16 at 14:21
@supercat nobody's stopping you using a 1970s compiler if that's what you want to do – M.M Jul 06 '16 at 23:12
@M.M: I'd prefer a 1990s compiler; fortunately, the embedded systems world has thus far largely escaped gcc-style craziness (excluding gcc itself, of course) but it would be nice to have some assurance that sanity will continue to reign. I'm still curious why you view deliberately-designed behaviors as "happenstance", but don't view C89's "character-type aliasing" rules likewise? The latter should have been deprecated 25 years ago, with defined ways of saying "This pointer of type X will be used to access things of type Y" and "Within this compilation unit, pointers of character type... – supercat Jul 06 '16 at 23:25
...won't alias anything other than character types outside those places where such aliasing is explicitly noted". Adding the latter directive would greatly enhance the performance of code that needs to work with actual character-type data without breaking anything (since existing code wouldn't contain the latter directive, compilers would presume that character-type pointers within such code could alias as they always have). – supercat Jul 06 '16 at 23:28
+1 for the ascii diagram and detailed explanation. But personally I think explaining `*(&a + 1)` using `(&a)[1]` makes it harder to understand. – whatacold Jul 07 '16 at 09:16
@supercat As far as I can tell, `*((int*)(&a+1)-1)` is fine. – Daniel Fischer Jul 07 '16 at 11:31

score 14 · Accepted Answer · answered Jul 05 '16 at 11:46

Let's dissect it.

a has type int [4] (array of 4 int). Its size is 4 * sizeof(int).

&a has type int (*)[4] (pointer to array of 4 int).

(&a + 1) also has type int (*)[4]. It points to an array of 4 int that starts 1 * sizeof(a) bytes (or 4 * sizeof(int) bytes) after the start of a.

*(&a + 1) is of type int [4] (an array of 4 int). Its storage starts 1 * sizeof(a) bytes (or 4 * sizeof(int) bytes) after the start of a.

*(&a + 1) - 1 is of type int * (pointer to int) because the array *(&a + 1) decays to a pointer to its first element in this expression. It will point to an int that starts 1 * sizeof(int) bytes before the start of *(&a + 1). This is the same pointer value as &a[3].

*(*(&a + 1) - 1) is of type int. Because *(&a + 1) - 1 is the same pointer value as &a[3], *(*(&a + 1) - 1) is equivalent to a[3], which has been initialized to 3, so that is the number printed by the printf.

Nice dissection, which makes it clear to me that what confused me are two things. 1. pointers to arrays 2. array decaying. Thanks. — whatacold, Jul 06 '16 at 08:12

score 7 · Answer 3 · edited May 23 '17 at 12:33

&a + 1 points to the memory immediately after the last element of a, or better said, immediately after the array a, since &a has type int (*)[4] (pointer to an array of four ints). The standard allows constructing such a pointer, but not dereferencing it. As a result, you can use it in subsequent pointer arithmetic.

So the result of *(&a + 1) is undefined. Nevertheless, *(*(&a + 1) - 1) is something more interesting: in practice it evaluates to the last element of a. For a detailed explanation see https://stackoverflow.com/a/38202469/2878070. And just a remark: this hack can be replaced with a more readable and obvious construction, a[sizeof a / sizeof a[0] - 1] (which, of course, should be applied only to arrays, not to pointers).


answered Jul 05 '16 at 11:20

Sergio


Link to [pointer arithmetic](http://stackoverflow.com/questions/394767/pointer-arithmetic). – Ivan Rubinson Jul 05 '16 at 11:29
I still don't understand. a is the address of the first element of the array, i.e the address of the array. What is &a ? What address is that ? It logically should be the address of the pointer that points on the array, but this pointer actually doesn't exist, does it ? – Tim Jul 05 '16 at 11:31
1

@TimF Wrong. `a` is the _name_ of the array, which implicitly _converts_ ("decays") to a pointer to the first element in some contexts, like passing to a function. But it's not the same thing. It has a distinct type (until lost via decay). `sizeof(a)` returns the size of the array, not the size of a pointer to an element. – underscore_d Jul 05 '16 at 11:34
1

@underscore_d So in this context, a and &a is the same, is that correct ? I understand that a is the name of the array and that its value is the address of the array. What I wonder is what value is there in the expression &a since I literally translate it to "the address of the address of the array" but it does not refer to an existing pointer variable. – Tim Jul 05 '16 at 11:37
@TimF `&a` means 'give me a pointer-to-`int[4]`'. That's distinct from `&a[0]`, which is a pointer to `int`. They probably hold the same value in practical cases, but they're distinct types. `a` is something else, the _name/value_ of the array, although it can implicitly convert to `int *` in mentioned contexts. `&a` can't. – underscore_d Jul 05 '16 at 11:39
@underscore_d Thank you, so to paraphrase a and &a are the same value (the address of a[0]) but are not seen the same way by the compiler since one points on the whole array whereas the other one points on the first element of the array ? – Tim Jul 05 '16 at 11:41
@TimF No, `a` is the array itself i.e. the value, `&a` is a pointer to that array, and `&a[0]` is a pointer to its first element. Each is a distinct type. The fact that implicit conversions occur between certain of these types in certain high-profile cases is just an unfortunate piece of historical baggage that serves only to confuse people... but yes, after such conversions, all 3 hold the same address in practicality. But they still don't have the same type! Here's a good argument in favour of C++'s `std::array`, actually, bolting type-safety onto arrays to avoid this mess. – underscore_d Jul 05 '16 at 11:47
1

@underscore_d Thank you, your explanation and this link did help me : http://stackoverflow.com/questions/11552960/about-the-expression-anarray-in-c – Tim Jul 05 '16 at 11:50
@TimF Nice link - I knew there'd be a question about this out there already ;-) You're welcome. – underscore_d Jul 05 '16 at 11:52
1

`&a + 1` to get a pointer is allowed, but because the resulting pointer is out of the declared bounds, dereferencing it causes UB. The code in the OP is pointlessly complex and brittle. – underscore_d Jul 05 '16 at 13:16
@underscore_d I don't believe `&a + 1` is allowed, since `a` is not a member of an array. It is permitted to construct (but not dereference) a pointer one past the end of an array object, as a special case, but I don't think that applies here: you can't infer that `&a + 1` is safe just because you know `a + 4` is safe and might point to the same thing. I'm willing to be shown wrong. – trent Jul 05 '16 at 15:28
2

@trentcl I'm certainly not defending any of these unnecessary pointer acrobatics, so to me, the more of these constructs are UB, the merrier! That said... regarding 'one past the end' of arbitrary pointers: http://stackoverflow.com/a/14505930/2757035 "5.7(4) [...] says: "For the purposes of these operators, **a pointer to a nonarray object behaves the same as a pointer to the first element of an array of length one** with the type of the object as its element type." So, still not worth doing, but legal by the looks of it. – underscore_d Jul 05 '16 at 15:33
@underscore_d Thanks, guess I was mistaken about that. – trent Jul 05 '16 at 15:52
2

This answer should clarify that `*(&a + 1)` causes undefined behaviour. (It does correctly say that `&a + 1` cannot subsequently be dereferenced, but based on comments to other answers, some readers don't realize that `*(&a + 1)` dereferences `&a + 1`) – M.M Jul 06 '16 at 04:23
@M.M: A unary * operator which is balanced by an address-of operator is not regarded as an access to the target; I see no reason to believe the same would not be true of a unary * operator which is balanced by an implicit array-to-pointer decay. The Standard defines "access" as being a read or a write, and the indicated expression makes no attempt to read the past-one element nor any attempt to write it; ergo, it makes no attempt to access it. – supercat Jul 07 '16 at 05:12
1

@supercat the standard explicitly says that only the combination of operators `&x[y]` and `&*x` have special cases. We agree there is no attempt to access memory, "dereference" doesn't mean "access" – M.M Jul 07 '16 at 05:35
@M.M: A search for "dereferenc" in N1570 finds one mention of the term, in footnote 102, and I don't see the term defined anywhere. I also don't see anything explicitly saying there are no other special cases, or that array-to-pointer decay isn't to be regarded as being a form of taking the address of something. – supercat Jul 07 '16 at 06:07
@supercat see C11 6.5.3.2/4 for the language used by the standard ; if a pointer points to an object then `*` operator produces an lvalue designating that object (except for the cases covered in point 3). – M.M Jul 07 '16 at 06:34
@M.M: The language about `+` is probably more relevant, but another issue would be whether, even given `int a[2][5], int (*p)[5]=a;` which would make `*(&p+1)` well-defined, the behavior of `(*(&p+1))+x` would be defined for any values of x outside the range 0 to 4. My suspicion is that array decay yields a type of restricted-usage pointer that supports limited accessing, but writing the expression as `(int*)(*(&p+1))+x` should "detach" the pointer and make it usable to access any element within the original array. – supercat Jul 07 '16 at 14:44
@M.M: A fundamental problem, of course, is that when C was invented things like array indexing and struct member access were *defined* in terms of address manipulation, and C never cleanly distinguished between array indexing (which should only access things within an array) and pointer manipulations (which should be able to yield pointers to things whose location is known relative to that of the array). I think adding a cast should give adequate notice "This code wants to manipulate pointers rather than index an array", but at least in other contexts gcc seems to ignore casts... – supercat Jul 07 '16 at 14:49
...that should serve as notice that the pointer after casting should not be presumed to be used to access "the same object" as the pointer before the cast. – supercat Jul 07 '16 at 14:51

Mark Lakata · Answer 4 · 2016-07-06T16:25:12.580

2

Best to prove it to yourself:

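$ cat main.c
#include <stdio.h>

int main(void)
{
  int a[4];
  /* %p expects a void *, so cast each pointer explicitly */
  printf("a    %p\n", (void *)a);
  printf("&a   %p\n", (void *)&a);
  printf("a+1  %p\n", (void *)(a + 1));
  printf("&a+1 %p\n", (void *)(&a + 1));
  return 0;
}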

And here are the addresses:

$ ./main
a    0x7fff81a44600 
&a   0x7fff81a44600 
a+1  0x7fff81a44604
&a+1 0x7fff81a44610

The first two are the same address. The third is 4 more (which is sizeof(int)). The fourth is 0x10 = 16 more (which is sizeof(a)).

edited Jul 06 '16 at 16:25

answered Jul 05 '16 at 21:21

Mark Lakata


A practical example :) I think the output would be clearer if we `printf` one address per line. – whatacold Jul 06 '16 at 07:56

Vlad from Moscow · Answer 5 · 2016-07-06T13:23:36.097

1

If you have an object of type T, for example

T obj;

then the declaration

T *p = &obj;

initializes the pointer p with the address of the memory occupied by the object obj.

The expression p + 1 points to the memory immediately after the object obj. The value of p + 1 is the address &obj advanced by sizeof( obj ) bytes, which is equivalent to

( T * )( ( char * )&obj + sizeof( obj ) )

So if you have the array shown in your post int a[] = {0, 1, 2, 3}; you can rewrite its declaration using a typedef the following way:

typedef int T[4];

T a = { 0, 1, 2, 3 };

sizeof( T ) in this case is equal to sizeof( int[4] ) and in turn is equal to 4 * sizeof( int )

The expression &a gives the address of the memory extent occupied by the array. The expression &a + 1 gives the address of the memory just after the array; as a byte address, its value is equal to ( char * )&a + sizeof( int[4] ).

On the other hand, an array name used in expressions - with rare exceptions, for example using an array name in the sizeof operator - is implicitly converted to a pointer to its first element.

Thus, the expression &a + 1 points to an imagined element of type int[4] that follows the real first element a. The expression *(&a + 1) designates this imagined element. But since that element is an array of type int[4], the expression is converted to a pointer of type int * to its first element.

This first element follows the last element of the array a. So the expression *(&a + 1) - 1 gives the address of the last element of the array a.

By dereferencing in *(*(&a + 1) - 1) you get the value of the last element of the array a, so the number 3 will be output.

edited Jul 06 '16 at 13:23

answered Jul 05 '16 at 12:43

Vlad from Moscow


3

`&a + 1` is not equal to `&a + sizeof(int[4])` – M.M Jul 06 '16 at 04:19
@M.M Can you explain why not? We assume that for any `T`, `&t + 1 == t + sizeof(T)`. – underscore_d Jul 06 '16 at 07:41
@M.M: I assume you mean `(char*)(&a+1) != ((char*)a)+sizeof (int[4])`? The latter expression you had would be equal to `((char*)a)+sizeof (int[4]) * sizeof (int[4])`, which would clearly not be equal. – supercat Jul 06 '16 at 10:00
@underscore_d `sizeof(int[4])` is some integer, lets say `16` for argument's sake; clearly `x + 1` and `x + 16` are different regardless of what `x` is – M.M Jul 06 '16 at 12:40
@supercat no, I mean that the expression `&a + 1` does not have the same value as the expression`&a + sizeof(int[4])` (this answer claims they do have the same value) – M.M Jul 06 '16 at 12:41
@M.M Whoops. I was thinking in terms of `( (char *)&a[0] ) + sizeof( int[4] )`. You're right of course. The answer as it stands does not take into account the fact that pointer arithmetic adjusts for element size. – underscore_d Jul 06 '16 at 12:41
@M.M In my post there is written clear enough. So your comment does not make sense. Reread my post. – Vlad from Moscow Jul 06 '16 at 13:00
@underscore_d The value of the expression &a + 1 is equal to the value stored in &a + sizeof( int[4] ). – Vlad from Moscow Jul 06 '16 at 13:01
2

@VladfromMoscow People expect when you put things in code tags, that it's actually code. In code, `&a + 1` and `&a + sizeof( int[4] )` are two different things. It's unnecessarily confusing to write one when you really mean the other. You could've just written the `char`-casted version to make it clear that you were speaking conceptually in the second case. – underscore_d Jul 06 '16 at 13:18

score 0 · Answer 6 · answered Jul 05 '16 at 12:25

Note that the following is equivalent, but equally nasty:

printf("%d\n", (&a)[1][-1]);

In this case it is in my opinion more explicit what happens:

a pointer to the array a is taken

the pointer is used as if it pointed into an array whose elements are like a, i.e. arrays of 4 integers

the element at index 1 of this array is used. Since a is not actually an array of such elements, but only one element (consisting of four sub-elements!), this indexes the piece of memory directly after a

the [-1] reads the integer directly preceding the memory directly after a, which is the last sub-element of a

slingeraap


Yup, equally nasty because both invoke UB for no valid reason... unless perhaps to make whoever wrote the snippet feel clever? – underscore_d Jul 05 '16 at 13:21
@underscore_d It's not clear to me that it is undefined behavior to dereference a pointer to the end of an array of arrays, ending up with (in theory if it's not UB) a pointer to the end of the last element, and never at any point accessing any non-array elements. I think that depends on how exactly the "pointer to the end of an array" stuff is worded. – Random832 Jul 05 '16 at 16:16
@Random832 It's definitely not clear. I've since found this - http://stackoverflow.com/questions/38202077/what-does-getting-address-of-array-variable-mean/38203426?noredirect=1#comment63838804_38202157 - showing that 'one past the end' is defined even for non-array types, but I can't tell whether that's relevant or not to this scenario. This might be considered dereferencing 1 past the end to get an array value, which then decays to an `int *` - and while forming a pointer 1 past the end is valid... dereferencing it is never. Dunno. Even if legal, it's a corner case and unnecessary obfuscation – underscore_d Jul 05 '16 at 16:24
Yes, I was referring to that, but of course you can't dereference it. The question is whether a dereference to an array that immediately decays back to a pointer "counts" as a real dereference, and IIRC there are other situations where similar things don't seem to. But, yes, an obscure corner case. – Random832 Jul 05 '16 at 16:27
@underscore_d: Applying things to an inner-level, given `int a[2][5];`, can `(int*)&(a[0][0]);` only be used as a pointer to the first of five consecutive integers, or can it be used as pointer to ten? I would think that `a[0]` is a pointer to an `int[5]`, and could only be used directly to access the first five integers, but `(int*)&(a[0][0])` should "erase" that type information yielding an unrestricted `int*`. While I can see aliasing benefits to not recognizing such behavior, there is semantic usefulness to being able to access an array's contents as a linear sequence and I can't think... – supercat Jul 05 '16 at 17:24
...how else one would practically do that. – supercat Jul 05 '16 at 17:24

sim · Answer 7 · 2016-07-05T11:40:57.907

-1

*(*(&a + 1) - 1)

is an awkward and dangerous way to address the last element of an array. &a is the address of an array of type int[4]. (&a+1) gives the address of the next int[4] array after the currently addressed one, a. Dereferencing it with *(&a + 1) yields an array that decays to an int *, and with the additional -1 you point to the last element of a. Then this last element is dereferenced, and thus the value 3 is returned (in your example).

This works well if the type of the array elements has the same length as the alignment of the target CPU. Consider the case where you have an array of type uint8 and length 5: uint8 ar[]={1,2,3,4,5}; If you do the same now (on a 32-bit architecture), you address an unused padding byte after the value of 5. So ar[5] has an address that is aligned to 4 bytes. The individual elements in ar are aligned to single bytes. I.e., the address of ar[0] is the same as the address of ar itself, the address of ar[1] is one byte after ar (and not 4 bytes after ar), ..., the address of ar[4] is ar plus 5 bytes and thus not aligned to 4 bytes. If you do (&ar+1) you get the address of the next uint8[5] array, which is aligned to 4 bytes, i.e., it is ar plus 8 bytes. If you take this address of ar plus 8 bytes and go one byte back, you read at ar plus 7, which is not used.

edited Jul 05 '16 at 11:40

answered Jul 05 '16 at 11:30

sim


Why the padding byte? Which architecture aligns `char` to 16 bits? x86 doesn't care about the alignment of `char`, and even if it did, you'd assume it'd align to the native word length (32 or 64). – underscore_d Jul 05 '16 at 11:33
1

"the address of ar[4] is ar plus 5 bytes". No, it's not, it's `&ar[0] + 4`. Array indexing is just pointer arithmetic, even if the index is out-of-bounds. And why do you think a `char[4]` has any alignment greater than that of `char` itself? Only the alignment of the element type matters, not the array as a whole, and arrays cannot be demarcated by padding bytes: http://stackoverflow.com/q/13284208/2757035 – underscore_d Jul 05 '16 at 11:56
Since this thread has gone HNQ, I'd like to clarify for new readers the conclusion from the thread linked in my previous comment: it seems clear that _an array is **not allowed** to have alignment different from that of its elements_. So certainly it is _not_ the case that an array of `char[n]` will have `alignof == (n - 1) / sizeof(int) + 1`. It's funny that the example was `char`, which is by definition the type with the weakest alignment and which must be capable of superimposing over all objects to read their representation. I can't begin to imagine how the 2nd paragraph here was dreamt up – underscore_d Jul 05 '16 at 18:53
It seems that no one besides you misunderstood that answer. I stated that array elements can be misaligned regarding the CPU alignment, but the array itself is aligned regarding the CPU alignment. E.g., char[5] ar is aligned to 4 on a 32 bit architecture, thus ar[0] is also aligned, but ar[1] is not. – sim Jul 05 '16 at 19:18
1

No, still nonsense. C/C++ do not align arrays to any size greater than one of their elements _and_ do not permit misalignment of any object relative to the architecture, regardless of whether said object happens to live in an array (excluding people shooting themselves in the foot with casts). For both of C `#include <stdio.h> #include <stdalign.h> int main(void) { printf("%zu", alignof( char[5] )); }` and C++ `#include <iostream> int main(int, char **) { std::cout << alignof( char[5] ) << std::endl; }` => I get **`1`** on a 64-bit architecture. And you? – underscore_d Jul 05 '16 at 19:46
@sim I'm not sure what point you are trying to make, but `char` is usually aligned to 1 not 4. You seem to be thinking that the char array is going to be cast to an int array at some point. The original question was about ints, so I am not sure why you are bringing up chars. – Mark Lakata Jul 05 '16 at 21:30
1

Arrays are not allowed to have padding – M.M Jul 06 '16 at 04:22
OT caveat to my 3rd comment: `std::uint8_t` is _usually_ but **not** required to be a `typedef` for `unsigned char`. So all my talk of special allowance, etc only apply to the 3 (count them) `char` types. Having said that, it doesn't affect the untruths about array alignment and padding here. – underscore_d Jul 06 '16 at 07:36
@underscore_d: A compiler can't enforce an alignment on an array type which is not a multiple of the size of the array. If the Standard doesn't define the effect of casting a pointer to an array type in cases where the pointer in question doesn't identify the first item of an array of that type (and I don't think it does) an implementation could impose alignment requirements on arrays which are coarser than those of the element type. Implementations that do so would be incompatible with some programming techniques that are useful on platforms with sensible aliasing rules, but... – supercat Jul 06 '16 at 10:10
...might not be particularly damaging on platforms whose semantics are already crippled. – supercat Jul 06 '16 at 10:10
