24

Are arrays and pointers implemented differently in C and C++? I ask because in both cases we access elements starting from the address of the first element, so there should be a close relation between them. Please explain the exact relation between them. Thanks.

0___________
yashu
  • From a low-level assembly point of view, an array is nothing more than an allocation of memory where commonly the first spot defines how much space follows. At that point, accessing an element is merely an offset from the starting address. A pointer, on the other hand, is a memory location holding another memory location. So when you access the pointer, what you get is the value stored at that memory address, which is merely another memory address pointing to the actual data. If you have a pointer to an array, then the memory address for the pointer stores the start of the array (mentioned above). – Chris Oct 18 '10 at 13:44
  • You might find this question interesting: [So you think you know pointers (and arrays)?](http://stackoverflow.com/questions/232303/so-you-think-you-know-pointers). – Cristian Ciupitu Oct 18 '10 at 13:50

7 Answers

71

Let's get the important stuff out of the way first: arrays are not pointers. Array types and pointer types are completely different things and are treated differently by the compiler.

Where the confusion arises is from how C treats array expressions. N1570:

6.3.2.1 Lvalues, arrays, and function designators

...
3 Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.

Let's look at the following declarations:

int arr[10] = {0,1,2,3,4,5,6,7,8,9};
int *parr = arr;

arr is a 10-element array of int; it refers to a contiguous block of memory large enough to store 10 int values. The expression arr in the second declaration is of array type, but since it is not the operand of & or sizeof and it isn't a string literal, the type of the expression becomes "pointer to int", and the value is the address of the first element, or &arr[0].

parr is a pointer to int; it refers to a block of memory large enough to hold the address of a single int object. It is initialized to point to the first element in arr as explained above.

Here's a hypothetical memory map showing the relationship between the two (assuming 16-bit ints and 32-bit addresses):

Object           Address         0x00  0x01  0x02  0x03
------           -------         ----------------------
   arr           0x10008000      0x00  0x00  0x00  0x01
                 0x10008004      0x00  0x02  0x00  0x03
                 0x10008008      0x00  0x04  0x00  0x05
                 0x1000800c      0x00  0x06  0x00  0x07
                 0x10008010      0x00  0x08  0x00  0x09
  parr           0x10008014      0x10  0x00  0x80  0x00

The types matter for things like sizeof and &; sizeof arr == 10 * sizeof (int), which in this case is 20, whereas sizeof parr == sizeof (int *), which in this case is 4. Similarly, the type of the expression &arr is int (*)[10], or a pointer to a 10-element array of int, whereas the type of &parr is int **, or pointer to pointer to int.

Note that the expressions arr and &arr will yield the same value (the address of the first element in arr), but the types of the expressions are different (int * and int (*)[10], respectively). This makes a difference when using pointer arithmetic. For example, given:

int arr[10] = {0,1,2,3,4,5,6,7,8,9};
int *p = arr;
int (*ap)[10] = &arr;

printf("before: arr = %p, p = %p, ap = %p\n", (void *) arr, (void *) p, (void *) ap);
p++;
ap++;
printf("after: arr = %p, p = %p, ap = %p\n", (void *) arr, (void *) p, (void *) ap);

the "before" line should print the same values for all three expressions (in our hypothetical map, 0x10008000). The "after" line should show three different values: 0x10008000, 0x10008002 (base plus sizeof (int)), and 0x10008014 (base plus sizeof (int [10])).

Now let's go back to the second paragraph above: array expressions are converted to pointer types in most circumstances. Let's look at the subscript expression arr[i]. Since the expression arr is not appearing as an operand of either sizeof or &, and since it is not a string literal being used to initialize another array, its type is converted from "10-element array of int" to "pointer to int", and the subscript operation is being applied to this pointer value. Indeed, when you look at the C language definition, you see the following language:

6.5.2.1 Array subscripting
...
2 A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero).

In practical terms, this means you can apply the subscript operator to a pointer object as though it were an array. This is why code like

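#include <stddef.h>  /* size_t */

/* Sum the first size elements reachable through p. */
int foo(int *p, size_t size)
{
  int sum = 0;
  size_t i;  /* same type as size, avoiding a signed/unsigned comparison */
  for (i = 0; i < size; i++)
  {
    sum += p[i];  /* p[i] is *(p + i) */
  }
  return sum;
}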

int main(void)
{
  int arr[10] = {0,1,2,3,4,5,6,7,8,9};
  int result = foo(arr, sizeof arr / sizeof arr[0]);
  ...
}

works the way it does. main is dealing with an array of int, whereas foo is dealing with a pointer to int, yet both are able to use the subscript operator as though they were both dealing with an array type.

It also means array subscripting is commutative: assuming a is an array expression and i is an integer expression, a[i] and i[a] are both valid expressions, and both will yield the same value.

John Bode
  • 2
    +1 especially for `arr` and `&arr` – pmg Oct 18 '10 at 15:48
  • Purr-fect. Can't upvote more than once, though. – Daniel Fischer Feb 22 '13 at 22:31
  • 2
    Just curious, why did you use a hypothetical example of `sizeof(int)==2`, which, while allowed by the standard, has not been seen on any mainstream platform/compiler in ages? – Baruch Jun 23 '14 at 08:14
  • 4
    @baruch: mostly to make the example memory map a reasonable size. And because I'm old. – John Bode Jun 23 '14 at 13:52
  • 1
    Can you please give example of " or is a string literal used to initialize an array" ? – Suraj Jain Dec 27 '16 at 10:56
  • And also Why Does this not work `char p[6];` `p = "hello" ` – Suraj Jain Dec 27 '16 at 11:04
  • I just linked your nice answer to another question. Are you aware that your memory layout example is big-endian? Was that intentional? I haven't seen big-endian CPUs anymore since my days where I worked on SGI workstations but that's long ago... ;-) – Scheff's Cat Mar 05 '19 at 08:04
23

Don't know about C++. For C, the c-faq answers much better than I ever could.

Small snippet from c-faq:

6.3 So what is meant by the ``equivalence of pointers and arrays'' in C?

[...]

Specifically, the cornerstone of the equivalence is this key definition:

A reference to an object of type array-of-T which appears in an expression decays (with three exceptions) into a pointer to its first element; the type of the resultant pointer is pointer-to-T.

[...]

Lightness Races in Orbit
pmg
8

In C++, according to the C++ Standard (C++03, 4.2 Array-to-pointer conversion):

An lvalue or rvalue of type “array of N T” or “array of unknown bound of T” can be converted to an rvalue of type “pointer to T.” The result is a pointer to the first element of the array.

Kirill V. Lyadvinsky
  • @Kirill: I wonder what an rvalue of array type is. Can you come up with an example where the expression is of array type but is an rvalue? – Armen Tsirunyan Oct 18 '10 at 16:22
  • @Armen: I think that the ternary operator might do that. – Ben Voigt Oct 18 '10 at 17:48
  • @Ben: Could you provide a specific example please? – Armen Tsirunyan Oct 18 '10 at 18:04
  • @Armen: I guess that it is still an lvalue. http://ideone.com/WfLiQ – Ben Voigt Oct 18 '10 at 18:16
  • @Ben: I know, that's why I requested an example. It is lvalue if second and third operands have the same types and are both lvalues. However if second and third arguments are lvalues of array type with different sized/bounds and therefore types, the result is an rvalue, but rvalue of pointer type, not array type. Which leaves my request for an example of an array rvalue still open. – Armen Tsirunyan Oct 18 '10 at 18:20
  • @Armen, check [this](http://stackoverflow.com/questions/3656726/array-and-rvalue) question. – Kirill V. Lyadvinsky Oct 18 '10 at 19:05
  • @Kirill: That's what I thought. Then why didn't they phrase "an lvalue of type..." rather than "an lvalue or rvalue of type..."? – Armen Tsirunyan Oct 18 '10 at 19:23
  • 1
    @Armen: I see [you found your array rvalue](http://stackoverflow.com/questions/4058151/i-think-i-may-have-come-up-with-an-example-of-rvalue-of-array-type). – Ben Voigt Nov 02 '10 at 05:55
  • 1
    @Ben: Yes, a couple of days ago, I'm proud of it :) – Armen Tsirunyan Nov 02 '10 at 06:24
8

No, they are not implemented differently. Both find elements with the same calculation: a[i] is at byte address a + i*sizeof(a[0]), and likewise p[i] is at byte address p + i*sizeof(p[0]).

But they are treated differently by the type system. C++ keeps type information on arrays, which can be seen through the sizeof operator (as in C), template argument deduction, function overloading, RTTI, and so on. Basically, anywhere in the language that type information is used, pointers and arrays can behave differently.

There are many examples in C++ where two different language concepts can share the same implementation. Just a few: arrays vs pointers, pointers vs references, virtual functions vs function pointers, iterators vs pointers, for loops vs while loops, exceptions vs longjmp.

In every case, there's a different syntax and a different way of thinking about the two concepts, but they result in the same machine code in the end.

Ben Voigt
6

In C++ (and in C too I think), an array is not a pointer and that can be proven in the following way.

#include <iostream>
int main()
{
   char arr[1000];
   std::cout << sizeof arr;
}

If arr were a pointer, this program would print sizeof (char*), which is typically 4 or 8. But it prints 1000.

Another proof:

template <class T>
void f(T& obj)
{
   T x = obj; //this will fail to compile if T is an array type
}

int main()
{
   int a[30] = {};
   int* p = 0; 
   f(p); //OK
   f(a); //results in compile error. Remember f takes by ref therefore needs lvalue and no conversion applies
}

Formally, an array is converted to a pointer to its first element by the array-to-pointer conversion; that is, when an lvalue of array type is given in a context where an rvalue is expected, the array is converted to a pointer to its first element.

Also, a function declared to take an array by value is equivalent to a function taking a pointer, that is

void f(int a[]);
void f(int a[10]);
void f(int* a);

are three equivalent declarations. HTH

Armen Tsirunyan
  • 1
    `void f(int a[]);`, `void f(int a[10]);` and `void f(int* a);` are absolutely equivalent declarations in C too :) – pmg Oct 18 '10 at 15:14
  • I know they're equivalent in C, but I thought the C++ compiler was smart enough to distinguish them. In fact there's a difference only when the array is passed by reference, so in this case there's no protection whatsoever against passing the wrong size buffer even when the prototype specifies the required buffer. – Ben Voigt Oct 18 '10 at 15:22
  • @Ben: What the prototype (the correct term is declaration) specifies (as size) is completely ignored by the compiler. The parameter is a pointer – Armen Tsirunyan Oct 18 '10 at 15:24
  • @Armen: On the other hand, the so-called inability to pass an array by reference in your second example is complete nonsense. The only compile error you get is from using the wrong initializer syntax for an array. See http://ideone.com/kAqmX – Ben Voigt Oct 18 '10 at 16:09
  • @Ben: Please refrain from words like nonsense. It is not nonsense. As is clearly seen from my answer, the second example, just like the first one, demonstrates that an array is not a pointer. If an array were a pointer, the template function would be instantiated for T = int* and would compile successfully. However, since the parameter is passed by reference, the second call to f instantiates f for T = int[30], which results in a compile-time error because of the wrong initialization syntax. If the parameter were passed by value, the second call to f would use the already instantiated f for T = int* – Armen Tsirunyan Oct 18 '10 at 16:14
  • @Armen: The comment in that code `f(a); //results in compile error. Remember f takes by ref therefore needs lvalue and no conversion applies` is *wrong*. As you say in your comment, `T` is inferred as `int[30]`. The compiler error has nothing to do with lvalues or conversions. This can easily be seen by changing the parameters to `const T&`, so that an r-value is accepted. But `T` is still inferred as `int[30]`. – Ben Voigt Oct 18 '10 at 17:37
  • @Ben: The comment is correct. If a function takes by value and an lvalue is passed, lvalue-to-rvalue conversions take place (including array-to-pointer). If a function takes by reference (including a reference to const) and an lvalue is passed, no such conversion takes place. I do know that rvalues can be bound to references-to-const, but that fact has nothing to do with the examples. Any further objections? – Armen Tsirunyan Oct 18 '10 at 17:52
  • @Armen: Well then the comment is unclear. It makes it sound as if the function needs a pointer lvalue, but an array can only become a pointer rvalue, thus conversion fails, when actually the function takes an array lvalue just fine. I would propose the following instead: Inside f, a comment on the line `T x = obj;` that says this assignment works fine when T is a pointer, but is a compiler error for arrays, and then instead of the comment we've been discussing, just say that `T` is inferred as the array type `int [30]`. – Ben Voigt Oct 18 '10 at 18:11
  • @Armen: You claim that If a function takes by reference (including a reference to const) and an lvalue is passed, no such conversion takes place. Here is a short, clear counter-example: http://ideone.com/hOF3l – Ben Voigt Oct 18 '10 at 18:19
  • @Ben: Adding a comment at T x = obj; is a good idea. Also I may consider changing the wording of my comment to be more precise. As for your counterexample, well instead of "when an lvalue is passed" read "when an lvalue of the same type is passed". I assumed that it was obvious what I meant, apparently it was not. – Armen Tsirunyan Oct 18 '10 at 18:25
  • @Armen: I guess all that I'm saying is this: That example is very good to show how the compiler does template type inference differently between pointer and array. It's really not about conversions. – Ben Voigt Oct 18 '10 at 18:57
  • @Ben: This "debate" is starting to get philosophical rather than technical. Let's leave the example alone. I considered your comments and decided to add a comment to line T x = obj; but leave everything else as is. – Armen Tsirunyan Oct 18 '10 at 19:26
  • @pmg, they are equivalent, but only as formal parameter declarations, and there's an excerpt in the standard stating that as an exception to the rule. The standard says all get implemented as pointers, but what happens if you printf the sizeof of the pointed-to expression? – Luis Colorado Sep 19 '14 at 12:59
2

In C++, an array type has a "size attribute", so for

T a[10];
T b[20];

a and b have different types.

This allows code like this

template<typename T, size_t N>
void foo(T (&a)[N])
{
   ...
}
Abyx
1

The biggest point of confusion between arrays and pointers comes from K&R's decision to make function parameters which are declared as being array type behave as though they were declared as pointers. The declarations

void foo(int a[]);

and

void foo(int *a);

are equivalent, as is (so far as I can tell)

void foo(int a[5]);

though I'm not positive a compiler would be required to accept a reference to a[6] within the latter function. In other contexts, an array declaration allocates space for the indicated number of elements. Note that given:

typedef int foo[1];

any declaration of an object of type foo will allocate space for one element, but any attempt to pass such an object as a function parameter will instead pass its address. Something of a useful trick I learned in studying a va_list implementation.

supercat
  • the three function declarations are identical: They all take a pointer to int. And any array or reference to array (of type int, naturally) or pointer to int can be passed to that function, because the function takes a pointer by value. – Armen Tsirunyan Oct 18 '10 at 16:25
  • 1
    @Armen Tsirunyan: I knew all three were identical in practice on all the compilers I've used; I didn't know whether a compiler would be allowed to check subscripts for validity; it seems goofy to have array bounds specifications that are totally ignored. Is the bound required to be non-negative? – supercat Oct 18 '10 at 17:42
  • Good question, never thought of it. I cannot find the relevant clauses in the standard right now but both MSVC9.0 and online Comeau mandate that the bound be positive – Armen Tsirunyan Oct 18 '10 at 17:57