36

Why can't you pass arrays as function arguments?

I have been reading this C++ book that says 'you can't pass arrays as function arguments', but it never explains why. Also, when I looked it up online I found comments like 'why would you do that anyway?' It's not that I would do it, I just want to know why you can't.

Peter Mortensen
Hudson Worden
  • 8
    Please note the body of his text says "**by value**". – GManNickG Aug 16 '11 at 03:04
  • 4
    Probably because copying entire arrays for the sole purpose of passing it by value as a function argument will easily cause stack overflows and/or be hideously slow? – In silico Aug 16 '11 at 03:04
  • 2
    @In silico: Then why are you allowed to create arrays as local variables, which have the same *automatic* lifetime? – Ben Voigt Aug 16 '11 at 03:06
  • "because that's just how C [and C++ because of its roots] works" :-) Of course, with C++, you could use std::vector or whatever... –  Aug 16 '11 at 03:06
  • 1
    @Ben Voigt: I don't see how that's relevant to the issue of passing arrays by value. What I'm trying to get at was that passing arrays by value would surely involve copying its contents into the stack frame, which is a waste of time and memory since passing a pointer to the first element gives you the same ability to access it but in a way that's easier, faster and less memory-intensive. – In silico Aug 16 '11 at 03:08
  • You don't know how large the array is. So how much space is the array going to take up on the stack? But similar to fn(int x, ...); If you could then it would have to be const because adding elements would be trouble. – QuentinUK Aug 16 '11 at 03:13
  • 2
    @In: But it's very unlike C to make such decisions. We're allowed to do almost anything else, what if I want to copy my array? Why is this rule subdued by merely putting the array in a struct? Your argument would apply to such a struct. – GManNickG Aug 16 '11 at 03:16
  • @In silico: I'm addressing your comment about stack overflow. Arrays as local variables cause stack overflow just as quickly as array parameters. Merely passing a pointer (or reference) has very different semantics, both in terms of aliasing, and changes outliving the function call. – Ben Voigt Aug 16 '11 at 05:03
  • @GMan: Not any more it doesn't. – Lightness Races in Orbit Aug 16 '11 at 09:25
  • @Ben Voigt: I see what you mean now. I stand corrected. – In silico Aug 16 '11 at 11:17
  • @GMan: I forgot about arrays in `struct`s. I blame it on my reliance on C++ containers. :-) – In silico Aug 16 '11 at 11:19

6 Answers

47

Why can't arrays be passed as function arguments?

They can:

void foo(const int (&myArray)[5]) {
   // `myArray` is the original array of five integers
}

In technical terms, the type of the argument to foo is "reference to array of 5 const ints"; with references, we can pass the actual object around (disclaimer: terminology varies by abstraction level).
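As a minimal sketch of what the reference preserves, assuming a hypothetical helper named `count_elements` (not part of the original answer): inside the function the parameter still has array type, so `sizeof` sees the whole array rather than a pointer.

```cpp
#include <cstddef>

// Hypothetical helper: takes a reference to an array of exactly five ints.
std::size_t count_elements(const int (&myArray)[5]) {
    // No decay has happened: myArray has type "const int[5]" here, so
    // sizeof(myArray) is the size of all five elements, not of a pointer.
    return sizeof(myArray) / sizeof(myArray[0]);
}
```

Called with `int a[5];`, this should return 5. Calling it with an `int[6]` should fail to compile, because the length is part of the parameter's type.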

What you can't do is pass by value, because for historical reasons arrays cannot be copied. Instead, attempting to pass an array by value into a function (or to pass a copy of an array) causes its name to decay into a pointer. (Some resources get this wrong!)


Array names decay to pointers for pass-by-value

This means:

void foo(int* ptr);

int ar[10]; // an array
foo(ar);    // automatically passing ptr to first element of ar (i.e. &ar[0])

There's also the hugely misleading "syntactic sugar" that looks like you can pass an array of arbitrary length by value:

void foo(int ptr[]);

int ar[10]; // an array
foo(ar);

But, actually, you're still just passing a pointer (to the first element of ar). foo is the same as it was above!

Whilst we're at it, the following function also doesn't really have the signature that it seems to. Look what happens when we try to call this function without defining it:

void foo(int ar[5]);
int main() {
   int ar[5];
   foo(ar);
}

// error: undefined reference to `foo(int*)'

So foo takes int* in fact, not int[5]!

(Live demo.)
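The same point can be checked without relying on a linker error. This is a sketch with hypothetical function names (`takes_pointer`, `takes_unsized` and `takes_sized` are not from the original answer) that compares the declared function types at compile time:

```cpp
#include <type_traits>

// Three declarations that look different but, after parameter adjustment,
// all declare a function of type void(int*).
void takes_pointer(int* ptr);
void takes_unsized(int ptr[]);
void takes_sized(int ptr[5]);

static_assert(std::is_same<decltype(takes_pointer), decltype(takes_unsized)>::value,
              "int ptr[] is adjusted to int*");
static_assert(std::is_same<decltype(takes_pointer), decltype(takes_sized)>::value,
              "int ptr[5] is adjusted to int*; the 5 is ignored");
```

If the array forms were distinct parameter types, these assertions would fail to compile.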


But you can work around it!

You can hack around this by wrapping the array in a struct or class, because the implicitly generated copy constructor will copy the array member:

struct Array_by_val
{
  int my_array[10];
};

void func (Array_by_val x) {}

int main() {
   Array_by_val x;
   func(x);
}

This is somewhat confusing behaviour.
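To make the copy visible, here is a small sketch assuming a hypothetical function `overwrite_first` (not in the original answer). It writes to its parameter, and the caller's object should be unaffected because the whole array member was copied:

```cpp
struct Array_by_val {
    int my_array[10];
};

// Hypothetical function: receives its own copy of the wrapped array.
int overwrite_first(Array_by_val x) {
    x.my_array[0] = 42;    // modifies the copy only
    return x.my_array[0];
}
```

After `Array_by_val original = {};` and a call to `overwrite_first(original)`, `original.my_array[0]` is still 0.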


Or, better, a generic pass-by-reference approach

In C++, with some template magic, we can make a function both re-usable and able to receive an array:

#include <cstddef>  // std::size_t

template <typename T, std::size_t N>
void foo(const T (&myArray)[N]) {
   // `myArray` is the original array of N Ts
}

But we still can't pass one by value. Something to remember.
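A short sketch of the deduction at work, using a hypothetical helper `array_length` (an illustrative name, not from the original answer): the compiler fills in `N` from the argument's type, so one template accepts arrays of any length.

```cpp
#include <cstddef>

// Hypothetical helper: N is deduced from the array type of the argument.
template <typename T, std::size_t N>
constexpr std::size_t array_length(const T (&)[N]) {
    return N;
}
```

For `int a[7];` this should yield 7, and for `char s[] = "hi";` it should yield 3, since the terminating null character is part of the array.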


The future...

And since C++11 is just over the horizon, and C++0x support is coming along nicely in the mainstream toolchains, you can use the lovely std::array inherited from Boost! I'll leave researching that as an exercise to the reader.
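For a taste of it, here is a minimal sketch assuming a C++11 compiler and a hypothetical function `modify_copy` (not from the original answer). `std::array` is a class type wrapping a built-in array, so it can be passed by value like any other object:

```cpp
#include <array>

// Hypothetical function: receives a full copy of the caller's std::array.
int modify_copy(std::array<int, 5> values) {
    values[0] = 42;    // modifies the copy only
    return values[0];
}
```

The caller's array is left unchanged after the call, which is the pass-by-value behaviour that built-in arrays lack.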

Community
Lightness Races in Orbit
14

I see answers explaining "Why doesn't the compiler allow me to do this?" rather than "What caused the standard to specify this behavior?" The answer lies in the history of C. This is taken from "The Development of the C Language" (source) by Dennis Ritchie.

In the proto-C languages, memory was divided into "cells" each containing a word. These could be dereferenced using the eventual unary * operator -- yes, these were essentially typeless languages like some of today's toy languages like Brainf_ck. Syntactic sugar allowed one to pretend a pointer was an array:

a[5]; // equivalent to *(a + 5)
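The equivalence survives in today's C and C++. As a sketch in modern C++ (not the typeless proto-C described above; the array name `cells` is purely illustrative), it can be checked at compile time:

```cpp
// Illustrative data; the name and values are assumptions for this sketch.
constexpr int cells[6] = {10, 11, 12, 13, 14, 15};

// Subscripting is defined in terms of pointer arithmetic plus dereference.
static_assert(cells[5] == *(cells + 5), "a[i] means *(a + i)");
```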

Then, automatic allocation was added:

auto a[10]; // allocate 10 cells, assign pointer to a
            // note that we are still typeless
a += 1;     // remember that a is a pointer

At some point, the auto storage specifier behavior became the default (if you have ever wondered what the point of the auto keyword was, this is it). Pointers and arrays were left to behave in somewhat quirky ways as a result of these incremental changes. Perhaps the types would behave more alike if the language were designed from a bird's-eye view. As it stands, this is just one more C / C++ gotcha.

Dietrich Epp
  • 1
    I'm not convinced that this explains why arrays can't be copied. – Lightness Races in Orbit Aug 16 '11 at 09:11
  • params keyword in c# does this almost! – WinW Aug 16 '11 at 09:14
  • @Tomalak: The reason why they can't be copied is in the standard. This is historical, not normative information. – Dietrich Epp Aug 17 '11 at 01:35
  • @Kiran: Yes, other languages have a more carefully architected difference between reference types and value types that doesn't need to accommodate half a century of legacy code. – Dietrich Epp Aug 17 '11 at 01:37
  • @Dietrich: I'm aware that it's historical; I mean in that context. It's an interesting answer to be sure, but I don't quite see the link between it and why it historically meant no copying of arrays. It may well just be me; can you help? :) – Lightness Races in Orbit Aug 17 '11 at 23:22
  • @Tomalak: Imagine that all variables are machine words. You can dereference them with the `*` operator or access array elements with `x[i]` syntax, which is shorthand for `*(x+i)`. When you pass a variable to a function, you pass the value with no deep copy, so you cannot pass the contents of an array. It is convenient to allocate stack space, hence the `auto` keyword. However, at this point it just initializes a local with a pointer to the stack. It's still a pointer and the memory it points to won't be copied. – Dietrich Epp Aug 18 '11 at 01:01
  • In summary, in proto-C the only purpose of marking a variable as an array was to cause the compiler to automatically allocate storage and initialize the variable to point to that storage. Think of it as a predecessor to modern RAII, rather than a distinct type (because these languages were initially typeless). – Dietrich Epp Aug 18 '11 at 01:08
5

Arrays are in a sense second-class types, something that C++ inherited from C.

Quoting 6.3.2.1p3 in the C99 standard:

Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.

The same paragraph in the C11 standard is essentially the same, with the addition of the new _Alignof operator. (Both links are to drafts which are very close to the official standards. (UPDATE: That was actually an error in the N1570 draft, corrected in the released C11 standard. _Alignof can't be applied to an expression, only to a parenthesized type name, so C11 has only the same 3 exceptions that C99 and C90 did. (But I digress.)))

I don't have the corresponding C++ citation handy, but I believe it's quite similar.

So if arr is an array object, and you call a function func(arr), then func will receive a pointer to the first element of arr.
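The conversion and its exceptions can be observed from C++ as well. The following is a sketch under the assumption that the C++ rules match the C wording quoted above for these cases; the names `arr` and `greeting` are illustrative:

```cpp
#include <type_traits>

int arr[10];
char greeting[] = "abc";   // string literal initializing an array: no conversion

// Operand of sizeof: no conversion, the result is the size of the whole array.
static_assert(sizeof(arr) == 10 * sizeof(int), "sizeof sees the array");

// Operand of unary &: the result points to the array, not to a pointer.
static_assert(std::is_same<decltype(&arr), int (*)[10]>::value,
              "&arr is a pointer to array of 10 int");

// In other expressions, arr is converted to a pointer to its first element.
static_assert(std::is_same<decltype(arr + 0), int*>::value,
              "arr decays to int*");

// The array holds 'a', 'b', 'c' and the terminating null character.
static_assert(sizeof(greeting) == 4, "array initialized from the literal");
```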

So far, this is more or less "it works that way because it's defined that way", but there are historical and technical reasons for it.

Permitting array parameters wouldn't allow for much flexibility (without further changes to the language), since, for example, char[5] and char[6] are distinct types. Even passing arrays by reference doesn't help with that (unless there's some C++ feature I'm missing, always a possibility). Passing pointers gives you tremendous flexibility (perhaps too much!). The pointer can point to the first element of an array of any size -- but you have to roll your own mechanism to tell the function how big the array is.
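The pointer-plus-length idiom described here might look like the following sketch. The helper `sum` and its parameter names are assumptions for illustration; the key point is that the caller must supply the element count by hand:

```cpp
#include <cstddef>

// Hypothetical helper: works for an array of any length, but only because
// the caller passes the length separately. Nothing checks that it is right.
int sum(const int* values, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```

The same function accepts both an `int[5]` and an `int[6]`, which is the flexibility mentioned above. Passing a wrong `count` leads to out-of-bounds reads, which is the "perhaps too much" part.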

Designing a language so that arrays of different lengths are somewhat compatible while still being distinct is actually quite tricky. In Ada, for example, the equivalents of char[5] and char[6] are the same type, but different subtypes. More dynamic languages make the length part of an array object's value, not of its type. C still pretty much muddles along with explicit pointers and lengths, or pointers and terminators. C++ inherited all that baggage from C. It mostly punted on the whole array thing and introduced vectors, so there wasn't as much need to make arrays first-class types.

TL;DR: This is C++, you should be using vectors anyway! (Well, sometimes.)

Keith Thompson
2

Arrays are not passed by value because arrays are essentially continuous blocks of memory. If you had an array you wanted to pass by value, you could declare it within a structure and then access it through the structure.

This itself has implications for performance, because it means you will use more space on the stack. Passing a pointer is faster because the amount of data copied onto the stack is far smaller.

ccozad
  • 4
    "Arrays are not passed by value because arrays are essentially continuous blocks of memory." Um, could I not say the same for structs? – GManNickG Aug 16 '11 at 03:17
  • 2
    @Hudson: Please wait more than 7 minutes before accepting an answer. Allow people to work out which is correct; at least a day. – GManNickG Aug 16 '11 at 03:18
  • Yes, you can say the same for structs. The difference is that arrays are homogeneous and structures are not. I can also say that a normal hammer and a sledge hammer are both hammers, and yet they serve different purposes. – ccozad Aug 16 '11 at 03:20
  • @GMan Agreed, let the debates roll. – ccozad Aug 16 '11 at 03:21
  • 1
    @ccozad: I don't see how homogeneity matters. – GManNickG Aug 16 '11 at 03:29
  • This is what I have just been thinking and I have no facts to confirm this. Please correct me if I'm wrong but is it because an array itself is just a reference to a continuous series of values? – Hudson Worden Aug 16 '11 at 03:34
  • 2
    @Hudson: An array *is* a contiguous collection of elements. Given `int a[10];`, the name `a` has the type `int[10]`, and is a contiguous collection of ten integers. – GManNickG Aug 16 '11 at 03:40
  • Homogeneity was mentioned because, though in some ways similar, an array and a structure are different and serve different purposes. If you want the performance benefits of arrays, they come with the pass-by-reference usage model. – ccozad Aug 16 '11 at 04:26
  • Further research has pointed to the fact that the duality between not being able to pass an array by value, but being able to pass a struct that contains an array by value, comes from backwards compatibility. C established very early that an array was a pointer to the first element. Rather than change basic assumptions (and break older code), the original definition was kept, with additional language elements for when that did not suffice. – ccozad Aug 16 '11 at 04:40
  • 4
    @ccozad: No, an array is *not* a pointer to the first element. Arrays are not pointers; pointers are not arrays. See the quote from the C99 standard in my answer. – Keith Thompson Aug 16 '11 at 15:37
1

I believe the reason C++ did this is that, when the language was created, sending the whole array rather than its address in memory might have taken up too many resources. That is just my assumption on the matter.

Peter Mortensen
nhutto
1

It's because of a technical reason. Arguments are passed on the stack; an array can have a huge size, megabytes and more. Copying that data to the stack on every call will not only be slower, but it will exhaust the stack pretty quickly.

You can overcome that limitation by putting an array into a struct (or using boost::array):

struct Array
{
    int data[512*1024];
    int& operator[](int i) { return data[i]; }
};
void foo(Array byValueArray) { .......... }

Try to make nested calls of that function and see how many stack overflows you'll get!

Peter Mortensen
hamstergene