10

C and C++ allow structures and objects to be passed to functions by value, but prevent arrays from being passed by value.

Why?

Kazoom

7 Answers

18

In C/C++, internally, an array is passed as a pointer to some location, and basically, it is passed by value. The thing is, that copied value represents a memory address to the same location.

By the way, in C++ a std::vector<T> passed by value is copied, elements included.
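A minimal sketch of the behaviour described above, assuming C++11. The function names (`set_first`, `set_first_copy`, `decay_demo`) are made up for illustration. It shows that the array argument arrives as a copied pointer to the caller's elements, while a vector passed by value arrives as an independent copy.

```cpp
#include <cassert>
#include <vector>

// The parameter is a pointer; an array argument is converted to &a[0].
void set_first(int* p, int value) {
    p[0] = value;   // writes through the copied address into the caller's array
    p = nullptr;    // changes only the local copy of the pointer
}

// The vector is copied, so the caller's elements are untouched.
void set_first_copy(std::vector<int> v, int value) {
    v[0] = value;
}

// Returns true if the caller's vector is unchanged after the by-value call.
bool decay_demo() {
    int a[3] = {1, 2, 3};
    set_first(a, 42);            // a is converted to int* here
    assert(a[0] == 42);          // the callee modified the original array

    std::vector<int> v = {1, 2, 3};
    set_first_copy(v, 42);       // v is copied here
    return v[0] == 1;
}
```

Setting `p` to `nullptr` inside `set_first` has no effect on the caller, which is the sense in which the pointer itself is passed by value.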

Mehrdad Afshari
  • Best answer! I think though it should also emphasize that C/C++ does not *prevent* passing the array by value. – hasen Mar 22 '09 at 14:39
  • 5
    This answer confuses arrays with pointers. It may sound simple, but it indeed causes confusion i think. It's important to keep the concepts of arrays and pointers apart, to understand their fundamental difference in the first place. – Johannes Schaub - litb Mar 22 '09 at 14:46
  • @litb: Well, yeah, it's pretty important to know that it's not precisely the same as passing a pointer from the compiler's point of view (e.g. multidimensional arrays), but it's equally important to emphasize that the actual array address is *copied*, not passed by reference. – Mehrdad Afshari Mar 22 '09 at 14:55
  • An array is indeed more than a pointer, otherwise for char a[1]; sizeof(a) would equal sizeof(char*) instead of being 1. –  Mar 22 '09 at 15:22
  • @Neil: I think that's exactly the thing `litb` pointed out. The real problem is the concept you might want to define as an array. While I admit litb's comment is perfectly valid, I'm not able to explain it in a more precise way. I think it's already obvious what I mean. – Mehrdad Afshari Mar 22 '09 at 15:26
  • Yes, i think that is the fundamental problem. Both of us are right i think. Some talk about the concept of an array, and thus include buffers allocated at the heap and pointed to by a pointer, and some talk about declared arrays as they exist in C and C++ as aggregate objects. – Johannes Schaub - litb Mar 22 '09 at 15:40
  • @litb: I think the problem is that new always returns a char* even for arrays. It would be better if it returned an int (*)[10] when you allocated that, but then you wouldn't be allowed to have dynamic allocations. I think you could hack something together with templates to get typesafe new – Brian R. Bondy Mar 22 '09 at 16:02
  • Brian, yeah i agree, that special handling somewhat knuddles pointers and arrays together (btw, here is the pet peeve http://stackoverflow.com/questions/423823/whats-your-favorite-programmer-ignorance-pet-peeve/484900#484900 :p) – Johannes Schaub - litb Mar 22 '09 at 16:14
  • nice :), I don't think everyone understands that new doesn't return an array it returns a pointer to the first element of an array. The question on this page asks about arrays though, not pointers to the first element of an array. – Brian R. Bondy Mar 22 '09 at 16:19
  • @Brian: All goes back to the definition of the array. Is it the buffer or the concept as litb pointed out? Isn't the pointer of the first element called an array? or is it? Is a vector an array (as it stores a sequence of elements linearly, which is one definition of an array)? – Mehrdad Afshari Mar 22 '09 at 16:33
  • @Mehrdad: I think the definition of an "array type" is a variable that has a pointer to the first element and a known size. Otherwise what you have is just a pointer. – Brian R. Bondy Mar 22 '09 at 16:37
  • A buffer is what an array pointers to, but I don't think of it as part of the array type. – Brian R. Bondy Mar 22 '09 at 16:37
  • It's not part of an array *type*, but it can probably be considered a part of an array. You are certainly right about array type, and the way compiler treats arrays. Anyhow, I think it does not matter that much but the real controversial thing is the definition. – Mehrdad Afshari Mar 22 '09 at 16:42
  • Ya I guess basically if you consider array == array type. – Brian R. Bondy Mar 22 '09 at 16:51
  • Mehrdad, it's quite clear what an array is. a pointer is not an array. a vector is not, a boost::array is not... just a T[N] is. the other things either emulate an array, or model the conceptual array. But they are not real arrays. – Johannes Schaub - litb Mar 22 '09 at 16:51
  • if you say an array is nothing more than a pointer, then that's plain wrong. a lambda expression then is nothing more than a chunk of bits. everything is the same then, if you abstract the language rules away and look at them at the assembler/machine code level – Johannes Schaub - litb Mar 22 '09 at 16:53
  • @litb: Right. That statement should be clarified I think... This one is better. – Mehrdad Afshari Mar 22 '09 at 17:01
  • @litb: Your last comment is terrific. It made me thinking of an answer like "an array is a sequence of electrons that get around the machine :))" – Mehrdad Afshari Mar 22 '09 at 17:23
  • 1
    @Mehrdad: I know what you're trying to say, but I think your current explanation is very close to wrong. The problem is that it blurs the line -- *all* cases where pointer semantics are used instead of value semantics can be explained as "well, it's really value semantics on the underlying address". – j_random_hacker Mar 23 '09 at 03:59
  • Yes, but isn't it the right thing nevertheless?! I mean, considering the question is a design decision on the way C works, and not a how-to question, I think it's OK to give the answer "this is basically the only way *C* works, everything is call by value, but that thing is now an address" – Mehrdad Afshari Mar 23 '09 at 10:28
  • Mehrdad, hehe great i made you think of something fun :p dunno, i thought as you have it now it makes sense - the pointer is passed by value. after all it is, i think. – Johannes Schaub - litb Mar 23 '09 at 16:06
  • C has two things: the "locator value" (= lvalue) and the "value of an expression" (= rvalue), as it terms those in a note. While there are rvalue arrays (arrays wrapped in a struct and returned from a function), the "value of an array in an expression" is intended to be the pointer i believe... – Johannes Schaub - litb Mar 23 '09 at 16:10
  • @Mehrdad: Actually your answer is correct, it's just a question of emphasis. My reading of your answer (and maybe I'm alone :) ) is that you're trying to say "C *does* use value semantics everywhere" (using a broadened definition of "value semantics") which is confusing I think. :/ – j_random_hacker Mar 24 '09 at 13:48
  • #1. "*an array is passed as a pointer to some location*" - That it is not entirely correct in two contexts. 1. An array is not passed as a pointer at all. Before the argument is passed to the function, the array is evaluated to a pointer to its first element. An array, as is, is never passed. 2. An array, given as argument, decays to a pointer to the address of its first element, It does not point to "*some location*" in the manner it would point to an arbitrary/random memory location. – RobertS supports Monica Cellio Jun 11 '20 at 09:25
  • #2. "*it is passed by value.*" - That belongs to the first point of the first complaint. The array is never passed *as it is*. Before the argument is passed to the function, the array is evaluated to a pointer to its first element. – RobertS supports Monica Cellio Jun 11 '20 at 09:25
  • #3. "*The thing is, that copied value represents a memory address to the same location.*" - Also not fully correct. The passed pointer value represents the *same* memory address of the first element (the location) of the array, not a memory address to the same location. The latter suggests that several memory addresses could denote the same location, which is wrong. – RobertS supports Monica Cellio Jun 11 '20 at 09:28
  • @j_random_hacker I would not go as far as you and call this answer correct. It is not a matter of emphasis, when the provided statements are plain incorrect as illustrated in my previous comments. – RobertS supports Monica Cellio Jun 11 '20 at 09:35
11

You can pass an array by value, but you have to first wrap it in a struct or class. Or simply use a type like std::vector.
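A small sketch of the struct-wrapping approach, assuming C++11. The wrapper type `IntArray4` and the function names are hypothetical, chosen only for this example. Because the array is a member, copying the struct copies every element.

```cpp
#include <cassert>

// Hypothetical wrapper: the array is a member, so copying the struct copies it.
struct IntArray4 {
    int data[4];
};

// Takes the wrapper by value and modifies only its own copy.
int sum_after_zeroing_first(IntArray4 arr) {
    arr.data[0] = 0;
    int sum = 0;
    for (int x : arr.data) sum += x;
    return sum;
}

// Returns true if the caller's array is unchanged after the call.
bool wrapper_demo() {
    IntArray4 original = {{1, 2, 3, 4}};
    int sum = sum_after_zeroing_first(original);
    assert(sum == 9);                // 0 + 2 + 3 + 4 in the callee's copy
    return original.data[0] == 1;    // the original still holds 1
}
```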

I think the decision was for the sake of efficiency. One wouldn't want to do this most of the time. It's the same reasoning as why there are no unsigned doubles: there is no associated CPU instruction, so a language like C++ makes the inefficient thing hard to do.

As @litb mentioned: "C++1x and boost both have wrapped native arrays into structs providing std::array and boost::array which i would always prefer because it allows passing and returning of arrays within structs"
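For illustration, here is roughly what that looks like with `std::array` (standardized in C++11). The function names `doubled` and `std_array_demo` are made up for this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Pass by value: the callee gets its own copy of all four elements.
std::array<int, 4> doubled(std::array<int, 4> a) {
    for (std::size_t i = 0; i < a.size(); ++i) a[i] *= 2;
    return a;   // unlike a built-in array, it can also be returned by value
}

// Returns true if the caller's array is unchanged after the call.
bool std_array_demo() {
    std::array<int, 4> original = {{1, 2, 3, 4}};
    std::array<int, 4> result = doubled(original);
    assert(result[3] == 8);
    return original[3] == 4;
}
```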

An array's type carries both the element type and the size, so an array is not the same thing as a pointer to its first element.

Most people think that you have to pass an array as a pointer and specify the size as a separate parameter, but this is not needed. You can pass a reference to the actual array itself while maintaining its sizeof() status.

#include <cassert>

// Here you need the size because you have reduced
// your array to an int* pointing to the first element.
void test1(int *x, int size)
{
  // sizeof gives the size of the pointer, not of the array
  assert(sizeof(x) == sizeof(int*));
}

// This function can take in an array of size 10
void test2(int (&x)[10])
{
  assert(sizeof(x) == 10 * sizeof(int));
}

// Same as test2 but by pointer
void test3(int (*x)[10])
{
  assert(sizeof(*x) == 10 * sizeof(int));
  // Note to access elements you need to do: (*x)[i]
}

Some people may say that the size of an array is not known. This is not true.

int x[10];
assert(sizeof(x) == 10 * sizeof(int));  // 40 where int is 4 bytes

But what about allocations on the heap? Allocations on the heap do not return an array. They return a pointer to the first element of an array. So new is not type safe. If you do indeed have an array variable, then you will know the size of what it holds.
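A hedged sketch of that difference, assuming C++11 for `decltype` and `static_assert`. The function name `heap_demo` and the variable names are made up. The declared array keeps its size in its type, while the array new-expression yields a plain `int*`.

```cpp
#include <cassert>
#include <type_traits>

// Returns true if the heap "array" is seen by sizeof as just a pointer.
bool heap_demo() {
    int stack_array[10] = {};
    int* heap_block = new int[10]();   // pointer to the first of 10 zeroed ints

    // The declared array has type int[10], so its size is known.
    static_assert(std::is_same<decltype(stack_array), int[10]>::value,
                  "the array type includes the size");
    static_assert(sizeof(stack_array) == 10 * sizeof(int),
                  "sizeof sees all ten elements");

    // The new-expression result is an int*, so the size is not in the type.
    static_assert(std::is_same<decltype(heap_block), int*>::value,
                  "new int[10] yields int*");
    bool pointer_sized = sizeof(heap_block) == sizeof(int*);

    assert(heap_block[0] == stack_array[0]);   // both start out zeroed
    delete[] heap_block;
    return pointer_sized;
}
```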

Brian R. Bondy
  • 1
    You see that [10] in the function parameter? That's where the size is coming from. The reference has nothing to do with it. –  Mar 22 '09 at 14:35
  • it's more type safe than the test1 – Brian R. Bondy Mar 22 '09 at 14:37
  • @Neil Butterworth: The size of an array is part of the array type itself. If you wanted variable size you'd be better off with an std::vector. – Brian R. Bondy Mar 22 '09 at 14:39
  • 1
    You said that "people think you need to specify the size". People are right, as your needlessly complex code demonstrates. –  Mar 22 '09 at 14:46
  • It has everything to do with it. Had you said void test2(int x[10]); you would pass nothing more than a simple pointer, and the "10" there is completely ignored and useless (even dangerous). Now the reference accepts the parameter by-reference and avoids conversion of the argument to pointer. – Johannes Schaub - litb Mar 22 '09 at 14:49
  • @Neil Butterworth: Thanks for pointing out what needed clarification. I corrected my description to say "specify the size as a separate parameter". My above code demonstrates that there is a difference between a pointer to the first element and a pointer to an array. – Brian R. Bondy Mar 22 '09 at 14:50
  • But you still need to know the size! My point is that using sizeof() in the function is pointless because you already know the size of the array is 10. –  Mar 22 '09 at 14:53
  • I think the above nicely demonstrates what the difference between a pointer and an array pointer is. The second variant does not pass the size, it is just specifying the type. Which happens to contain the size. It is important to know because someone might use code as in test1 with sizeof. – Brian R. Bondy Mar 22 '09 at 14:57
  • Brian yeah i like that sample too. look at bottom of http://stackoverflow.com/questions/275994/whats-the-best-way-to-do-a-backwards-loop-in-c-c-c/276053#276053 too for a sizeof replacement for arrays. – Johannes Schaub - litb Mar 22 '09 at 15:06
  • worthwhile to mention C++1x and boost both have wrapped native arrays into structs providing std::array and boost::array which i would always prefer because it allows passing and returning of arrays within structs – Johannes Schaub - litb Mar 22 '09 at 15:09
  • @Neil: It is in fact possible to capture the size of the array without specifying it in the function declaration by turning the function into a function template, with the array size as a non-type (i.e. integral) template parameter. – j_random_hacker Mar 23 '09 at 04:02
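
Two points from the comments above can be sketched in code, assuming C++11: an `int x[10]` parameter is adjusted to `int*`, and a function template can deduce the array size from a reference parameter. The names `takes_pointer`, `element_count` and `size_deduction_demo` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Despite the [10], this parameter type is adjusted to int*.
void takes_pointer(int x[10]);
static_assert(std::is_same<decltype(takes_pointer), void(int*)>::value,
              "an int x[10] parameter is really int*");

// N is deduced from the argument's array type, so no size is written out.
template <std::size_t N>
std::size_t element_count(const int (&)[N]) {
    return N;
}

// Returns true if the size of a 3-element array is deduced correctly.
bool size_deduction_demo() {
    int a[10] = {};
    int b[3] = {};
    assert(element_count(a) == 10);
    return element_count(b) == 3;
}
```

Passing an `int*` to `element_count` would not compile, since a pointer carries no size to deduce.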
8

EDIT: I've left the original answer below, but I believe most of the value is now in the comments. I've made it community wiki, so if anyone involved in the subsequent conversation wants to edit the answer to reflect that information, feel free.

Original answer

For one thing, how would it know how much stack to allocate? That's fixed for structures and objects (I believe) but with an array it would depend on how big the array is, which isn't known until execution time. (Even if each caller knew at compile-time, there could be different callers with different array sizes.) You could force a particular array size in the parameter declaration, but that seems a bit strange.

Beyond that, as Brian says there's the matter of efficiency.

What would you want to achieve through all of this? Is it a matter of wanting to make sure that the contents of the original array aren't changed?

Jon Skeet
  • char x[10]; assert(sizeof(x) == 10); – Brian R. Bondy Mar 22 '09 at 14:23
  • Brian is right. arrays in C++ and C89 always have sizes known at compile time :) but i think your answer still has to do with it. there are many situations where an array decays into a pointer. and making arrays pass to functions would for one be dangerous (precisely because of that decay) ... – Johannes Schaub - litb Mar 22 '09 at 14:29
  • and useless to a certain degree, because we would be limited for passing one size (i.e int[42] - but not int[41]). – Johannes Schaub - litb Mar 22 '09 at 14:30
  • i believe your answer points out exactly that, just from a point of view which expects to make it work to accept different array sizes but which would not work in C++/C because it would need to accept array sizes that have different sizes. so i +1 anyway :D – Johannes Schaub - litb Mar 22 '09 at 14:33
  • The size of the array is part of the type itself. If you wanted variable size you'd be better off with an std::vector. – Brian R. Bondy Mar 22 '09 at 14:38
  • @litb: arrays certainly do *not* always have known sizes at compile time! An array allocated on the heap, for instance, can have an arbitrary size that varies depending on the state and inputs of the program. – thehouse Mar 22 '09 at 14:52
  • i meant when the argument has an array type, then that size has to be known. passing an array on the heap around by value wouldn't be possible because one would need the size at compile time. now i see what jon means, thanks thehouse :) – Johannes Schaub - litb Mar 22 '09 at 15:03
  • @thehouse: new loses some type information, i.e. it is not type safe for arrays. It will cast away the array part of the pointer. But if you indeed have a typesafe variable that holds an array, it will always know its size. – Brian R. Bondy Mar 22 '09 at 16:15
  • ...The type of an array includes its size. But it can be reduced to a simple pointer to the first element, hence losing the size. My point is that arrays and pointers are distinct beasts, but often treated together. – Brian R. Bondy Mar 22 '09 at 16:24
  • Is this answer still useful? I'm not convinced it is, but I'm loath to delete it when there's potentially useful information in the comments. Suggestions anyone? – Jon Skeet Mar 22 '09 at 18:26
  • @Jon: I would leave it for the comments. About your last comment: you are right, if the user does not want the contents to be changed, the language (C++, not C) has that facility through the use of the const keyword. – David Rodríguez - dribeas Mar 22 '09 at 20:13
  • Okay, I'll leave it but put a note in the answer to point to the comments :) – Jon Skeet Mar 22 '09 at 20:45
7

I think that there are 3 main reasons why arrays are passed as pointers in C instead of by value. The first 2 are mentioned in other answers:

  • efficiency
  • because there's no size information for arrays in general (if you include dynamically allocated arrays)

However, I think a third reason is due to:

  • the evolution of C from earlier languages like B and BCPL, where arrays were actually implemented as a pointer to the array data

Dennis Ritchie talks about the early evolution of C from BCPL and B, and in particular about how C arrays are implemented, how they were influenced by BCPL and B arrays, and how and why they differ (while remaining very similar in expressions, because array names decay into pointers there).

Michael Burr
3

I'm not actually aware of any languages that support passing naked arrays by value. To do so would not be particularly useful and would quickly chomp up the call stack.

Edit: To downvoters - if you know better, please let us all know.

  • C++ vector<...> objects, when passed by value, are copied. And although this wastes (heap-)memory, it won't "quickly chomp up the call stack". – vog Mar 22 '09 at 14:58
  • The native array inside a vector is passed by reference - it's a pointer. C++ native arrays are exactly the same as C arrays. –  Mar 22 '09 at 15:02
  • MATLAB has value semantics even for arrays. This is one reason why non-trivial matlab programs often use a huge amount of memory, though current versions of the interpreter use copy-on-write to reduce the memory usage. – janneb Mar 22 '09 at 15:04
  • Copy on write is passing by ref (R does the same). The fact the array may grow later (depending on use) is not the issue. –  Mar 22 '09 at 15:07
  • i agree with Neil. i know no language which can pass arrays by value. C# and Java pass them not at all (not even by reference), C++ can pass them by reference only. And C can't pass them at all either, only passing a pointer to their first element. – Johannes Schaub - litb Mar 22 '09 at 15:10
  • though, is passing arrays contained in structs passing-of-arrays? probably one could argue about that hours long :D – Johannes Schaub - litb Mar 22 '09 at 15:12
  • By native, I meant naked - I'll change my answer. –  Mar 22 '09 at 15:17
  • No, as I said, MATLAB has value semantics, not reference semantics. When you call a function, semantically you get copies of the arguments rather than references to them. That the interpreter can defer copying is an implementation detail; perhaps I shouldn't have brought that up, confusing the issue – janneb Mar 22 '09 at 16:54
  • i don't know any language that does this, but certainly there are languages that do so. not saying you are wrong. just saying i don't know one that does so. i imagine it could have benefits where languages do long calcs with the arguments - aliasing could matter there. thanks for the insight janneb – Johannes Schaub - litb Mar 22 '09 at 17:05
  • now that you told me matlab has value semantics, indeed i will have to correct myself and say i know at least one language that does :) – Johannes Schaub - litb Mar 22 '09 at 17:07
  • You can pass arrays in Ada as "in" arguments. This is conceptually like passing them by value. However you can't modify "in" parameters. And certainly, under the sheets, the compiler is passing them by reference for efficiency. – Brian Neal Mar 22 '09 at 17:43
  • @vog: Passing by value consumes stack. Passing vectors by value does not in so much as the memory is indeed allocated on the heap. The vector internals are copied, the copy constructor will allocate new heap memory and copy the arrays. No reference to the original is passed. – David Rodríguez - dribeas Mar 22 '09 at 20:20
  • 'copy the arrays' should be: copy the array contents. @janneb: Copy on write is quite troublesome in multithreaded programs. Either you enforce locking on each access to the element, affecting performance, or you end up playing Russian roulette with pointers. – David Rodríguez - dribeas Mar 22 '09 at 20:23
  • @dribeas: That might be; I'm not arguing that the MATLAB way is better or worse, just saying how it is. – janneb Mar 22 '09 at 21:51
  • @Brian Neal: In Fortran you can set the intent(in) attribute for arguments, which is similar to Ada. But as you say yourself, this is not really value semantics. More like passing objects via const reference in C++. – janneb Mar 22 '09 at 21:52
3

This is one of those "just because" answers. C++ inherited it from C, and had to follow it to keep compatibility. It was done that way in C for efficiency. You would rarely want to make a copy of a large array (remember, think PDP-11 here) on the stack to pass it to a function.

Brian Neal
0

From C How to Program (Deitel), p. 262:

.. "C automatically passes arrays to functions by reference. The array’s name evaluates to the address of the array’s first element. Because the starting address of the array is passed, the called function knows precisely where the array is stored. Therefore, when the called function modifies array elements in its function body, it’s modifying the actual elements of the array in their original memory locations. "
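
A short sketch of what the quote describes, written in C++ to match the other code on this page. The helper name `double_all` is made up. Strictly speaking, what is passed is a copy of the first element's address, which is why the callee's writes land in the original array.

```cpp
#include <cassert>

// Hypothetical helper: doubles n elements starting at values.
void double_all(int* values, int n) {
    for (int i = 0; i < n; ++i) values[i] *= 2;
}

// Returns true if the caller's array was modified in place.
bool modify_demo() {
    int a[3] = {1, 2, 3};
    int* first = a;            // the array name converts to the first element's address
    assert(first == &a[0]);
    double_all(a, 3);          // the callee works on the original storage
    return a[0] == 2 && a[1] == 4 && a[2] == 6;
}
```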

this helped me, hope it helps you too