19

Consider:

int sum(const int numbers[], const int size){
    if (size == 0)
        return 0;
    else
        return numbers[0] + sum(numbers+1, size-1);
}

This is a simple recursive function from MIT 6.096 for adding an arbitrary number of integers, and it works.

The thing I cannot understand is in the last line:

How does numbers+1 work, given numbers[] is an int array and you shouldn't be able to add an integer to an int[] constant?

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
  • 11
    This is C code. Using a recursive function to calculate the sum of an array is really the worst possible example they could have picked. Who writes examples like this? A recursive function is best used on a necessarily recursive algorithm. – tadman Mar 28 '16 at 15:11
  • 7
    Here, `const int numbers[]` is the same as `const int* numbers`: A non constant pointer to constant values –  Mar 28 '16 at 15:12
  • @tadman for some reason, on some compilers, if they spot a tail recursion, they generate faster code than they would for a loop. However, I agree it is not easy reading. – Tom Tanner Mar 28 '16 at 15:57
  • 4
    @TomTanner I'd really like to see some benchmarks for that because it sounds like an urban legend to me. – tadman Mar 28 '16 at 15:58
  • 4
    @TomTanner It doesn't really matter here because this example isn't tail-recursive. – Random832 Mar 28 '16 at 17:29
  • @Random832: Although the conversion to tail recursion is straightforward enough that some optimizers will do it automatically. – Ben Voigt Mar 28 '16 at 19:36
  • this is non trivial conversion to tail recursion, because it has to take in account that + is associative operation. – Luka Rahne Mar 29 '16 at 09:07

5 Answers5

21

how does "numbers+1" work, given numbers[] is an int array and you shouldn't be able to add an integer to an int[] constant?

There's no int[] constant. numbers is decayed to a pointer and numbers+1 is simple pointer arithmetic applied to the parameter passed to the recursive call.

πάντα ῥεῖ
  • 1
  • 13
  • 116
  • 190
  • 14
    "`numbers` is decayed to a pointer". No it isn't. It is declared as a pointer. The token sequence `const int numbers[]`, *when it appears in a formal parameter list*, declares a pointer and not an array. – Ben Voigt Mar 28 '16 at 19:32
  • 2
    From 8.3.5p5: "The type of a function is determined using the following rules. The type of each parameter (including function parameter packs) is determined from its own *decl-specifier-seq* and *declarator* . After determining the type of each parameter, any parameter of type 'array of `T`' or 'function returning `T`' is adjusted to be 'pointer to `T`' or 'pointer to function returning `T`', respectively." – Ben Voigt Mar 28 '16 at 19:34
  • @BenVoigt I feel like πάντα ῥεῖ is saying that `numbers` decays to an `int*` to be passed into `int sum(const int numbers[], const int size)`. If you disagree with that perhaps you should answer separately illustrating the delineation. – Jonathan Mee Mar 28 '16 at 19:56
  • @JonathanMee: Perhaps the initial non-recursive call to `sum` had a real array, which decayed in order to pass the pointer parameter. But `numbers` is not itself an array, it is a pointer, even though array declaration notation was used, because of the adjustment rule I quoted (it's there for backward compatibility with C89). – Ben Voigt Mar 28 '16 at 20:19
  • 3
    @BenVoigt Is right. `numbers` - the function parameter - isn't an array, thus it doesn't get implicitly converted ("decays") to a pointer. It always was a pointer. – Daniel Jour Mar 28 '16 at 21:01
  • @DanielJour So my interpretation of πάντα ῥεῖ's answer is that the array passed into the function decayed *before it was passed* in to `sum`. So for the lifetime of `sum`, `number`s *is* an `int*`. Do we all agree on this? – Jonathan Mee Mar 29 '16 at 00:23
  • 2
    @JonathanMee Sure do, it is in the std (13.1.3): _Parameter declarations that differ only in a pointer `*` versus an array `[]` are equivalent. That is, the array declaration is adjusted to become a pointer declaration. Only the second and subsequent array dimensions are significant in parameter types_ – Chris A Mar 29 '16 at 09:12
11

As a side note to @πάντα ῥεῖ's answer, here are a few clarifications on the terminology:

The following is another way to depict array notation:

The phrase numbers[1] can also be expressed as *(numbers + 1) Where the * operator is said to dereference the pointer address numbers + 1. dereference can be thought of in this case as read the value pointed to by.

So, the code in your example is using pointer arithmetic. The phrase numbers + 1 is pointer notation, pointing to the second int location of the pointer numbers. size - 1 is a count of bytes from the memory location starting at numbers to the end of the array.

As to the meaning of decayed:
Generally, within the context of C array arguments, decay conveys the idea that the array argument experiences loss of type and dimension information. Your const int numbers[] is said (arguably) to decay into an int *, therefore no longer able to provide array size information. (Using the sizeof() macro for example does not provide length of array, but size of the pointer.) This also is the reason a second argument is provided, to convey size information.

However, in the context of this question, the meaning of decay is academic, as pointed out by @Ben Voigt: The token sequence const int numbers[], when it appears in a formal parameter list, declares a pointer and not an array. (It never decayed into a pointer because it was a pointer to begin with.)

Community
  • 1
  • 1
ryyker
  • 22,849
  • 3
  • 43
  • 87
  • 1
    I was about to hijack this question but then found some info [here](http://www.tutorialspoint.com/cprogramming/c_pointer_arithmetic.htm) So C/C++ is smart enough to figure out the actual address given the type of the array? For example if `numbers` is at location `C00` then `numbers+1` isn't actually `C01` but `C04` because an `int` is 4 bytes long? I was a bit suprised you didn't need to do `numbers+1*sizeof(int)` or something. – Celeritas Mar 29 '16 at 08:04
  • 1
    @Celeritas - LOL, hijack? should I be frightened? :) _So C/C++ is smart enough to figure out the actual address given the type of the array?_. True, when a pointer is passed as an argument, the function prototype conveys the type information for that pointer. Regarding: _then numbers+1 isn't actually C01 but C04_. Yes, if in that implementation an int is defined having 32 bits. That is often true, but not always. The address is changed by increments (or decrements) of the _variable type_ it is associated with, however that variable type is defined. – ryyker Mar 29 '16 at 13:17
  • I guess I'm surprised that C/C++ is friendly enough/smart enough to do this automatically. – Celeritas Mar 29 '16 at 21:16
4

As πάντα ῥεῖ says int[] decays to int*.

But this sum function is the poor man's solution, you should prefer accumulate:

cout << accumulate(numbers, next(numbers, size), decay_t<decltype(numbers[0])>{});

Live Example

If you have C++17 and a statically allocated array, such as int numbers[size], you can take advantage of cbegin and cend:

cout << accumulate(cbegin(numbers), cend(numbers), decay_t<decltype(numbers[0])>{});

I've attempted to benchmark the recursive sum against accumulate, however sum runs out of stack space before I am able to reach a vector size with a meaningful difference, making accumulate the clear winner.


I associate the type of accumulate's init agument with the type of numbers' elements: decay_t<decltype(numbers[0])>{}. The reason for this is if someone was to come back and change the type of numbers, and not change the type of accumulate's init argument the accumulation would be assigned to the wrong type.

For example if we use the accumulation line: cout << accumulate(cbegin(numbers), cend(numbers), 0), this is fine for int numbers[]. The problem would arise if we switched to define: double numbers[] = {1.3, 2.3, 3.3, 4.3}; but we failed to change the init argument we would sum doubles into an int. This would result in 10 rather than 11.2: http://ideone.com/A12xin

Community
  • 1
  • 1
Jonathan Mee
  • 37,899
  • 23
  • 129
  • 288
  • 8
    Yeah cos `decay_t` is super-expressive and easy to understand. WTG C++! – Lightness Races in Orbit Mar 28 '16 at 16:14
  • @BarryTheHatchet I'm assuming that your comment was sarcastic, cause in my mind decay_t is not easy to understand. And certainly we could use `0` since we know `numbers` is an `int[]`, but using `decay_t{}` leaves this line agnostic to the type of `numbers`, which is a good thing, for example if it were changed to `float numbers[size]` I wouldn't have to change this line to make sure I was adding `floats` instead of `ints`. – Jonathan Mee Mar 28 '16 at 16:37
  • 3
    Extremely sarcastic. C++ is terrible. – Lightness Races in Orbit Mar 28 '16 at 16:42
  • 1
    @BarryTheHatchet: Yeah, unless you're deliberately trying to obfuscate your code (which IMHO is the only rationale for about 2/3 of C++ 'features'), use a for loop. – jamesqf Mar 28 '16 at 17:55
  • @BarryTheHatchet Ouch, I resemble that remark. This was the best way I knew to associate the type. I've posted a question to see if someone knows a better way: http://stackoverflow.com/questions/36268132/is-there-a-shortcut-to-decltype – Jonathan Mee Mar 28 '16 at 18:07
  • @jamesqf I'm not sure what else is available as far as strongly-typed languages. Is there perhaps an example from another language of how to associate a type in a less "obfuscated" way? – Jonathan Mee Mar 28 '16 at 18:08
  • @JonathanMee You *resemble* that remark? Am I misunderstanding something or should it it be *resent* that remark? – James Adkison Mar 28 '16 at 18:17
  • @JonathanMee: I'm confused; a moment ago you appeared to be agreeing with me. – Lightness Races in Orbit Mar 28 '16 at 18:18
  • 1
    @JamesAdkison: It's a common, humourous, deliberate error. – Lightness Races in Orbit Mar 28 '16 at 18:18
  • @JamesAdkison As someone who's horse is tied to C++'s cart, I'm going where it's going; thus: [I resemble that remark](http://www.urbandictionary.com/define.php?term=I+Resemble+That+Remark+) – Jonathan Mee Mar 28 '16 at 18:20
  • @BarryTheHatchet Okay, then I *wasn't getting it*. Thank you for the clarification. – James Adkison Mar 28 '16 at 18:20
  • @JonathanMee Thank you for the clarification. – James Adkison Mar 28 '16 at 18:22
  • @BarryTheHatchet I *do* agree with you. That's much harder to read than it should be to just associate with the type of `number`'s elements. My linked question was hoping for some clarity. – Jonathan Mee Mar 28 '16 at 18:22
  • After reading about `decay_t` I have the question, is it really necessary in this case? Or is it just a security measure in case the array doesn't contain plain `int`s? What difference would it make, what decay could be necessary? Is it that C++ wouldn't zero-init lvalues? The language can be complicated. – Felix Dombek Mar 28 '16 at 18:24
  • @FelixDombek I could use [`remove_extent_t`](http://en.cppreference.com/w/cpp/types/remove_extent) instead of [`decay_t`](http://en.cppreference.com/w/cpp/types/decay) if I knew that it was defined: `int numbers[]`. But `decay_t` will handle if I'm working with consts or references, as well as calling `remove_extent` on the type of `numbers`. – Jonathan Mee Mar 28 '16 at 18:29
  • 1
    Isn't the extent already removed because you do `decltype(numbers[0])`, so an element in the array which doesn't have an extent? – Felix Dombek Mar 28 '16 at 18:33
  • 1
    @FelixDombek Well said sir. Apparently I can forget everything I did in 3 hours. Yeah, the only reason that I need to use `decay_t` is because the [array subscript operator](http://en.cppreference.com/w/cpp/language/operators#Array_subscript_operator) returns a reference. – Jonathan Mee Mar 28 '16 at 18:44
  • @Jonathan Mee: "Strongly typed"? You're adding up an array of integers, why would you "associate a type" with that, other than by having declared your array to be of type int? – jamesqf Mar 28 '16 at 21:34
  • Is there some reason you avoided `accumulate( numbers, numbers + size, 0 )` in the first example? – M.M Mar 28 '16 at 22:56
  • @M.M Yes, to be agnostic of the type of `numbers` elements. If you use a `0` as `accumulate`'s `init` parameter, you have defined the accumulator as an `int`. This will produce potentially incorrect results in the event that `numbers` is a floating point or is `unsigned`. For example, the desired output here would be 11.2, but using the `0` argument results in an output of 10: http://ideone.com/A12xin The only way to make `accumulate` agnostic to the type of `numbers` is to associate the type of it's accumulator. Hence the: `decay_t{}` – Jonathan Mee Mar 29 '16 at 00:39
  • @jamesqf Associating the type of `accumulate`'s `init` argument with the type of `numbers` is essential to getting the right result. [Here is an example](http://ideone.com/A12xin) of what would happen if someone just changed to `double numbers[]` while leaving the argument to `accumulate` as a `0`, you'd want a 11.2, but you'll get a 10. It is extremely unlikely that someone returning to change the type of `numbers` would remember to change the type of the accumulator as well. Associating the type prevents such problems. Hence the: `decay_t{}` – Jonathan Mee Mar 29 '16 at 00:46
  • @Jonathan Mee: But being a minimally competent programmer also prevents that sort of thing. What I'm saying is that instead of messing around with accumulate or whatever, you just do a simple for loop to add them. – jamesqf Mar 29 '16 at 18:42
2
int sum(int *num,int size)
{
int total=0;
                                   /* function to sum integer array */
if (size <= 0) return(ERROR);
while(size--) total+= *num++;
return total;
}

Is faster, more compact, and error tolerant.

Arif Burhan
  • 507
  • 4
  • 12
  • @TomTanner says that "Some compilers, if they spot a tail recursion, they generate faster code than they would for a loop." So when you say your code is "faster" have you timed that? – Jonathan Mee Mar 29 '16 at 11:07
  • I've updated [my answer](http://stackoverflow.com/a/36265411/2642059) with benchmarking, and I'm certain that your "faster" claim is false. I cannot reach statistical difference between any of these solutions without running out of stack space. In other words, the compiler is turning them all into the exact same solution. – Jonathan Mee Mar 29 '16 at 11:58
  • @Jonathan Mee: Could we say "no slower" instead of "faster"? I would expect a good compiler (in Intel architecture, anyway) to effectively parallelize that construct with SSE instructions. Also remember that if we're talking about large arrays, time reading from memory is important, and it's generally faster to do a linear read. And regardless of execution speed, this is one heck of a lot easier to understand than any alternative. – jamesqf Mar 29 '16 at 18:50
1

numbers is a pointer ; on each iteration the function sum() advances through the array (this is what numbers+1 does), at the same time decreasing size by 1 (--size would work just as well).

When size reaches 0 this is the exit condition, and the recursion ends.

Arif Burhan
  • 507
  • 4
  • 12