
I've written this code and "it works" — but how?

I haven't declared an array in my main function; I simply added a standard integer into a procedure with a pointer. For example, it'll scan an array of 2 or 20 integers by changing how many times the loop will run.

#include <stdio.h>

void test(int *v, int n) {
    int i;
    for (i = 0; i < n; i++) {
        printf("[%d]: ", i);
        scanf("%d", &v[i]); 
    }
    printf("\n\n#############\n\n");
    for (i = 0; i < n; i++) {
        printf("[%d]: %d\n", i, v[i]);
    }
}

int main(void) {
    int array;
    int t = 10; 
    test(&array, t);
    return 0;
}

I didn't really know exactly what I was writing, until I realized it was working. I've tried to search about "array of pointers", or pointers in general but couldn't find any specific answer to this example above. I wish I knew more what exactly to look for.

Jonathan Leffler
Gazob
  • 4
    If this code works, it does so by accident. An int is not an array and treating it as such is undefined behavior. – Retired Ninja Apr 13 '18 at 03:38
  • You are printing the values for the memory used to store the int array (probably 8 bytes = 64bits) and the 2bytes of memory beyond this. Since your program owns this memory it is perfectly ok to read it – Martin Beckett Apr 13 '18 at 03:40
  • Please read https://stackoverflow.com/questions/1641957/is-an-array-name-a-pointer – Ricardo González Apr 13 '18 at 03:41
  • 4
    @MartinBeckett No, it's not perfectly ok to access memory outside of uninitialized object (in the C meaning of the word, not OOP object) even if the memory is accessible to the program, and neither it is ok to increment pointer to point to outside of an object. Undefined behavior... Might work as you expect (and on a PC operating system probably will), but might not (for example optimizer might do funny assumptions which make UB-containing code break). C is not assembler. – hyde Apr 13 '18 at 03:55
  • 4
    @RetiredNinja a pointer to an `int` classifies as address of the first element of array of size 1 (the standard says so). So, one can treat it as an array, it is the out of bound access that is the issue. – Ajay Brahmakshatriya Apr 13 '18 at 04:08
  • 3
    Just an *incredibly* minor (almost nit-picking) clarification to @hyde's otherwise-perfect comment: it's okay to *calculate* the address one element beyond the array, but not to de-reference it. But, on the final sentence, I once saw a sticker "C combines all the speed of assembly language with all the readability of, well, assembly language" :-) – paxdiablo Apr 13 '18 at 04:09
  • 1
    It "works" because C is an extraordinarily flexible language, close to the machine & CPU, and its types are also very "natural", close to the CPU registers and memory. – Déjà vu Apr 13 '18 at 05:57
  • @hyde - perhaps "possible" is better. It is ok to exit your program by blasting the cpu with a shotgun - but perhaps not best practice – Martin Beckett Apr 13 '18 at 16:06

2 Answers


This code works, but only for some values of the word "work" :-)

Your test function wants the address of an integer and an integer count. The main function gives it exactly those things, &array and t. So it compiles just fine.

However, the instant you try to de-reference v[1], or even calculate the address of v[N] where N is neither zero nor one, all bets are off. You have passed an array of exactly one element, and doing either of those things is undefined behaviour.

So, while your code may seem to work(a), that will be entirely by accident and not guaranteed on another implementation, another machine, or even during a different phase of the moon.

You can, of course, fix this by ensuring you don't try to do anything beyond the end of the array, with something like:

int array;
test(&array, 1);

or:

int array[42];
test(array, sizeof(array) / sizeof(*array));

This will ensure the count value passed through to the function matches the size of said array.


For the language lawyers amongst us, C11 Annex J.2 lists these items as being undefined behaviour (the first covers calculating v[N] where N is neither zero nor one, the second covers de-referencing v[1]):

Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that does not point into, or just beyond, the same array object (6.5.6).

Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that points just beyond the array object and is used as the operand of a unary * operator that is evaluated (6.5.6).

That referenced 6.5.6 /8 states:

If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.


(a) The possibility that undefined behaviour sometimes "works" is actually its most insidious feature. If it never worked, it would be so much better.

That's why I have a patent pending(b) on a device that connects electrodes to your private parts that get activated whenever the compiler detects use of said behaviour.

It has increased considerably the quality of the code our shop delivers. Now if only we could retain our staff, that would be excellent :-)


(b) Not really, I seem to recall there's an actual legal issue with claiming patent pending status untruthfully so I need to clarify that this was for humour value only.

Community
paxdiablo

Your code works by "happy accident". If you pick a large enough count, for example, it will most likely segfault. Why? Let's see...

In main(), you declare an integer on the stack:

int array;

On the stack, you've placed 4 bytes,[1] and called it array. Then, you pass the address of that to test:

test(&array, ...);

For simplicity of explanation, since array is the first variable in your program's stack, we'll call it address 0. By the time you get to test()'s first code line (for ...), then, your stack looks something like:

Address   | Variable name   | Value
 16       |  i              |  ???
 12       |  n              |  10
  8       |  v              |  --> 0  (Pointer to 0)
  4       |  t              |  10
  0       |  array          |  ???

Then, during the first iteration of the first for-loop, the code puts integers somewhere with:

scanf("%d", &v[i]);

What address is &v[i]? Let's break down that syntax first. It literally reads:

  1. start from the address held in v,
  2. step to the i'th item (that is, the memory location i * sizeof(*v) bytes past v), which is v[i], or *(v + i),
  3. and finally take the address of that item (&) and pass it to scanf.

Terse syntax, eh? So, when i is

  • 0, this would pass &v[0], or &(*(v + 0)), or address 0 to scanf.
  • 1, this would pass &v[1], or &(*(v + 1)), or address 4 to scanf.
  • 2, this would pass &v[2], or &(*(v + 2)), or address 8 to scanf.
  • ...

At which point you begin to see that you haven't created an array at all, but rather are making ground-beef out of your program's innards. Yum!

Finally, what really helped me grok this detail a while back was seeing the memory addresses on my machine. You might consider printing out the memory addresses of all of your variables. For example, you might annotate main() with:

int main(void) {
    int array;
    int t = 10;

    // %p is the portable conversion for pointers and expects a void *;
    // most implementations print the address in hex.
    // 'array' is uninitialized here, so its value is arbitrary.
    printf("Value of 'array':   %d\n", array);
    printf("Address of 'array': %p\n", (void *)&array);
    printf("Address of 't':     %p\n", (void *)&t);
    test(&array, t);

    return 0;
}

[1] The pedantically minded will correctly complain that 4 bytes is arbitrary and certainly not guaranteed across all architectures. This post is in response to a self-described noob, so there are some intentional pedagogical tradeoffs.

hunteke