At school, we had to write a small program that looks at the first 2 bytes of a file to check whether it starts the way a JPEG file should. (It's a UNIX beginner course.)

One group offered the following solution (removed error handling):

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char** argv) {

    int fd = open(argv[1], 0);
    unsigned char buffer[2];
    ssize_t readBytes = read(fd, &buffer, 2);

    if(buffer[0] == 0xFF && buffer[1] == 0xD8) {

        printf("The file starts with a JPEG header\r\n");
    } else {

        printf("The file does not start with a JPEG header\r\n");
    }
}

It compiles without any warnings (even with -Wincompatible-pointer-types) and works as it should, but not how I would expect it to.

I've learned that, in C++ at least, an array is actually just a pointer to the first element of the array, and that the array[index] syntax is actually just *(array+index).

So I compiled this version of the program and also one where I pass just buffer, since I expect that to be the array's address and thus what read requires. It turns out both programs look exactly the same when disassembled.

So what's going on here? I discussed this with a colleague, and our guess is that the array is just some space on the stack that is laid out at compile time, so the compiler uses fixed addresses wherever buffer[index] appears. To mimic arrays that were dynamically allocated on the heap, GCC would then use the address of buffer[0] wherever plain buffer is used. But why would &buffer even be legal syntax?

This whole thing really confuses me, and I'd really like to know what is actually happening here.

gsamaras
Link64
  • *an array is actually just a pointer to the first element of the array* - No, it is not. – Eugene Sh. Jun 01 '17 at 15:02
  • Though in this case you do not notice any difference, it becomes clear when you write `&buffer+1`: this adds "one time the size of the array" and thus points one past the end of the array, where `buffer[2]` would be (which does not exist). It has to do with type semantics, as a C guru explained to me. (Guru, please explain this to him better.) – Paul Ogilvie Jun 01 '17 at 15:04
  • In C, an array and a pointer to an array are often mixed up, confused and mistaken as equal. I suggest you read [this post](https://stackoverflow.com/questions/1461432/what-is-array-decaying/1461449#1461449) – Badda Jun 01 '17 at 15:04

1 Answer

> I've learned that, in C++ at least, an array is actually just a pointer to the first element of the array

Not at all. An array is a type distinct from a pointer.

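The address of an array coincides with the address of its first element, and an array implicitly converts (decays) to a pointer to its first element, hence `array`, `&array[0]` and `array + 0` all compare equal. `&array` is legal as well: it yields a pointer to the whole array (type `unsigned char (*)[2]` for your `buffer`), which holds the same address but has a different type. Since `read` takes a `void *`, both `buffer` and `&buffer` convert to it implicitly, which is why both versions compile without warnings and pass the same address.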

See array-to-pointer decay:

> There is an implicit conversion from lvalues and rvalues of array type to rvalues of pointer type: it constructs a pointer to the first element of an array. This conversion is used whenever arrays appear in context where arrays are not expected, but pointers are.

Maxim Egorushkin