At school, we had to write a small program that looks at the first 2 bytes of a file to check whether it starts the way a JPEG file should. (It's a UNIX beginner course.)
One group offered the following solution (error handling removed):
    #include <stdio.h>
    #include <unistd.h>
    #include <fcntl.h>

    int main(int argc, char** argv) {
        int fd = open(argv[1], O_RDONLY);
        unsigned char buffer[2];
        /* Note the &buffer here: this is the call the question is about. */
        ssize_t readBytes = read(fd, &buffer, 2);
        if (buffer[0] == 0xFF && buffer[1] == 0xD8) {
            printf("The file starts with a JPEG header\r\n");
        } else {
            printf("The file does not start with a JPEG header\r\n");
        }
        return 0;
    }
It compiles without any warnings (even with -Wincompatible-pointer-types) and works as it should, but not how I would expect it to.
I've learned that, in C++ at least, an array is actually just a pointer to the first element of the array, and that the array[index] syntax is actually just *(array+index).
So I compiled this version of the program and also one where I just pass buffer, since I expect it to be the array address and thus what read requires. It turns out both programs look exactly the same when disassembled.
So what's going on here? I have been discussing this problem with a colleague, and our guess is that the array is just some space on the stack that is laid out at compile time, so the compiler uses fixed addresses wherever buffer[index] is used. To mimic arrays that were dynamically allocated on the heap, GCC would then use the address of buffer[0] wherever plain buffer is used. But why would &buffer even be legal syntax?
This whole thing really confuses me, and I'd really like to know what is actually happening here.