How does a C compiler know the end of an array?

Question

I've read several answers to this question but still can't fully understand them. How does the compiler know where an array ends? Suppose an array of 4 ints is located in memory just before another int: can we write array[4] by mistake and get this 5th int? I suppose not, so how does the compiler know there are only 4 elements?

C doesn't have any kind of bounds-checking, the compiler will not stop you from doing bad things like that. It's your responsibility as a programmer to make sure the code doesn't do it. — Some programmer dude, Sep 02 '22 at 09:16
*"it will give us this 5th int"* - it will attempt to access memory past the bounds of the array, which formally invokes *undefined behavior*. — UnholySheep, Sep 02 '22 at 09:18
This is one feature of C which is good and bad at the same time. Good because it gives better performance than automatic runtime bounds checks, and bad because there is no bounds checking. Remember that C was and is used on severely limited targets. As a C programmer, you are responsible for respecting the bounds. — the busybee, Sep 02 '22 at 09:28
@thebusybee: due to its semantics, C allows address aliasing, making it technically impossible to perform bounds checking in some cases, I guess. — , Sep 02 '22 at 09:31
Are you actually trying to ask _how the compiler knows the size of an array_, or _whether the C language performs bounds checking on array access_? — Useless, Sep 02 '22 at 09:39

JonGreen · Accepted Answer · 2022-09-02T11:08:29.027

If you're lucky, the compiler might spot that you're writing beyond the end of the array, and flag it as a warning, but it's not actually a compile-time error.

If you have this code:

static int a[4];
static int b;
// ...    
a[4] = 42;

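On many implementations you would find that b now has the value 42, because b happens to be placed directly after a. Formally, though, writing a[4] is undefined behaviour, and the compiler is free to put b somewhere else.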

Yes, it's that easy to overrun an array in C. There are no guard rails.
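Because the language adds no check of its own, any bounds check has to be written by hand. The following is only a minimal sketch of that idea: the macro `ARRAY_LEN` and the functions `set_checked` and `demo_checked_write` are hypothetical names introduced for illustration, not part of the original answer or of the standard library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical macro: element count of a true array (not of a pointer). */
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical helper: store value at arr[index] only if index is in range.
   Returns true on success, false if the write would overrun the array. */
bool set_checked(int *arr, size_t len, size_t index, int value)
{
    if (index >= len)
        return false;   /* refuse the out-of-bounds write */
    arr[index] = value;
    return true;
}

/* Usage sketch: index 3 is accepted for a 4-element array, index 4 is rejected. */
int demo_checked_write(void)
{
    int a[4] = {0};
    bool last_ok = set_checked(a, ARRAY_LEN(a), 3, 42);  /* last valid index */
    bool past_ok = set_checked(a, ARRAY_LEN(a), 4, 42);  /* one past the end */
    assert(last_ok);
    assert(!past_ok);
    return a[3];
}
```

The length has to travel alongside the pointer because `set_checked` itself cannot recover it from `arr`.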

In fact, this behaviour is explicitly relied upon in some places, although it's not recommended any more. You might declare a struct as follows:

struct comms_block {
    enum comms_block_type block_type;
    size_t len;
    uint8_t variable_data[1];
};

And then, when you wanted to create a comms block of type t, with variable data length len, you would use a function like this:

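struct comms_block *new_comms_block(enum comms_block_type t, size_t len)
{
    /* sizeof(*b) already counts one byte of variable_data, hence the "- 1". */
    struct comms_block *b = malloc(sizeof(*b) + len - 1);
    if (b == NULL)
        return NULL;    /* allocation failed */
    b->block_type = t;
    b->len = len;
    return b;
}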

The function returns a struct comms_block with len bytes of space from variable_data[0] onwards.

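In practice you can index variable_data[] with any value up to (len - 1), even though it's declared as a single-byte array in the struct definition. Formally this is undefined behaviour; since C99, a flexible array member (uint8_t variable_data[];) is the supported way to do this.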
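For comparison, here is a sketch of the same idea written with a C99 flexible array member, which is the standard replacement for this pattern. The enumerator names `COMMS_BLOCK_DATA` and `COMMS_BLOCK_CONTROL` and the function `demo_comms_block` are hypothetical, since the answer does not show the enum's definition or any calling code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical enumerators: the enum's definition is not shown in the answer. */
enum comms_block_type { COMMS_BLOCK_DATA, COMMS_BLOCK_CONTROL };

struct comms_block {
    enum comms_block_type block_type;
    size_t len;
    uint8_t variable_data[];   /* flexible array member: must be the last member */
};

/* Allocates a block with room for len bytes of variable data.
   Returns NULL if the allocation fails; release the block with free(). */
struct comms_block *new_comms_block(enum comms_block_type t, size_t len)
{
    /* sizeof(*b) does not count variable_data, so no "- 1" is needed. */
    struct comms_block *b = malloc(sizeof(*b) + len);
    if (b == NULL)
        return NULL;
    b->block_type = t;
    b->len = len;
    return b;
}

/* Usage sketch: fill all len bytes, read the last one back, then free. */
int demo_comms_block(void)
{
    struct comms_block *b = new_comms_block(COMMS_BLOCK_DATA, 16);
    if (b == NULL)
        return -1;
    for (size_t i = 0; i < b->len; i++)
        b->variable_data[i] = (uint8_t)i;
    int last = b->variable_data[b->len - 1];
    assert(last == 15);
    free(b);
    return last;
}
```

Even in this form the compiler does not check the index against `len`; keeping accesses below `b->len` is still up to the caller.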

[Do I cast the result of malloc?](https://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc) — vmemmap, Sep 02 '22 at 10:36
*You might declare a `struct` as follows ...* That [`struct` hack](https://c-faq.com/struct/structhack.html) has officially been [undefined behavior since 1993](https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_051.html) and has been [outdated and replaced by flexible array members since 1999](http://port70.net/~nsz/c/c99/n1256.html#6.7.2.1p16). — Andrew Henle, Sep 02 '22 at 10:47
I didn't say it was recommended behaviour! I was illustrating how C doesn't enforce indexing. — JonGreen, Sep 02 '22 at 11:03

score 1 · Answer 2 · answered Sep 02 '22 at 09:24

In the scope where the array is defined, its bounds are part of its type, so the compiler knows the length and sizeof yields the size of the whole array.

Where the array is passed as a function argument, it decays to a pointer to its first element: only the starting address is passed, and the compiler knows nothing about the length.
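The difference between the two contexts can be sketched as follows. The function names `count_in_scope`, `sum` and `demo_sum` are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* In the defining scope, sizeof sees the whole array. */
size_t count_in_scope(void)
{
    int a[4] = {1, 2, 3, 4};
    return sizeof(a) / sizeof(a[0]);   /* 4 elements */
}

/* As a parameter, "int arr[]" would mean "int *arr": the length is not
   carried along, so the caller has to pass it separately. */
int sum(const int *arr, size_t len)
{
    assert(arr != NULL || len == 0);
    int total = 0;
    for (size_t i = 0; i < len; i++)
        total += arr[i];
    return total;
}

/* Usage sketch: compute the length where the array is visible, then pass it on. */
int demo_sum(void)
{
    int a[4] = {1, 2, 3, 4};
    return sum(a, sizeof(a) / sizeof(a[0]));
}
```

Inside `sum`, `sizeof(arr)` would give the size of a pointer, not of the caller's array, which is why the explicit `len` parameter is needed.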

This is a notorious source of buffer-overflow bugs.

In some cases, static analysis could let a compiler warn about obvious buffer overflows, but not always.

So, is it possible to print an int not included in the array, if it is located after an int array? I mean, if we do array[4] by mistake? (just to fully understand the situation) – Maria Sep 02 '22 at 09:34
@Maria: this is called undefined behavior. Depending on the implementation it is "possible" or "impossible", whatever that means. – Sep 02 '22 at 09:37

score 1 · Answer 3 · answered Sep 02 '22 at 09:25

Compilers read and interpret the source code (where the array variable is dimensioned to have 4 elements.) Modern compilers (and add-ons) can analyse the source code (as the programmer should) and, through that evaluation, determine if "rules are being broken"...

char a[4]; // set aside 4 bytes (uninitialised)

char a[] = { 'a', 'b', 'c', 'd' }; // set aside 4 bytes (initialised)
// Above is NOT a string!

char a[] = "abc"; // 3 + 1 bytes initialised
// Above IS a string (null-terminated array of chars).

The compiler "sees" this and "knows" how big 'a[]' is.
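That the compiler derives these sizes from the declarations alone can be checked at compile time. This is a sketch assuming a C11 compiler (for `static_assert`); the three arrays are given distinct hypothetical names (`raw`, `letters`, `text`) so that they can live in one file, and `text_size` is likewise a hypothetical helper.

```c
#include <assert.h>   /* static_assert (C11) */

char raw[4];                              /* size given explicitly */
char letters[] = { 'a', 'b', 'c', 'd' };  /* size deduced from the initialiser list */
char text[] = "abc";                      /* 3 characters plus the terminating '\0' */

/* Evaluated during compilation; a wrong size would stop the build. */
static_assert(sizeof(raw) == 4, "explicit dimension");
static_assert(sizeof(letters) == 4, "one byte per initialiser");
static_assert(sizeof(text) == 4, "string literal includes its terminator");

/* Returns the size the compiler recorded for text. */
unsigned long text_size(void)
{
    return (unsigned long)sizeof(text);
}
```

This knowledge exists only at compile time and only where the declaration is visible; nothing is stored in memory to mark where each array ends.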

score -1 · Answer 4 · answered Sep 02 '22 at 15:37

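char a[4] = {'w', 'x', 'y', 'z'}; // the index of 'w' is 0 and the index of the last element is 3
// So you may ask what is stored in a[4]. Here there is no a[4]: the array holds exactly four chars, no '\0' is appended, and reading a[4] is undefined behavior.
// A '\0' (null character) terminator is only present when the array has room for one, as in char a[] = "wxyz"; (5 bytes).
// String functions rely on that terminator to find the end of a string; the compiler does not use it to find the end of an array.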

Hope you got what you asked

MrKlu

As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Sep 08 '22 at 00:44
