
I wrote a program to see what happens when writing out of bounds, once with global arrays and once with local arrays.

Case 1:

/* Local and global Aoob demo*/

uint8  problematic_array_global[5] = {22,34,65,44,3};
uint16 adjacent_array_global[3] = {82,83,84};

void aoob_example()
{
/*Global*/
    printf("            Before global Aoob , adjacent_array={%u,%u,%u}\n",adjacent_array_global[0],adjacent_array_global[1],adjacent_array_global[2]);
    
    uint8 idx = 0;
    
    for(;idx < 15;idx++)
    {
        problematic_array_global[idx] = idx;
    
    }
    printf("            After global Aoob  , adjacent_array={%u,%u,%u}\n",adjacent_array_global[0],adjacent_array_global[1],adjacent_array_global[2]);
}

And got the result:

Before global Aoob , adjacent_array_global={82,83,84}
After global Aoob  , adjacent_array_global={2312,2826,3340}

Case 2:

void aoob_example()
{
    uint8 problematic_array[5] = {1,2,3,4,5};
    uint8 adjacent_array[3] = {12,13,14};
    
    printf("            Before Aoob var=%u , adjacent_array={%u,%u,%u}\n",var,adjacent_array[0],adjacent_array[1],adjacent_array[2]);
    uint8 idx = 0;
    
    for(;idx < 8; idx++)
    {
        problematic_array[idx] = idx;
    
    }
    printf("            After Aoob var =%u , adjacent_array={%u,%u,%u}\n",var,adjacent_array[0],adjacent_array[1],adjacent_array[2]);
}

And got the result:

Before Aoob var=10 , adjacent_array = {12,13,14}
After Aoob var =10 , adjacent_array = {12,13,14}

With the local arrays, the out-of-bounds writes seem to have had no effect on the adjacent array, even though I declared the two local arrays next to each other and ran the loop for 8 iterations.

What is the difference between the two declarations? How is an array stored in memory in this program? What happened here? How should I understand this behavior in C?

Nimantha
ahait
  • The only difference is that the compiler generates different assembly code. Because the code itself is not valid, its behavior is not defined, so we cannot reason about what happens or explain the behavior. You can only inspect the assembly. – KamilCuk Jan 15 '22 at 12:59
  • Re “How is an array stored in memory in this program? What happened here? How to understand this behavior in c?”: There are no standard rules about how compilers arrange things in memory. It is like putting groceries in bags at the grocery store. If you watch people buy the same things at the store and put them in bags and take them home, will those things end up arranged the same way in different people’s homes? No, not generally; each person will put things on their own shelves differently… – Eric Postpischil Jan 15 '22 at 13:11
  • … When you give the compiler a bunch of things in source code, it stores them temporarily in its own metaphorical “shopping bags” as it analyzes the source code, figuring out how much memory it needs and what requirements it must satisfy. After doing that work, it assigns various memory locations to things. Those assignments can be affected by the sizes of things, by their alignment requirements, by arbitrary features of their names (as the compiler may have kept them sorted by name or hashed by name or something else), interactions with other source code features, and more… – Eric Postpischil Jan 15 '22 at 13:13
  • … So generally you cannot expect two arrays declared one after the other to be placed into memory one after the other. You are simply on the wrong track expecting to find an array overrun to affect a particular other array. – Eric Postpischil Jan 15 '22 at 13:14

2 Answers


Of two adjacent arrays, only one can overrun into the other when it is written out of bounds. Which of the two comes first in memory is not defined.

That is one factor. The other is compiler optimization, along with the warnings. With global arrays and -O1 I get:

 mov    BYTE PTR [rip+0x2ec6],0x0        # 4030 <problematic_array>
 mov    BYTE PTR [rip+0x2ec0],0x1        # 4031 <problematic_array+0x1>
 mov    BYTE PTR [rip+0x2eba],0x2        # 4032 <problematic_array+0x2>
 mov    BYTE PTR [rip+0x2eb4],0x3        # 4033 <problematic_array+0x3>
 mov    BYTE PTR [rip+0x2eae],0x4        # 4034 <problematic_array+0x4>
 mov    BYTE PTR [rip+0x2ea8],0x5        # 4035 <adjacent_array>
 mov    BYTE PTR [rip+0x2ea2],0x6        # 4036 <adjacent_array+0x1>
 mov    BYTE PTR [rip+0x2e9c],0x7        # 4037 <adjacent_array+0x2>

So the compiler knows exactly what is happening, and there is a warning:

warning: iteration 5 invokes undefined behavior

When problematic_array does not happen to write into adjacent_array, it writes over whatever else is located there, with unpredictable consequences.

With local arrays the overrun lands in the stack region (which can end in "stack smashing detected"), but the basic problem is the same. Anything could be overwritten.

Here the compiler gets a chance to warn because the loop bounds are constants. Normally an out-of-bounds access is more subtle and not easily predicted by the compiler.

With local arrays


Accessing problematic_array with index 5 is invalid. The array is accessed "out-of-bounds". The array has a size of 5, so you can access elements with indexes 0 to 4.

How to understand this behavior in C?

This class of behavior is called "undefined behavior". The behavior of your code is not defined: the language says nothing about what should happen, so anything can happen, including nothing visible at all. The term "undefined behavior" is easy to search for and has a Wikipedia article; you can learn more from Undefined, unspecified and implementation-defined behavior.


What is the difference between local and global arrays for memory allocation in C?

The difference is in scope (global arrays can be accessed from anywhere) and lifetime (global arrays are alive for the duration of the program, while local arrays are alive until the end of the block). See https://en.cppreference.com/w/c/language/scope and https://en.cppreference.com/w/c/language/lifetime .

In this specific case, the difference also changes the code the compiler generates in some way. The only way to find out how is to inspect the generated machine code.

What is the difference between the two declarations?

In the first case, variables are defined at file scope and one is uint16.

In the second, both are uint8 and are defined at block scope.

Do not use uint8 and uint16; use the standard uint8_t and uint16_t types from stdint.h instead. https://en.cppreference.com/w/c/types/integer

How is an array stored in memory in this program?

Consecutive array elements are guaranteed to be allocated in consecutive adjacent memory locations.

There is no relation between these two arrays. They are unrelated to each other, except they are defined close in the source code.

What happened here?

Your code is invalid, and the compiler is still generating code for it. In the first case it "happened" that the compiler generated instructions that overwrote the other array; in the second case it did not happen. You can only inspect the machine code generated by the compiler and go through it instruction by instruction to find out what really happens.

KamilCuk