I ran the following C code on Linux (Ubuntu-22.04 x86-64):

#include <malloc.h>
#include <unistd.h>

int main() {
  char* s = malloc(8 * 1024 * sizeof(char));
  for (int i = 0; i < 10 * 1024; ++i) {  // out-of-bounds access once i >= 8 * 1024
    s[i] = i; // Write to s
  }
  usleep(1);  // Intervening call, so the compiler is less likely to optimize s away
  free(s);    // Crash
}

But the program doesn't crash at the assignment s[i] = i; instead, it crashes at free(s):

double free or corruption (!prev)

However, if I read from s instead of writing to it, no error occurs:

#include <malloc.h>
#include <stdio.h>
#include <unistd.h>

int main() {
  char* s = malloc(8 * 1024 * sizeof(char));
  for (int i = 0; i < 10 * 1024; ++i) {
    printf("%d\n", (int)s[i]); // Read from s
  }
  usleep(1);
  free(s);    // No errors
}

Furthermore, on Windows it crashes right at the assignment s[i] = i;, which is much easier to understand (page fault).

So how does Linux implement free? What does the program do inside free?

Diff view on Compiler Explorer:

https://gcc.godbolt.org/z/1evavP8sW

Timothy Liu
  • https://stackoverflow.com/questions/851958/where-do-malloc-and-free-store-allocated-sizes-and-addresses could also be interesting for you – Support Ukraine Jan 10 '23 at 10:44
  • It doesn't check for out-of-bounds access at all. What it does do is use memory immediately before and after the allocated block to store info about the heap: free and in-use blocks, their sizes, and such. So when you write off the end of an allocated block, that info gets corrupted, and free aborts when it finds inconsistent or invalid data in the heap's internal data structures – Chris Dodd Jan 10 '23 at 10:45
  • Ok so it's just a boring array out of bounds bug. When writing out of bounds you could corrupt the heap, corrupt something else, or if lucky the OS/MMU comes to smack you on the fingers. It is _undefined behavior_. [How dangerous is it to access an array out of bounds?](https://stackoverflow.com/questions/15646973/how-dangerous-is-it-to-access-an-array-out-of-bounds) – Lundin Jan 10 '23 at 11:08
  • Ok that seems right. Then the out-of-bounds access in this example doesn't actually access invalid memory. It just modifies some information used for heap memory management, right? – Timothy Liu Jan 10 '23 at 11:32
  • @Timothy At C-level it is an invalid access. At HW level it happened to be a legal access on your system. On other systems it could also have been an illegal access at HW level. In any case... due to the C-level invalid access, your program has undefined behavior so anything may happen (including appearing to work like your second example) – Support Ukraine Jan 10 '23 at 11:43
  • OT: Try to change `10 * 1024;` to `1000 * 1024;` Maybe that will give another crash – Support Ukraine Jan 10 '23 at 11:51
  • Think of it like this: `free` checks to make sure you haven't done anything wrong, in more or less the same way that you check to see that no one has dropped a grand piano on you as you walk down the sidewalk. You *don't* check (well, maybe some extremely paranoid people do look up every few seconds), so if the unthinkable happens, and someone *does* drop a grand piano, it's catastrophic and completely unpredictable. And `free` doesn't check either: it just goes about its business, accessing its hidden data structures to decide what to do, and if those have gotten corrupted, *Crash!* – Steve Summit Jan 10 '23 at 13:02

1 Answer

free does not check whether your program accesses data out of bounds. It checks the data structures that the memory management routines (malloc, realloc, free, and related routines) use to keep track of memory allocations and available memory. When it finds evidence that those data structures have been corrupted, it reports an error.

When you read outside the array bounds, the reads did not corrupt those data structures, so free did not observe any problems. When you wrote outside the array bounds, the writes corrupted those data structures, so free observed problems.
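
As a rough illustration of this difference, here is a toy model in C. It is not glibc's implementation: the names toy_header, toy_malloc, toy_free, toy_demo and the MAGIC marker are made up for this sketch. A small header sits in front of each block inside one static arena, and toy_free only inspects those headers. Because every access stays inside the arena array, the code is well-defined C, unlike the program in the question.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model only. glibc's real chunk layout and checks are different. */
#define ARENA_SIZE 256u
#define MAGIC 0xC0FFEEu   /* hypothetical marker stored in every header */

typedef struct {
    unsigned magic;  /* equals MAGIC while the header is intact */
    unsigned size;   /* payload size in bytes */
} toy_header;

static unsigned char arena[ARENA_SIZE];  /* all blocks live in this one array */
static size_t arena_used = 0;

/* Carve a header plus `size` payload bytes out of the arena. */
static void *toy_malloc(unsigned size) {
    if (size > ARENA_SIZE ||
        sizeof(toy_header) + size > ARENA_SIZE - arena_used) {
        return NULL;
    }
    toy_header h = { MAGIC, size };
    memcpy(arena + arena_used, &h, sizeof h);
    void *payload = arena + arena_used + sizeof h;
    arena_used += sizeof h + size;
    return payload;
}

/* Only checks the bookkeeping; it releases nothing.
   Returns 0 if the headers look sane, -1 if corruption is detected. */
static int toy_free(void *p) {
    size_t off = (size_t)((unsigned char *)p - arena);
    toy_header h, next;
    memcpy(&h, arena + off - sizeof h, sizeof h);
    if (h.magic != MAGIC || h.size > arena_used - off) return -1;
    size_t end = off + h.size;
    if (end < arena_used) {                  /* another block follows */
        memcpy(&next, arena + end, sizeof next);
        if (next.magic != MAGIC) return -1;  /* neighbour's header was overwritten */
    }
    return 0;
}

/* Returns 1 if the read went unnoticed and the write was detected. */
int toy_demo(void) {
    arena_used = 0;
    unsigned char *a = toy_malloc(16);
    unsigned char *b = toy_malloc(16);
    assert(a != NULL && b != NULL);

    size_t span = 16 + sizeof(toy_header);  /* a's payload plus b's header */
    unsigned sum = 0;
    for (size_t i = 0; i < span; ++i) sum += a[i];  /* read past the end of a */
    int after_read = toy_free(a);

    for (size_t i = 0; i < span; ++i) a[i] = (unsigned char)i;  /* write past the end of a */
    int after_write = toy_free(a);

    printf("sum=%u after_read=%d after_write=%d\n", sum, after_read, after_write);
    return after_read == 0 && after_write == -1;
}
```

Calling toy_demo() from main shows the pattern from the question in miniature: the over-long read leaves the neighbouring header untouched, so the check passes, while the over-long write overwrites that header, and the problem surfaces only later, when the check runs.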

> But the program doesn't crash at the assignment s[i] = i

This is effectively happenstance of how memory happened to be arranged. General-purpose multi-user systems use hardware features to map memory and to protect processes from interfering with each other. In the Linux case you attempted, your out-of-bounds array accesses happened to go into memory that was mapped for your process, with both read and write permissions. In the Windows case you attempted, your out-of-bounds array access happened to go into memory that was not mapped with write permission for your process, so the hardware signaled a fault. The behavior you observed on Linux can also happen on Windows, with different combinations of memory locations and array indices, and the behavior you observed on Windows can also occur on Linux.
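
The role of page permissions can be seen in isolation with a small POSIX sketch, assuming Linux or a similar system. The names probe_write and probe_demo are hypothetical helpers for this example, not part of any library. The code maps two pages, makes the second one inaccessible with mprotect, and has a child process write one byte on either side of the boundary. The write into the accessible page is expected to go through silently, while the write into the protected page is expected to kill the child with a signal, typically SIGSEGV on Linux.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper for this sketch. Maps two pages, removes all access
   from the second one, then lets a child process write a single byte either
   to the last byte of the first page (into_guard == 0) or to the first byte
   of the second page (into_guard != 0).
   Returns 0 if the child exited normally, the signal number if a signal
   killed it, or -1 if setup failed. */
int probe_write(int into_guard) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return -1;
    size_t len = 2 * (size_t)page;

    unsigned char *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return -1;
    if (mprotect(base + page, (size_t)page, PROT_NONE) != 0) {
        munmap(base, len);
        return -1;
    }

    int result = -1;
    pid_t pid = fork();
    if (pid == 0) {
        volatile unsigned char *p = base;
        p[into_guard ? (size_t)page : (size_t)page - 1] = 1;
        _exit(0);   /* reached only if the write did not fault */
    }
    if (pid > 0) {
        int status = 0;
        if (waitpid(pid, &status, 0) == pid) {
            if (WIFSIGNALED(status)) {
                result = WTERMSIG(status);
            } else if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
                result = 0;
            }
        }
    }
    munmap(base, len);
    return result;
}

/* Prints the outcome of both probes. */
void probe_demo(void) {
    int inside = probe_write(0);
    int guard = probe_write(1);
    assert(inside != -1 && guard != -1);  /* setup must have succeeded */
    printf("write inside mapped page: %d\n", inside);
    printf("write into protected page: %d (SIGSEGV is %d)\n", guard, SIGSEGV);
}
```

The child process is used only so that the parent survives the fault and can report it. Whether a stray write in a real program lands on a writable page or a protected one depends on how the allocator and the operating system happened to lay out memory, which is why the same bug can look different across systems.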

Eric Postpischil