I have
x=(int *)malloc(sizeof(int)*(1));
but still I am able to read x[20]
or x[4]
.
How am I able to access those values? Shouldn't I be getting segmentation error while accessing those memory?
I have
x=(int *)malloc(sizeof(int)*(1));
but still I am able to read x[20]
or x[4]
.
How am I able to access those values? Shouldn't I be getting segmentation error while accessing those memory?
The basic premise is that of Sourav Ghosh's answer: accessing memory returned from malloc beyond the size you asked for is undefined behavior, so a conforming implementation is allowed to do pretty much anything, including happily returning bizarre values.
But given a "normal" implementation on mainstream operating systems on "normal" machines (gcc/MSVC/clang, Linux/Windows/macOS, x86/ARM) why do you sometimes get segmentation faults (or access violations), and sometimes not?
Pretty much every "regular" C implementation doesn't perform any kind of memory check when reading/writing through pointers1; these loads/stores get generally translated straight to the corresponding machine code, which accesses the memory at a given location without much regard for the size of the "abstract C machine" objects.
However, on these machines the CPU doesn't straight access the physical memory (RAM) of the PC, but a translation layer (MMU) is introduced2; whenever your program tries to access an address, the MMU checks to see whether anything has been mapped there, and if your process has permissions to write over there. In case any of those checks fail3, you get a segmentation fault and your process gets killed. This is why uninitialized and NULL
pointer values generally give nice segfaults: some memory at the beginning of the virtual address space is reserved unmapped just to spot NULL
dereferences, and in general if you throw a dart at random into a 32 bit address space (or even better, a 64 bit one) you are most likely to find zones of memory that have never been mapped to anything.
As good as it is, the MMU cannot catch all your memory errors for several reasons.
First of all, the granularity of memory mappings is quite coarse compared to most "run of the mill" allocations; on PCs memory pages (the smallest unit of memory that can be mapped and have protection attributes) are generally 4 KB in size. There is of course a tradeoff here: very small pages would require a lot of memory themselves (as there's a target physical address plus protection attributes associated to each page, and those have to be stored somewhere) and slow down the MMU operation3. So, if you access memory out of "logical" boundaries but still within the same memory page, the MMU cannot help you: as far as the hardware is concerned, you are still accessing valid memory.
Besides, even if you go outside of the last page of your allocation, it may be that the page that follows is "valid" as far as the hardware is concerned; indeed, this is pretty common for memory you get from the so-called heap (malloc
& friends).
This comes from the fact that malloc
, for smaller allocations, doesn't ask the OS for "new" blocks of memory (which in theory may be allocated keeping a guard page at both ends); instead, the allocator in the C runtime asks the OS for memory in big sequential chunks, and logically partitions them in smaller zones (usually kept in linked lists of some kind), which are handed out on malloc
and returned back by free
.
Now, when in your program you step outside the boundaries of the requested memory, you probably don't get any error as:
the memory chunk you are using isn't near a page boundary, so your out-of-bounds read doesn't trigger an access violation;
even if it was at the end of a page, the page that follows is still mapped, as it still belongs to the heap; it may either be memory that has been given to some other code of your process (so you are reading data of some unrelated part of your code), or a free memory zone (so you are reading whatever garbage happened to be left by the previous owner of the block when it free
d it), or a zone used by the allocator to keep its bookkeping data (so you are reading parts of such data).
In all these cases except for the "free block" one, even if you were to write there you wouldn't get a segmentation fault, but you could corrupt unrelated data or the data structures of the heap (which generally results in crashes later, as the allocator finds inconsistencies in its data).
Notes
No, accessing invalid memory is undefined behavior, and segmantation fault is one of the many side effects of UB. It is not guaranteed.
That said,
malloc()
by checking the returned pointer against NULL
before using the returned pointer.