I have

x=(int *)malloc(sizeof(int)*(1));

but still I am able to read x[20] or x[4].

How am I able to access those values? Shouldn't I get a segmentation fault when accessing that memory?

Sourav Ghosh
v1234
  • Don't expect anything predictable when your program has undefined behavior. – R Sahu Jan 07 '19 at 06:27
  • There is no memory fencing in userspace. What you are accessing is memory allocated by some other userspace program. It will not cause your program to crash directly, but you will get unexpected results. – Soumya Kanti Jan 07 '19 at 06:29
  • @SoumyaKanti that's not how any modern operating system works... Different userspace programs live in different virtual address spaces, so he's definitely not accessing memory from another userspace program. More probably he's just accessing the next (possibly allocated, possibly not) block of his process' heap (or data belonging to the heap data structures). – Matteo Italia Jan 07 '19 at 06:33
  • Well, let's rephrase this - by some other part of the program. – Soumya Kanti Jan 07 '19 at 06:34
  • @SoumyaKanti yes, let's rephrase this as it's a completely different thing. Also, it's not even necessarily allocated/used memory. – Matteo Italia Jan 07 '19 at 06:35
  • The thing is, when someone is learning a programming language it is not essential to know how the underlying OS behaves - I did not mean that one does not need to know, but that is an advanced stage. The first answer to this question may not be absolutely correct, or always correct - but will give the idea what may happen under the hood. Having said that, I actually studied OSes which do not do the memory fencing and leave it to the developer to handle that. – Soumya Kanti Jan 07 '19 at 06:43
  • As a practical explanation of why it often will not fault, most *current, mainstream* hardware architectures can only enforce access permission on an entire `page` of memory. If a process has write access to any part of a page, then it has write access to the *entire* page, and a page may be something like 4096 bytes so the example given may not cross a page boundary. But that is the *computer* - as far as what the C standard says, what you are doing is undefined and a unicorn galloping out of your monitor is no more illegitimate than any other result. – Chris Stratton Jan 07 '19 at 07:02
  • @SoumyaKanti: the point is, if you are going to give details, you have to give _correct_ details, or not give them at all; I've seen plenty of people with strange confusion in their minds due to "white lies" being "helpfully" handed out. Good explanations for beginners are ones that may simplify on difficult aspects, but remain perfectly correct even in the eyes of a professional. As for the OSes with no memory fencing between processes, that's extremely old or specialized stuff, most probably not what OP is using given that he's talking about segfaults. – Matteo Italia Jan 07 '19 at 07:07

2 Answers

The basic premise is that of Sourav Ghosh's answer: accessing memory returned from malloc beyond the size you asked for is undefined behavior, so a conforming implementation is allowed to do pretty much anything, including happily returning bizarre values.

But given a "normal" implementation on mainstream operating systems on "normal" machines (gcc/MSVC/clang, Linux/Windows/macOS, x86/ARM) why do you sometimes get segmentation faults (or access violations), and sometimes not?

Pretty much every "regular" C implementation doesn't perform any kind of memory check when reading/writing through pointers [1]; these loads/stores are generally translated straight to the corresponding machine code, which accesses the memory at a given location without much regard for the size of the "abstract C machine" objects.

However, on these machines the CPU doesn't access the physical memory (RAM) of the PC directly; a translation layer (the MMU) sits in between [2]. Whenever your program tries to access an address, the MMU checks whether anything has been mapped there and whether your process has permission to perform that access. If any of those checks fails [3], you get a segmentation fault and your process gets killed. This is why uninitialized and NULL pointer values generally give nice segfaults: some memory at the beginning of the virtual address space is deliberately left unmapped just to catch NULL dereferences, and in general if you throw a dart at random into a 32-bit address space (or, even better, a 64-bit one) you are most likely to hit zones of memory that have never been mapped to anything.

As good as it is, the MMU cannot catch all your memory errors for several reasons.

First of all, the granularity of memory mappings is quite coarse compared to most "run of the mill" allocations; on PCs memory pages (the smallest unit of memory that can be mapped and have protection attributes) are generally 4 KB in size. There is of course a tradeoff here: very small pages would require a lot of memory themselves (as there's a target physical address plus protection attributes associated to each page, and those have to be stored somewhere) and slow down the MMU operation [4]. So, if you access memory out of "logical" boundaries but still within the same memory page, the MMU cannot help you: as far as the hardware is concerned, you are still accessing valid memory.

Besides, even if you go outside of the last page of your allocation, it may be that the page that follows is "valid" as far as the hardware is concerned; indeed, this is pretty common for memory you get from the so-called heap (malloc & friends).

This comes from the fact that malloc, for smaller allocations, doesn't ask the OS for "new" blocks of memory (which in theory could be allocated with a guard page at both ends); instead, the allocator in the C runtime asks the OS for memory in big sequential chunks and logically partitions them into smaller zones (usually kept in linked lists of some kind), which are handed out by malloc and returned by free.

Now, when in your program you step outside the boundaries of the requested memory, you probably don't get any error as:

  • the memory chunk you are using isn't near a page boundary, so your out-of-bounds read doesn't trigger an access violation;

  • even if it were at the end of a page, the page that follows is still mapped, as it still belongs to the heap; it may either be memory that has been given to some other code of your process (so you are reading data of some unrelated part of your code), or a free memory zone (so you are reading whatever garbage happened to be left by the previous owner of the block when it freed it), or a zone used by the allocator to keep its bookkeeping data (so you are reading parts of such data).

    In all these cases, even a write would not give you a segmentation fault; but, except in the "free block" case, you could corrupt unrelated data or the data structures of the heap (which generally results in crashes later, when the allocator finds inconsistencies in its data).


Notes

  1. Modern compilers do, however, provide special instrumented builds that trap some of these errors; gcc and clang, in particular, provide the so-called "address sanitizer".
  2. This makes it possible to introduce transparent paging (swapping out to disk memory zones that aren't actively used when physical memory runs low) and, most importantly, memory protection and address space separation (when a user-mode process is running, it "sees" a full virtual address space containing only its own data, and nothing from other processes or the kernel).
  3. Provided the failure wasn't set up on purpose by the operating system so that it is notified when the process tries to access memory that has been swapped out.
  4. Given that each memory access needs to go through the MMU, the mapping must be very fast, so the most used page mappings are kept in a cache; if you make the pages very small and the cache still holds the same number of entries, the cache effectively covers a smaller memory range.
Community
Matteo Italia

No, accessing invalid memory is undefined behavior, and a segmentation fault is just one of the many possible side effects of UB. It is not guaranteed.

That said,

  • Always check for the success of the malloc() by checking the returned pointer against NULL before using the returned pointer.
  • Please see this: Do I cast the result of malloc?
Sourav Ghosh