
I have the following code:

#include <iostream>
#include <stdlib.h>
#include <stdio.h>

int main() {
    int data = 0;
    char *byte = (char *)malloc(sizeof(char)*1000000000);
    byte[999999999] = 'a';
    printf("%c",byte[999999999]);
    scanf("%d",&data);
    return 0;
}

Comparing memory usage before the program starts and just before the scanf, I would expect it to increase by 1 GB. Why isn't this happening?

Edit: I added

byte[999999999] = 'a';
printf("%c",byte[999999999]);

The program outputs a.

HAL9000
  • 3
    presumably your OS overcommits memory. Try writing to the allocated block and there's a chance it will fail. –  Feb 05 '14 at 12:42
  • 9
    And please, please, **please** don't mix C and C++ so badly, and [don't cast the return value of `malloc()`](http://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc/605858#605858)! –  Feb 05 '14 at 12:43
  • 1
    @H2CO3 if I remove cast it simply doesn't compile in C++. – HAL9000 Feb 05 '14 at 12:47
  • 4
    @HAL9000 You need the cast in C++. You normally don't use it in C. Btw, `sizeof(char)` is always 1 in C. – Juri Robl Feb 05 '14 at 12:48
  • 4
    @HAL9000 That's correct. This is why you don't use `malloc()` in C++. (There are other reasons too.) The code you have is not C++, it's C, so there's no reason to try compiling it with a C++ compiler. They are different languages. You don't often try to run Python code in a JavaScript interpreter, do you. –  Feb 05 '14 at 12:48
  • @H2CO3 Actually my C++ compiler also compiles C code and I find no reason not to do this. Moreover, Advanced Linux Programming taught me to use a cast when using malloc. – HAL9000 Feb 05 '14 at 13:00
  • 2
    @HAL9000 "Actually my C++ compiler also compiles C code" - either you think that C++ compilers compile C code (which is wrong), or your compiler has a true C mode, in which case you should be using that mode, and in which mode it **will** compile without the cast. The "advanced whatever taught me to cast" argument is downright bullshit. Did you even read the answer I linked to? It's clearly described there why it is **wrong** to include the cast. –  Feb 05 '14 at 13:02
  • @H2CO3 I read that answer so many times and every time I don't find enough of an argument not to use it. It can be judged USELESS but not wrong; moreover, a C++ compiler needs it. – HAL9000 Feb 05 '14 at 13:05
  • 2
    @HAL9000 That's why you don't use a C++ compiler to compile C code. C is **not** a subset of C++. – Juri Robl Feb 05 '14 at 13:06
  • 2
    @HAL9000: Historically, the cast was a bad idea: if you forgot to include `<stdlib.h>`, then calling `malloc` would implicitly declare a function returning `int`. Without a cast, this would fail to compile; with one, it would misinterpret the return value as `int` and cast that to a pointer, leading to obscure bugs. (In modern C, without implicit declarations, the cast is merely unnecessary and ugly). – Mike Seymour Feb 05 '14 at 13:07
  • 1
    @HAL9000 So, for one, "the cast can hide errors" does not make it wrong, okay. The decreased readability doesn't make it wrong, okay. Now tell me, why would you ever use `malloc()` in C++? –  Feb 05 '14 at 13:07
  • @H2CO3 if you write some C code which does some function and in future you want to include it in C++ you use malloc. The point is not "why to use it", the point is "why to not use it", and the possibility of hiding errors is not a problem for my limited personal purposes. Industrially it could be a problem, for me it's not. – HAL9000 Feb 05 '14 at 13:12
  • 4
    @HAL9000 If you're a beginner and you ask help from people who are experienced, it's generally a good idea for you to be willing to learn. In this case, for instance, decide what you're writing (C *or* C++), then follow the appropriate practices and then build using the appropriate compiler. – Theodoros Chatzigiannakis Feb 05 '14 at 13:14
  • @TheodorosChatzigiannakis learning doesn't mean following ALL theories that are not universally accepted. I'm here to learn and when there is something to learn (as in the accepted answer) I'm happy to learn. I won't take as Holy Bible practices which are not universally considered good or necessary and which I don't consider bad. – HAL9000 Feb 05 '14 at 13:45
  • 3
    @HAL9000 There is no single practice that's universally considered good, because there will always be people who are clueless or simply don't care enough to follow even the most obvious advice. One very obvious kind of advice is compiling code written in *one* language with a compiler *of that language* - not with a compiler of another language that superficially resembles the one of the source code. – Theodoros Chatzigiannakis Feb 05 '14 at 13:49
  • 2
    @HAL9000 Generally, because the world of software is a vast ecosystem, pieces of software need to work together and the balance becomes somewhat delicate, it's always good to at least give the benefit of the doubt to people who seem to be more experienced than you. Anyone who has tutored people (especially in computer programming) will tell you that people who think that they're too clever to learn, never learn. Especially if you're a beginner, I think it'd be better for you in the long run to be willing to learn and when you're more experienced, you can stop applying whatever you don't like. – Theodoros Chatzigiannakis Feb 05 '14 at 13:54
  • @TheodorosChatzigiannakis who defines the amount of experience necessary to stop applying a practice that one applies only because someone thinks it's better? I've been programming in C for 13 years and in C++ for 2 years and I never had a problem using a cast with malloc or malloc in C++. It's ugly? Not for me. I simply prefer to see a useless cast instead of one less. When my company tells me not to use it, I will stop using it. It's a fruitless discussion, to me. There are MANY OTHER things to learn. – HAL9000 Feb 05 '14 at 14:08

3 Answers


By default, Linux allocates physical memory lazily, the first time it's accessed. Your call to malloc will allocate a large block of virtual memory, but there are not yet any pages of physical memory mapped to it. The first access to an unmapped page will cause a fault, which the kernel will handle by allocating and mapping one or more pages of physical memory.

To allocate all the physical memory, you'll have to access at least one byte on every page; or you could bypass malloc and go straight to the operating system with something like

mmap(NULL, 1000000000, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);

Note the use of MAP_POPULATE to populate the page tables, i.e. to allocate physical memory, immediately. According to the man page, this flag is supported for private mappings only since Linux 2.6.23.
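The "access at least one byte on every page" approach could look roughly like the sketch below. The function name touch_pages and the 4096-byte fallback page size are assumptions made for illustration; they are not part of the original code.

```c
#include <stddef.h>
#include <unistd.h>

/* Write one byte to every page of buf so that the kernel has to back
 * each page with physical memory. Returns the number of pages touched.
 * touch_pages is a hypothetical helper name. */
size_t touch_pages(char *buf, size_t len)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t page = ps > 0 ? (size_t)ps : 4096; /* fallback size is an assumption */
    volatile char *p = buf; /* volatile keeps the stores from being optimised away */
    size_t touched = 0;

    for (size_t off = 0; off < len; off += page) {
        p[off] = 1;
        touched++;
    }
    return touched;
}
```

In the program from the question, calling touch_pages(byte, 1000000000) after a successful malloc should make the resident memory grow by roughly 1 GB, provided the system has that much memory available.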

Mike Seymour

Most (all?) Linux systems overcommit memory. Depending on the overcommit strategy in use, this means that malloc practically never fails, even if you reserve ridiculous amounts of memory. It can still fail on some occasions if you reserve far too much, for example more than the available virtual memory.

You don't get real memory addresses, only a pointer which is promised to point to enough memory for you to use. The system allocates the memory when you try to use it, so you have to write to it to signal to the system that you really want it.

That system is used because many programs don't need the memory they reserve or need it at another time.
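Which overcommit strategy a machine uses can be checked by reading /proc/sys/vm/overcommit_memory. The following is a minimal sketch assuming a Linux system with procfs mounted; the function names overcommit_mode and overcommit_name are made up for this example.

```c
#include <stdio.h>

/* Returns the value stored in /proc/sys/vm/overcommit_memory (0, 1 or 2),
 * or -1 if the file cannot be read, e.g. when not running on Linux. */
int overcommit_mode(void)
{
    int mode = -1;
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}

/* Human-readable description of the three documented modes. */
const char *overcommit_name(int mode)
{
    switch (mode) {
    case 0:  return "heuristic overcommit (the default)";
    case 1:  return "always overcommit";
    case 2:  return "strict accounting, no overcommit";
    default: return "unknown";
    }
}
```

With mode 2, a 1 GB malloc can return a null pointer up front instead of succeeding and failing later when the pages are touched.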

After your edit you access only one page, so only one page gets backed by physical memory. You need to access every page to get all the memory.

Juri Robl
  • 1
    Are you sure? `malloc` never fails? If it cannot promise you memory, I think it does fail. Also, it has to reserve the memory: what would happen if it promised over 50% of it to each of two processes, and then both of them actually used it? The OS does not "hog" physical memory upon a `malloc` call. But it can fail, and it does occasionally. – luk32 Feb 05 '14 at 12:49
  • 1
    It never fails on those systems. If there is not enough memory the system starts to kill other processes to free memory. (OOM Killer) Obviously it depends on your configuration, on windows it will fail if it can't promise the memory for example. – Juri Robl Feb 05 '14 at 12:51
  • 1
    I think when you allocate and commit the whole virtual memory on a system and then call `malloc`, it will fail. Is that false? I am curious, and do not have any machine at hand that I can "break". I would appreciate it if you know the answer off the bat, because I saw `malloc`s failing, maybe it was too long ago. Let's say you `malloc` something in a loop and write to it. Will it always pass and the system start killing processes, or will `malloc` start returning `0`? – luk32 Feb 05 '14 at 12:56
  • I think you are right, it shouldn't succeed if there is no virtual memory left. But that should depend on the commit strategy used. Either it starts to kill processes until yours gets killed, or it returns 0 at some point. – Juri Robl Feb 05 '14 at 13:00
  • 1
    I saw `malloc` actually return `0` only once, I think. It was a bug that executed a similar loop. And I fully agree that it is very hard to make it fail, and it is an extreme borderline case. But it is possible. I thought it was a bit easier, though. I thought that trying to reserve more than the whole virtual memory would also fail, on Linux too. – luk32 Feb 05 '14 at 13:05
  • 1
    I just had malloc(24ull*1024ull*1024ull*1024ull) return a null pointer on my system (16GiB ram, 16GiB swap) – PlasmaHH Feb 05 '14 at 13:18
  • 1
    And what about `calloc(SIZE_MAX, SIZE_MAX)`? – pablo1977 Feb 05 '14 at 14:00
  • Sorry I have no idea how that works in calloc. If there is a system call used to get zeroed memory, it's probably the same. – Juri Robl Feb 05 '14 at 14:04

malloc doesn't give you all 1000000000 bytes of physical memory when you ask. You only get the pages you have touched so far; the rest are assigned as you start accessing them (read/write).

On Linux, malloc requests expand the virtual address space immediately, but physical memory pages aren't assigned until you access them.
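The difference between virtual and physical usage can be observed from inside the process. Here is a rough sketch that reads the first two fields of /proc/self/statm (total program size and resident set size, both in pages); it assumes Linux with procfs, and the struct and function names are invented for illustration.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical container for the two numbers of interest. */
struct mem_usage {
    long virtual_bytes;  /* size of the virtual address space */
    long resident_bytes; /* part currently backed by physical memory */
};

/* Fills *out from /proc/self/statm. Returns 0 on success, -1 on error. */
int read_mem_usage(struct mem_usage *out)
{
    long size_pages = 0, resident_pages = 0;
    FILE *f = fopen("/proc/self/statm", "r");

    if (f == NULL)
        return -1;
    int n = fscanf(f, "%ld %ld", &size_pages, &resident_pages);
    fclose(f);
    if (n != 2)
        return -1;

    long page = sysconf(_SC_PAGESIZE);
    out->virtual_bytes = size_pages * page;
    out->resident_bytes = resident_pages * page;
    return 0;
}
```

Calling read_mem_usage before the malloc, right after it, and again after writing to every page should show virtual_bytes jumping at the malloc while resident_bytes only grows once the pages are written.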

Sunil Bojanapally