4

I know that when a program first starts, it incurs a lot of page faults, since its code is not yet in memory and must be loaded from disk.

What happens when a program exits? Does the binary stay in memory? Would subsequent invocations of the program find that the code is already in memory and thus not have page faults (assuming nothing runs in between and pages stuff out to disk)?

It seems like the answer is no, based on some experiments on my Linux machine. I ran the same program over and over again and observed the same number of page faults every time. It's a relatively quiet machine, so I doubt anything is getting paged out between invocations. So why is that? Why doesn't the executable get to stay in memory?
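
For reference, here is roughly how I counted the faults in each run (a minimal sketch; the same counters can also be read with `/usr/bin/time -v`):

```c
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    /* ... the program's actual work would go here ... */

    /* Report how many page faults this process took.
       ru_minflt: faults served without disk I/O (e.g. from the page cache);
       ru_majflt: faults that required reading from disk. */
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) == 0)
        fprintf(stderr, "minor faults: %ld, major faults: %ld\n",
                ru.ru_minflt, ru.ru_majflt);
    return 0;
}
```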

user3240688
  • You need to check your system's `caching` capability. I highly suspect what you're saying is not correct! Maybe there is an alteration in system settings. – Am_I_Helpful Apr 18 '15 at 04:09

4 Answers

3

There are two things to consider here:

1) The contents of the executable file are likely kept in the OS disk cache (the page cache). While that data is still in the cache, every read for it hits the cache, and the OS honors the request without re-reading the file from disk (a sketch after these two points shows one way to check this on Linux).

2) When a process exits, the OS unmaps every memory page mapped to a file and frees its memory (in general, it releases every resource allocated by the process, including other resources such as sockets, and so on). Strictly speaking, the physical memory may be zeroed, but that is not strictly required (still, the security level of the OS may require zeroing a page that is no longer in use - Windows NT, 2000, XP, etc. probably do this; see *Does Windows clear memory pages?*). Another invocation of the same executable creates a brand-new process that maps the same file into memory, but the first access to those pages still triggers page faults because, in the end, it is a new process with a different memory mapping. So yes, the page faults occur, but they are a lot cheaper for the second instance of the same executable than for the first.

Of course, all of this applies only to the read-only parts of the executable (the segments/modules containing code and read-only data).
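
Here is a minimal sketch of how to observe the first point on Linux: map a file without touching it and ask the kernel, via `mincore`, which of its pages are already resident in the page cache (the error handling is deliberately terse):

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the file without reading it, then ask which of its pages
       are already resident in the page cache. */
    void *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }

    long psz = sysconf(_SC_PAGESIZE);
    size_t npages = (st.st_size + psz - 1) / psz;
    unsigned char *vec = malloc(npages);

    if (mincore(map, st.st_size, vec) < 0) { perror("mincore"); return 1; }

    size_t resident = 0;
    for (size_t i = 0; i < npages; i++)
        resident += vec[i] & 1;  /* low bit set = page is in memory */

    printf("%zu of %zu pages resident in the page cache\n", resident, npages);
    return 0;
}
```

Run it on a binary right after that binary has executed, and you will typically see most of its pages still resident, even though no process has the file mapped anymore.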

One may also consider another scenario: forking. In this case, every page is marked as copy-on-write. When the first write occurs to a memory page, a hardware exception is triggered and intercepted by the OS memory manager. The OS determines whether the page in question is allowed to be written (e.g., whether it is the stack, heap, or any writable page in general) and, if so, allocates new memory and copies the original content before allowing the process to modify the page, in order to preserve the original data in the other process. And yes, there is still another case - shared memory, where the same physical memory is mapped into two or more processes. In that case, the copy-on-write flag is, of course, not set on the memory pages.
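
A quick way to see the copy-on-write behavior (a minimal illustrative sketch; the buffer and strings are made up):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    char *buf = malloc(4096);
    strcpy(buf, "parent data");

    if (fork() == 0) {
        /* Child: the first write to this page triggers a fault; the
           kernel copies the page, leaving the parent's copy intact. */
        strcpy(buf, "child data");
        printf("child sees:  %s\n", buf);
        _exit(0);
    }

    wait(NULL);
    printf("parent sees: %s\n", buf); /* still "parent data" */
    return 0;
}
```

Running this prints "child sees: child data" and then "parent sees: parent data", showing that the child's write landed on a private copy of the page.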

Hope this clarifies what is going on with the memory pages.

botismarius
  • "probably Windows NT, 2K, XP, etc, do that" No, they don't. Why do you expect Windows to do it and not other OSs? – m0skit0 Apr 18 '15 at 21:49
  • Maybe add a bit about major vs minor page faults. – Zan Lynx Apr 18 '15 at 23:09
  • @m0skit0: Yes, Windows does zero pages before allocating them to a different process. However that isn't particularly relevant to disk caching. – Ben Voigt Apr 18 '15 at 23:52
  • @botismarius - "So yes, the page faults occur, but they are a lot cheaper for the second instance of the same executable compared to the first." Why exactly is that? Is it because it'd find the code in the disk cache, and thus not require a fetch from disk? That's a minor page fault, right? Is it correct to say that the second instance suffers minor page faults while the first instance suffers major page faults? – user3240688 Apr 19 '15 at 01:32
  • @BenVoigt Thanks for the insight (if anyone is interested, [here is the reference](https://msdn.microsoft.com/en-us/library/windows/desktop/aa366887%28v=vs.85%29.aspx)) – m0skit0 Apr 19 '15 at 01:46
  • @user3240688 yes, the data (in this case, the code) is in the OS disk cache. I've not heard this terminology before (minor vs major page fault), but yes, the second is a lot cheaper. Still, I expect it's not extremely cheap, considering that a user-to-kernel mode transition and back is required. – botismarius Apr 19 '15 at 07:24
  • @m0skit0 - I'm sure other OSes do that too. It's just that I read this detail about Windows NT in particular a couple of years ago. Just think what a process without any privilege could do by simply allocating memory and obtaining the same physical memory that was previously allocated to a sensitive process - it could obtain some sensitive credentials. – botismarius Apr 19 '15 at 07:29
  • @m0skit0 - I've added a reference to the zero memory process in Windows OS. – botismarius Apr 19 '15 at 07:37
0

What I highly suspect is that those blobs of data are not promptly erased from RAM unless running code makes a new request for more memory. So what probably happens is that the OS reuses those still-cached bits on a subsequent execution. I think this is true for OS-managed resources - probably not for all of them, but for some.

0

Actually, most of your questions are highly implementation-dependent. But for the most widely used OSes:

> What happens when a program exits? Does the binary stay in memory?

Yes, but the memory blocks are marked as unused (and thus could be allocated to other processes).

> Would subsequent invocations of the program find that the code is already in memory and thus not have page faults (assuming nothing runs in between and pages stuff out to disk)?

No, those blocks are considered empty. Some/all blocks might have been overwritten already.

> Why doesn't the executable get to stay in memory?

Why would it stay? When a process is finished, all of its allocated resources are freed.

m0skit0
-1

One of the reasons is that one generally wants to clear everything out on a subsequent invocation, in case there was a problem in the previous one.

Plus, the writable data must be moved out.

That said, some systems do have mechanisms for keeping executables and static data in memory (possibly not Linux). For example, the VMS operating system allows the system manager to install executables and shared libraries so that they remain in memory (paging allowed). The same mechanism can be used to create writable shared memory, allowing interprocess communication and letting modifications to the memory remain in memory (possibly paged out).

user3344003