
I am not quite sure which work is done by the CPU and which is done by the OS when a page fault occurs. That's why I'm asking the following questions.

Consider a single-core CPU with several processes running. When a page fault occurs, the OS tries to fetch the required page from disk into RAM, which takes a long time. During this period, can the CPU keep executing? Or does the CPU have to wait until the required page is loaded into RAM?

If the CPU can keep executing without waiting for the required page, then thrashing may occur when there are too many processes. At some point, most of the instructions the CPU executes will cause page faults, so most of the time will be spent waiting for the OS to load pages from disk into RAM. That's why thrashing occurs. Is my understanding correct?

Thanks in advance.


Update: this website describes thrashing very well.

Ethan L.

3 Answers


The CPU doesn't know that it's "in" a page fault. CPUs aren't recursive!

When a 32-bit x86 CPU (for example) encounters a page fault, here's what it does (slightly simplified):

  • Set the value of CR2 to the address which caused the page fault.
  • Look at the Interrupt Descriptor Table and some other tables and find the address of the page fault handler (new CS, new EIP) and the kernel stack (new SS, new ESP).
  • Set the values of CS, EIP, SS, and ESP to the ones it just read.
  • Push the old SS, old ESP, EFLAGS, old CS and old EIP onto that new stack, followed by an error code describing the fault.
  • Update the flags to say we're now in kernel mode.
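As a rough illustration of the steps listed above, here is a toy Python model of the fault entry. It is only a sketch: the register values, the `handler` dictionary standing in for the table lookup, and the `mode` field standing in for the privilege change are all made up for the example, and real hardware does this without running any software.

```python
def enter_page_fault(cpu, kernel_stack, fault_addr, handler):
    """Mimic the hardware steps; mutates cpu and kernel_stack in place."""
    cpu["CR2"] = fault_addr  # record the address that caused the fault
    saved = ("SS", "ESP", "EFLAGS", "CS", "EIP")
    old = {reg: cpu[reg] for reg in saved}
    # Switch to the handler's code and the kernel stack (hypothetical values
    # that the real CPU would read from its descriptor tables).
    cpu["CS"], cpu["EIP"] = handler["CS"], handler["EIP"]
    cpu["SS"], cpu["ESP"] = handler["SS"], handler["ESP"]
    # Push the interrupted context so the kernel can resume it later if it wants.
    for reg in saved:
        kernel_stack.append((reg, old[reg]))
        cpu["ESP"] -= 4  # each 32-bit push moves the stack pointer down 4 bytes
    cpu["mode"] = "kernel"  # simplification: one field for "now in kernel mode"


# Hypothetical register contents, chosen only for illustration.
cpu = {"CS": 0x1B, "EIP": 0x08048000, "SS": 0x23, "ESP": 0xBFFFF000,
       "EFLAGS": 0x202, "CR2": 0, "mode": "user"}
handler = {"CS": 0x08, "EIP": 0xC0100000, "SS": 0x10, "ESP": 0xC0200000}
kernel_stack = []
enter_page_fault(cpu, kernel_stack, fault_addr=0x0804A123, handler=handler)
```

After the call, `cpu` describes the handler's context and `kernel_stack` holds everything needed to get back, which is exactly the data the kernel is free to use or ignore.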

That's all it does. Now, there is some data on the stack that the kernel uses when it wants to make the CPU go back to what it was doing before the page fault happened. But the kernel isn't obligated to use that data. It could go back somewhere entirely different, or it could never go back. It's up to the kernel. The CPU doesn't care.

A typical kernel will first save all the other registers (important!), look at the address, decide where to get the page, tell the disk to start fetching the page, make a note that the process is stopped because of a page fault, and then go and do something entirely different until the data comes back from the disk. It might run a different process, for example. If there are no processes left to run, it might turn off the CPU (yes, really).

Eventually the data comes back from the disk. The kernel sees that a process is waiting for that data because of a page fault, updates the page table so the process can see the data, and restores all the registers, including SS, ESP, EFLAGS, CS, and EIP. Now the CPU is doing whatever it was doing before: the saved EIP points at the instruction that faulted, so that instruction simply runs again.
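To make the "go and do something else while the disk works" part concrete, here is a minimal scheduler simulation in Python. It is a sketch under invented assumptions: `simulate`, the `"work"`/`"fault"` step names, the round-robin order, and the fixed `disk_latency` are all hypothetical, and a `"fault"` step simply stands for an instruction that needed a page from disk.

```python
from collections import deque


def simulate(programs, disk_latency=3):
    """programs maps a process name to a non-empty list of 'work'/'fault' steps."""
    ready = deque(programs)               # runnable processes, round-robin order
    pc = {name: 0 for name in programs}   # saved position of each process
    blocked = {}                          # name -> tick at which its page arrives
    log = []
    tick = 0
    while ready or blocked:
        # Disk completion: wake every process whose page has arrived.
        for name in [n for n, due in blocked.items() if due <= tick]:
            del blocked[name]
            if pc[name] < len(programs[name]):
                ready.append(name)
        if ready:
            name = ready.popleft()
            step = programs[name][pc[name]]
            pc[name] += 1
            log.append(f"{name}:{step}")
            if step == "fault":
                # Start the disk read, mark the process blocked, switch away.
                blocked[name] = tick + disk_latency
            elif pc[name] < len(programs[name]):
                ready.append(name)
        else:
            log.append("idle")            # nothing runnable: the CPU can halt
        tick += 1
    return log


two = simulate({"A": ["work", "fault", "work"],
                "B": ["work", "work", "work", "work"]})
one = simulate({"A": ["fault", "work"]})
```

With two processes, B fills the ticks during which A waits for the disk, so the log contains no `idle` entries. With A alone, the log is `["A:fault", "idle", "idle", "A:work"]`: the CPU has nothing to do until the page arrives.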

The key point to notice is: the CPU only cares what's in its registers right now! It doesn't have a long-term memory. If you save the register values somewhere, you can make it stop doing whatever it was doing, and resume it later as if nothing ever happened. For example, there is absolutely no requirement that you have to return from function calls in the order they happened. The CPU doesn't care if you have a function that returns twice, for example (see setjmp), or if you have two coroutines and calling yield in one coroutine causes yield to return in the other one. You don't have to do things in stack order like you do in C.
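Python generators give a small, language-level analogue of this idea. In the sketch below (the `counter` function and the `trace` list are made up for illustration), each generator keeps its own saved state, and the driver loop decides which one to resume, regardless of the order in which they were suspended. This is not how a CPU does it, only a picture of "save the state, resume later".

```python
def counter(name, start, trace):
    """A coroutine-like generator: records a value, then suspends itself."""
    n = start
    while True:
        trace.append(f"{name}:{n}")
        n += 1
        yield  # suspend here; the local variable n is saved with the generator


trace = []
a = counter("a", 0, trace)
b = counter("b", 100, trace)
for _ in range(3):
    next(a)  # resume a exactly where it stopped
    next(b)  # then resume b; neither "returns" in stack order
```

Each `next` call picks up a generator exactly where it left off, so `trace` interleaves the two counters even though neither one ever finishes.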

user253751

In a cooperative multitasking OS, the OS cannot initiate a context switch, so the CPU must wait for the page to be brought in.

Modern systems are preemptive multitasking systems. In this case the OS will most likely initiate a context switch and so other threads/processes will run on the CPU.

Thrashing is a concern when the amount of memory used far exceeds the capacity of the RAM. "Download more RAM" is a meme for a reason.

bolov

The CPU can keep on executing.

The CPU cannot, however, carry on execution of the thread that incurred the fault. That thread needs the fault to be resolved before the very next instruction can be executed. That is, it must block on the fault.

That many threads/processes may be blocked on fault handling is not in itself thrashing. Thrashing occurs when, in order to bring a page in, there are insufficient free page frames, so it is necessary to write a page out. But then, when the OS tries to find another thread to run, it picks the thread that owned a page it just wrote out, so it has to fault that page back in.

Thrashing is therefore a symptom of insufficient available real memory.
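The effect is easy to reproduce in a small simulation. The sketch below counts page faults for a sequence of page references, assuming a least-recently-used (LRU) replacement policy; the policy, the `count_faults` name, and the reference pattern are assumptions chosen for illustration, since nothing above names a particular policy.

```python
from collections import OrderedDict


def count_faults(references, frames):
    """Count page faults for a reference sequence under LRU with `frames` frames."""
    resident = OrderedDict()  # pages in RAM, least recently used first
    faults = 0
    for page in references:
        if page in resident:
            resident.move_to_end(page)        # hit: mark as most recently used
        else:
            faults += 1                       # miss: the page must be brought in
            if len(resident) >= frames:
                resident.popitem(last=False)  # no free frame: evict the LRU page
            resident[page] = True
    return faults


refs = [0, 1, 2, 3] * 25             # 100 references cycling over 4 pages
enough = count_faults(refs, frames=4)
too_few = count_faults(refs, frames=3)
```

With four frames, only the first touch of each page faults (`enough` is 4). With three frames, the page that was just evicted is always the next one needed, so every single reference faults (`too_few` is 100). Being short by just one frame turns almost no faults into nothing but faults.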

user14644949