Imagine:

  • you have a swap partition on a failing disk;
  • a process is idle and part of its memory gets swapped out to this partition;
  • after some time it wakes up and the kernel tries to load the swapped pages back into memory;
  • the kernel detects an unrecoverable read error.

I believe the kernel should crash the process in this scenario.

I haven't simulated it yet. I would like some answers (ideally with instructions on how to simulate it in Linux), and I will share my subsequent findings as well.

Maybe this scenario can explain a few crashes on some legacy systems with failing storage hardware.

UPDATE

Errors on a specific swap volume can easily be simulated using dmsetup with the error mapping target.
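As a rough illustration of the idea (not the exact commands used here), the sketch below builds a device-mapper table in which one sector range is served by the `error` target and the rest is passed through with the `linear` target. The function name `make_error_table`, the device `/dev/sdX2`, the mapping name `badswap` and the sector numbers are all placeholders chosen for the example.

```shell
# Print a device-mapper table for a volume of TOTAL 512-byte sectors on DEV.
# Sectors [BAD_START, BAD_START + BAD_LEN) fail every I/O ("error" target);
# all other sectors map straight through to DEV ("linear" target).
make_error_table() {
    dev=$1
    total=$2
    bad_start=$3
    bad_len=$4
    bad_end=$((bad_start + bad_len))

    # Healthy region before the bad range (omitted if the range starts at 0).
    if [ "$bad_start" -gt 0 ]; then
        echo "0 $bad_start linear $dev 0"
    fi

    # The failing region: any read or write here returns an I/O error.
    echo "$bad_start $bad_len error"

    # Healthy region after the bad range (omitted if it reaches the end).
    if [ "$bad_end" -lt "$total" ]; then
        echo "$bad_end $((total - bad_end)) linear $dev $bad_end"
    fi
}

# Possible use, as root, on a spare partition (all names are assumptions):
#   size=$(blockdev --getsz /dev/sdX2)
#   make_error_table /dev/sdX2 "$size" 2048 8 | dmsetup create badswap
#   mkswap /dev/mapper/badswap
#   swapon /dev/mapper/badswap
```

The bad range is kept away from the first sectors so that the swap header written by `mkswap` stays readable; only pages that land in the error-mapped range should fail on swap-in.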

The only thing I haven't looked into in depth is how to control loading, swapping out and re-loading the virtual memory of a specific process into a specific error-mapped part of the swap volume. Basically, I need to prevent any process other than the one under simulation from using this erroneous swap.
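This does not solve the isolation problem, but one small check that may help during such an experiment is confirming that the process under test actually has pages in swap. A minimal sketch, assuming a Linux kernel that reports the `VmSwap` field in `/proc/<pid>/status`; the helper name `swapped_kb` and the PID are placeholders.

```shell
# Print the amount of swapped-out memory (in kB) recorded in a
# /proc/<pid>/status style file passed as the first argument.
swapped_kb() {
    awk '/^VmSwap:/ { print $2 }' "$1"
}

# Possible use (1234 is a placeholder PID):
#   swapped_kb /proc/1234/status
```

A non-zero value only shows that some of the process's pages are in swap; it does not tell which swap device or which sectors hold them.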

uvsmtid

1 Answer

The "Poison" patch should handle your case: https://lwn.net/Articles/348886/

Dirty pages in the swap cache are handled in a delayed fashion. The dirty flag is cleared for the page and the page swap cache entry is maintained. On a later page fault the associated application will be killed.

Dima Tisnek
  • qarma, thanks for the suggestion. I looked into the article [HWPOISON patch](https://lwn.net/Articles/348886/). It deals with errors in physical memory rather than in virtual memory swapped to disk. The topic is closely related, but simulating errors as if they happened in physical memory does not test the mechanisms that handle errors in virtual memory swapped to disk. – uvsmtid Jun 08 '14 at 07:26