
Will the following code load the file's data into system memory, so that accessing the resulting pointer never blocks the thread?

auto ptr = MapViewOfFile(file_map, FILE_MAP_READ, high, low, size); // Map file to memory.
VirtualLock(ptr, size); // Returns BOOL, not the pointer. Lock the pages and wait for DMA transfer to finish?
int val0 = reinterpret_cast<int*>(ptr)[0]; // Will not block thread?
int val1 = reinterpret_cast<int*>(ptr)[size / sizeof(int) - 1]; // Last int in the view. Will not block thread?

VirtualUnlock(ptr, size);
UnmapViewOfFile(ptr);

EDIT:

Updated after Damon's answer.

auto ptr = MapViewOfFile(file_map, FILE_MAP_READ, high, low, size);

#pragma optimize("", off)
char dummy;
for(int n = 0; n < size; n += 4096)
    dummy = reinterpret_cast<char*>(ptr)[n];
#pragma optimize("", on)

int val0 = reinterpret_cast<int*>(ptr)[0]; // Will not block thread?
int val1 = reinterpret_cast<int*>(ptr)[size / sizeof(int) - 1]; // Last int in the view. Will not block thread?

UnmapViewOfFile(ptr);
ronag
  • Are you asking what typically happens or what is absolutely guaranteed? And by "block a thread" do you mean that disk I/O is required? – David Schwartz Aug 15 '12 at 12:18
  • What typically happens. Yes, when disk I/O is required the thread is blocked. – ronag Aug 15 '12 at 13:16
  • This typically invokes the "Windows is not a real-time operating system" comment. You'll need to keep it locked. And it's next to impossible to get the optimizer not to remove those statements; you must use #pragma optimize to stop it being helpful. – Hans Passant Aug 15 '12 at 13:46
  • On the function, not inline inside the function body. – Hans Passant Aug 15 '12 at 13:59

1 Answer


If the region you lock fits within the ridiculously small default limit (VirtualLock can lock roughly as many pages as the process's minimum working set size allows), or if you have raised the working set size accordingly with SetProcessWorkingSetSize, then in theory yes. If you exceed that limit, VirtualLock simply fails (returns FALSE) and locks nothing.

(In practice, I've seen VirtualLock being rather... liberal... at interpreting what it's supposed to do as opposed to what it actually does, at least under Windows XP -- might be different under more modern versions)
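
To make the limit concrete, here is a hedged sketch of raising the working set before locking. The names `WorkingSet`, `grow_for_lock` and `lock_view` are made up for illustration; only `GetCurrentProcess`, `GetProcessWorkingSetSize`, `SetProcessWorkingSetSize`, `GetSystemInfo` and `VirtualLock` are real Win32 calls, and the Windows part is compiled only when `_WIN32` is defined. It assumes that growing both limits by the page-rounded lock size is enough, which I have not verified on every Windows version.

```cpp
#include <cstddef>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper type: working set limits in bytes.
struct WorkingSet {
    std::size_t min_bytes;
    std::size_t max_bytes;
};

// Hypothetical helper: grows both limits by lock_bytes rounded up to whole
// pages, so the minimum working set can hold the pages to be locked.
WorkingSet grow_for_lock(WorkingSet current, std::size_t lock_bytes,
                         std::size_t page_size = 4096) {
    std::size_t rounded = (lock_bytes + page_size - 1) / page_size * page_size;
    WorkingSet grown;
    grown.min_bytes = current.min_bytes + rounded;
    grown.max_bytes = current.max_bytes + rounded;
    return grown;
}

#ifdef _WIN32
// Raises the working set limits of the current process, then locks the view.
// Returns false if any step fails; the view then stays mapped but unlocked.
bool lock_view(void* view, std::size_t size) {
    HANDLE self = GetCurrentProcess();
    SIZE_T min_ws = 0, max_ws = 0;
    if (!GetProcessWorkingSetSize(self, &min_ws, &max_ws)) return false;

    SYSTEM_INFO info;
    GetSystemInfo(&info);
    WorkingSet grown = grow_for_lock({min_ws, max_ws}, size, info.dwPageSize);

    if (!SetProcessWorkingSetSize(self, grown.min_bytes, grown.max_bytes))
        return false;
    return VirtualLock(view, size) != 0;
}
#endif
```

With the view from the question this would be called as `lock_view(ptr, size)`, checking the result before relying on the pages being resident.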

I've tried similar things in the past, and I now simply touch every page that I want in RAM with a simple for loop (reading one byte per page). This leaves no questions open and works, with the sole possible exception that a page might in theory get swapped out again after being touched. In practice, this never happens (unless the machine is really, really low on RAM, and then it's OK for it to happen).
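
A minimal sketch of such a touch loop, assuming the base address is page-aligned (a mapped view is) and a 4096-byte default stride. The name `touch_pages` is hypothetical, not part of any Windows API. The bytes are summed and the sum is stored to a `volatile`, which is one way to keep the compiler from discarding the reads:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: reads one byte from every page of [base, base + size)
// so the OS has to fault each page in. Assumes base is page-aligned.
// Returns the number of pages touched.
std::size_t touch_pages(const void* base, std::size_t size,
                        std::size_t page_size = 4096) {
    assert(page_size != 0);
    const unsigned char* bytes = static_cast<const unsigned char*>(base);
    unsigned sum = 0;
    std::size_t touched = 0;
    for (std::size_t offset = 0; offset < size; offset += page_size) {
        sum += bytes[offset];  // this read triggers the page fault, if any
        ++touched;
    }
    // Storing the sum to a volatile gives the loop an observable effect,
    // so the optimizer cannot remove the reads above.
    volatile unsigned sink = sum;
    (void)sink;
    return touched;
}
```

With the view from the question this would be called as `touch_pages(ptr, size)` right after `MapViewOfFile`.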

Damon
  • Updated my question, do you mean something like that? How do I keep the compiler from optimizing away the loop? – ronag Aug 15 '12 at 13:18
  • How do I get the proper "page size"/stride on Windows? – ronag Aug 15 '12 at 13:29
  • One way of keeping the compiler from optimizing the loop away is summing up a value and then (after the loop) assigning the sum to a `volatile` variable. This is usually faster than making `dummy` itself `volatile` (which nevertheless is an option too). About the proper stride, the correct way of doing it is to use `dwPageSize` from `GetSystemInfo`. However, just assuming a page size of 4096 is "OK", because it works on all systems, the worst thing to happen is that you touch a few extra memory locations unnecessarily on architectures having larger page sizes. – Damon Aug 15 '12 at 13:55
  • In my experience you can in fact, although this is "incorrect", touch only every 8th page (i.e. use a 64k stride), because Windows will prefetch at least 8 pages at a time under XP (and more on later versions). Although it doesn't create the pages at that time, it still pulls them into the buffers. This means you _will_ get a fault when first accessing the odd pages (not when accessing the even ones), but the impact is very low because nothing is actually _loaded_. See [here](http://stackoverflow.com/q/5909345/572743) under "Sidenote". – Damon Aug 15 '12 at 13:58
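
The stride discussion in the comments can be sketched as follows. `system_page_size` and `touch_count` are hypothetical names; `GetSystemInfo` and `dwPageSize` are the real Win32 names mentioned above, and the 4096 fallback for non-Windows builds is an assumption made only so the sketch compiles anywhere.

```cpp
#include <cstddef>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: page size reported by the OS.
std::size_t system_page_size() {
#ifdef _WIN32
    SYSTEM_INFO info;
    GetSystemInfo(&info);
    return info.dwPageSize;
#else
    return 4096;  // assumed fallback, not queried from the OS
#endif
}

// Hypothetical helper: number of reads a touch loop performs over `size`
// bytes when it advances by `stride` bytes per iteration.
std::size_t touch_count(std::size_t size, std::size_t stride) {
    return stride == 0 ? 0 : (size + stride - 1) / stride;
}
```

For a 1 MiB view, a 4096-byte stride means 256 reads, while a 64 KiB stride needs only 16; the larger stride relies on the prefetch behaviour described in the comment rather than on anything documented.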