
Background: I've implemented a stochastic algorithm that requires random ordering for best convergence. Doing so obviously destroys memory locality, however. I've found that by prefetching the next iteration's data, the performance drop is minimized.

I can prefetch n cache lines using _mm_prefetch in a simple, mostly OS+compiler-portable fashion - but what's the length of a cache line? Right now, I'm using a hardcoded value of 64, which seems to be the norm nowadays on x64 processors - but I don't know how to detect this at runtime, and a question about this last year found no simple solution.

I've seen GetLogicalProcessorInformation on Windows, but I'm leery of using such a complex API for something so simple, and it won't work on Macs or Linux anyhow.

Perhaps there's some entirely other API/intrinsic that could prefetch a memory region identified in terms of bytes (or words, or whatever) and allows me to prefetch without knowing the cache line length?

Basically, is there a reasonable alternative to _mm_prefetch with #define CACHE_LINE_LEN 64?
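For concreteness, here is a minimal sketch of the hardcoded approach described above. The helper name `prefetch_region` and the choice of the `_MM_HINT_T0` hint are assumptions made for illustration; only `_mm_prefetch` and the 64-byte constant come from the question.

```c
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_T0 */

#define CACHE_LINE_LEN 64  /* assumed line size, as in the question */

/* Hypothetical helper: issue one prefetch per cache line overlapping
 * [addr, addr + len). Returns the number of prefetches issued. */
size_t prefetch_region(const void *addr, size_t len)
{
    if (len == 0)
        return 0;

    /* Round the start down to a line boundary so that a region which
     * straddles a boundary still has every line it touches prefetched. */
    uintptr_t start = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE_LEN - 1);
    uintptr_t end = (uintptr_t)addr + len;
    size_t issued = 0;

    for (uintptr_t a = start; a < end; a += CACHE_LINE_LEN) {
        /* A prefetch is only a hint; it does not fault on bad addresses. */
        _mm_prefetch((const char *)a, _MM_HINT_T0);
        issued++;
    }
    return issued;
}
```

Called as `prefetch_region(next_item, sizeof *next_item)` just before processing the current item, this takes a byte count rather than a line count, but it still depends on `CACHE_LINE_LEN` being right.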

Eamon Nerbonne
  • Duplicate: http://stackoverflow.com/questions/794632 – Paul R Oct 20 '10 at 16:05
  • I realize that question exists. However, there's no answer there to my question, and it's far more general (I only care about x64 platforms on which _mm_prefetch exists, and only for prefetching purposes). Perhaps this may be solvable without explicitly getting the cache line length. No, I'm not very hopeful here... – Eamon Nerbonne Oct 20 '10 at 16:28
  • @EamonNerbonne: If you're only asking about the x64 architecture, put that in your question. – Ben Voigt Feb 16 '12 at 20:42
  • ...it is; both in tag and in text. – Eamon Nerbonne Feb 17 '12 at 11:33

1 Answer


There's a question asking almost the same thing here. You can read the line size from the CPUID instruction if you feel like delving into some assembly. You'll have to write platform-specific code for this, of course.
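As a rough sketch of the CPUID route: leaf 1 reports the CLFLUSH line size in bits 15:8 of EBX, in units of 8 bytes. The function name `cache_line_size` and the fallback value of 64 are assumptions for illustration. The code uses `__get_cpuid` from `<cpuid.h>` on GCC/Clang and `__cpuid` from `<intrin.h>` on MSVC, so compiler intrinsics stand in for hand-written assembly.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>   /* __get_cpuid */
#define HAVE_GNU_CPUID 1
#elif defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#include <intrin.h>  /* __cpuid */
#define HAVE_MSVC_CPUID 1
#endif

#define FALLBACK_LINE_LEN 64  /* assumed default when CPUID is unavailable */

/* Hypothetical helper: returns the CLFLUSH line size in bytes as reported
 * by CPUID leaf 1, or FALLBACK_LINE_LEN if it cannot be determined. */
size_t cache_line_size(void)
{
    unsigned int ebx = 0;

#if defined(HAVE_GNU_CPUID)
    unsigned int eax, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return FALLBACK_LINE_LEN;
#elif defined(HAVE_MSVC_CPUID)
    int regs[4];  /* eax, ebx, ecx, edx */
    __cpuid(regs, 1);
    ebx = (unsigned int)regs[1];
#endif

    /* EBX[15:8] holds the line size in 8-byte units; 0 means "not reported". */
    size_t line = (size_t)((ebx >> 8) & 0xFFu) * 8;
    return line != 0 ? line : FALLBACK_LINE_LEN;
}
```

Calling this once at startup and caching the result keeps the CPUID instruction out of the prefetching loop.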

You're probably already familiar with Agner Fog's optimization manuals, which give the cache information for many popular processors. If you can determine which CPUs you expect to encounter, you can just hard-code the cache line sizes and look up the CPU vendor information to select the line size.
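The hard-coded variant could look something like the sketch below. The vendor string comes from CPUID leaf 0 (stored in EBX, EDX, ECX order). The table contents and the names `known_vendors`, `line_size_for_vendor`, and `read_cpu_vendor` are illustrative assumptions, not values taken from the manuals; a real table should be filled in from the documentation for the CPUs you target.

```c
#include <stddef.h>
#include <string.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>   /* __get_cpuid */
#define HAVE_GNU_CPUID 1
#elif defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#include <intrin.h>  /* __cpuid */
#define HAVE_MSVC_CPUID 1
#endif

/* Illustrative table only: extend it from the manuals for your targets. */
struct vendor_line { const char *vendor; size_t line; };
static const struct vendor_line known_vendors[] = {
    { "GenuineIntel", 64 },
    { "AuthenticAMD", 64 },
};

/* Returns the hard-coded line size for a vendor string, or 0 if unknown. */
size_t line_size_for_vendor(const char *vendor)
{
    for (size_t i = 0; i < sizeof known_vendors / sizeof known_vendors[0]; i++)
        if (strcmp(vendor, known_vendors[i].vendor) == 0)
            return known_vendors[i].line;
    return 0;
}

/* Writes the 12-character vendor string plus a terminator into out.
 * Returns 1 on success, 0 if CPUID is not available on this build. */
int read_cpu_vendor(char out[13])
{
    unsigned int regs[4] = { 0, 0, 0, 0 };  /* eax, ebx, ecx, edx */

    out[0] = '\0';
#if defined(HAVE_GNU_CPUID)
    if (!__get_cpuid(0, &regs[0], &regs[1], &regs[2], &regs[3]))
        return 0;
#elif defined(HAVE_MSVC_CPUID)
    int r[4];
    __cpuid(r, 0);
    memcpy(regs, r, sizeof regs);
#else
    return 0;
#endif
    memcpy(out, &regs[1], 4);      /* EBX */
    memcpy(out + 4, &regs[3], 4);  /* EDX */
    memcpy(out + 8, &regs[2], 4);  /* ECX */
    out[12] = '\0';
    return 1;
}
```

A caller would read the vendor once, look it up, and fall back to a default such as 64 when `line_size_for_vendor` returns 0.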

Ron Warholic