8

I'm trying to use the WBINVD instruction on Linux to clear the processor's L1 cache.

The following program compiles, but produces a segmentation fault when I try to run it.

int main() {asm ("wbinvd"); return 1;}

I'm using gcc 4.4.3 and run Linux kernel 2.6.32-33 on my x86 box.

Processor info: Intel(R) Core(TM)2 Duo CPU T5270 @ 1.40GHz

I built the program as follows:

$ gcc

$ ./a.out

Segmentation Fault

Can somebody tell me what I'm doing wrong? How do I get this to run?

P.S: I'm running a few performance tests and want to ensure that the previous content of the processor cache does not influence the results.

artless noise
roelf
  • 3
    Why do you want to flush the CPU cache? As said in the answer, you can't use that instruction, but if you tell us what your goal is, it's possible we can suggest another way to achieve it – jalf Jul 19 '11 at 10:28
  • I'm running a few performance tests and want to ensure that the previous content of the processor cache does not influence the results. – roelf Jul 19 '11 at 10:38
  • 4
    It's usually more useful to go the opposite direction. Run your benchmark multiple times, so you're guaranteed that the data is *already* in cache. That's the most realistic scenario anyway, so it's the one that makes sense to benchmark. Artificially clearing the cache wouldn't give you more accurate results, it'd just make your benchmarks vary more depending on hardware. – jalf Jul 19 '11 at 10:56
  • Possible duplicate of http://stackoverflow.com/questions/1756825/cpu-cache-flush – Gunther Piez Jul 19 '11 at 12:19
  • 1
    at least on Ubuntu 12.04, running your program with `sudo` (i.e. as root user) does **not** result in a segmentation violation (but it's not clear to me whether the cache is actually cleared...) – Andre Holzner Sep 09 '13 at 12:54
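
jalf's suggestion of running the benchmark several times so the data is already cached could be sketched roughly as below. This is only a minimal sketch: `best_time_seconds`, `sum_workload` and `struct sum_job` are hypothetical names made up for illustration, and `clock()` is used only because it is standard C; a finer-grained timer can be substituted.

```c
#include <stddef.h>
#include <time.h>

/* Signature of the code under test (hypothetical, for illustration only). */
typedef void (*bench_fn)(void *arg);

/* Hypothetical workload: sums a small array and counts its invocations. */
struct sum_job {
    const int *data;
    size_t n;
    long result;
    int calls;
};

void sum_workload(void *arg)
{
    struct sum_job *job = arg;
    long sum = 0;
    for (size_t i = 0; i < job->n; i++)
        sum += job->data[i];
    job->result = sum;
    job->calls++;
}

/*
 * Run fn(arg) warmup_runs times without timing it, so its data is likely
 * to be in cache, then time timed_runs further calls and return the
 * fastest one in seconds of CPU time. Returns a negative value if
 * timed_runs is not positive.
 */
double best_time_seconds(bench_fn fn, void *arg, int warmup_runs, int timed_runs)
{
    for (int i = 0; i < warmup_runs; i++)
        fn(arg);

    double best = -1.0;
    for (int i = 0; i < timed_runs; i++) {
        clock_t start = clock();
        fn(arg);
        clock_t end = clock();
        double elapsed = (double)(end - start) / CLOCKS_PER_SEC;
        if (best < 0.0 || elapsed < best)
            best = elapsed;
    }
    return best;
}
```

Calling `best_time_seconds(sum_workload, &job, 3, 5)` would run the workload eight times in total and report the fastest of the last five. Taking the minimum is one common choice; a median or mean may suit other measurements better.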

2 Answers

13

Quoting from Intel® 64 and IA-32 Architectures Software Developer's Manual Combined Volumes 2A and 2B: Instruction Set Reference, A-Z:

The WBINVD instruction is a privileged instruction. When the processor is running in protected mode, the CPL of a program or procedure must be 0 to execute this instruction.

In other words, only kernel-mode code is allowed to execute it.

EDIT: Previous SO discussion on clearing caches:

"C" programmatically clear L2 cache on Linux machines

How can I do a CPU cache flush in x86 Windows?

How to clear CPU L1 and L2 cache

https://stackoverflow.com/questions/3443130/how-to-clear-cpu-l1-and-l2-cache

Community
user786653
  • Thanks for that! Is there any alternative instruction that I can use in userspace to clear the processor cache? – roelf Jul 19 '11 at 10:44
  • @roelf: Added a few links to previous SO discussions. Also you could of course write a device driver that uses `WBINVD` to clear the caches for you. – user786653 Jul 19 '11 at 10:52
  • @roelf: generally speaking, what would be the point of creating a privileged instruction which does the same thing as a non-privileged one? The reason this instruction is privileged is because it isn't really something you want user code to do. ;) – jalf Jul 19 '11 at 10:55
7

As user786653 wrote, wbinvd is a privileged instruction. Executing it in non-kernel code raises a general-protection fault, which Linux reports to the process as a segmentation fault.

You should avoid using wbinvd for benchmarking, because it forces all kinds of bus locking cycles and pipeline serialization, and adds the overhead of switching between kernel and userspace, none of which is likely to happen in your real-world program.

Hence your measurement will not be more exact; it will contain all kinds of artifacts. Reading a chunk of data the size of the L2 cache will produce better results.
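
A minimal sketch of that idea, in plain userspace C, might look like the following. The function name `evict_cache` and the 64-byte line size are assumptions for illustration, not values taken from the question; the buffer size has to be chosen to match the L2 size of the CPU under test. Eviction is likely rather than guaranteed, so a buffer somewhat larger than the cache may be needed in practice.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumed cache line size; adjust for the CPU under test. */
#define CACHE_LINE_BYTES 64u

/*
 * Allocate a buffer of `bytes` bytes, write to it so every page is really
 * backed by memory, then read one byte per cache line. Pulling these lines
 * through the cache is likely to displace whatever was cached before.
 * Stores the number of lines read in *lines_touched.
 * Returns 0 on success, -1 if the allocation fails.
 */
int evict_cache(size_t bytes, unsigned long *lines_touched)
{
    unsigned char *buf = malloc(bytes);
    if (buf == NULL)
        return -1;

    memset(buf, 1, bytes);

    /* volatile keeps the compiler from optimizing the reads away. */
    const volatile unsigned char *p = buf;
    unsigned long sum = 0;
    for (size_t i = 0; i < bytes; i += CACHE_LINE_BYTES)
        sum += p[i]; /* every byte is 1, so sum counts the lines read */

    free(buf);
    *lines_touched = sum;
    return 0;
}
```

Calling something like `evict_cache(2u * 1024u * 1024u, &lines)` before each measurement would walk a 2 MiB buffer; that size is just an example and should be replaced with the actual cache size.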

You can read the source code under "Test programs for measuring clock cycles and performance monitoring" to see how others got useful results.

Gunther Piez