I would like to programmatically disable hardware prefetching.

According to "Optimizing Application Performance on Intel® Core™ Microarchitecture Using Hardware-Implemented Prefetchers" and "How to Choose between Hardware and Software Prefetch on 32-Bit Intel® Architecture", I need to update an MSR to disable hardware prefetching.

Here is a relevant snippet:

"DPL Prefetch and L2 Streaming Prefetch settings can also be changed programmatically by writing a device driver utility for changing the bits in the IA32_MISC_ENABLE register – MSR 0x1A0. Such a utility offers the ability to enable or disable prefetch mechanisms without requiring any server downtime.

The table below shows the bits in the IA32_MISC_ENABLE MSR that have to be changed in order to control the DPL and L2 Streaming Prefetch:

Prefetcher Type                              MSR (0x1A0) Bit   Value
DPL (Hardware Prefetch)                      Bit 9             0 = Enable, 1 = Disable
L2 Streamer (Adjacent Cache Line Prefetch)   Bit 19            0 = Enable, 1 = Disable"

I tried using http://etallen.com/msr.html but this did not work. I also tried using wrmsr in asm/msr.h directly but that segfaults. I tried doing this in a kernel module ... and killed the machine.

BTW - I am using kernel 2.6.18-92.el5 and it has MSR linked in the kernel:

$ grep -i msr /boot/config-$(uname -r)
CONFIG_X86_MSR=y
...
Carlos
  • This is going to be painful to do, and send your performance to hell (well, your app will presumably do explicit prefetching -- but will anything *else* on the machine, like the kernel?). Note that that article about choosing between prefetch techniques mentions only the P4; newer chips are very different from NetBurst! This makes me wonder if you're *sure* that you *have* to do this, or if you're just fumbling around something else. – kquinn Apr 24 '09 at 09:27
  • My actual goal here is to determine the amount of useful prefetching by comparing the bus bandwidth usage (BUS_TRAN_BURST.SELF events) with and without prefetching. – Carlos Apr 24 '09 at 17:40
  • Sorry for my ignorance (never did things at the kernel level) but I was under the impression that it would be a Bad Thing(tm) to disable prefetching, i.e. it's there for a reason so don't mess with it.... – Michael Todd Apr 24 '09 at 22:30
  • .globl _start
    .text
    _start:
        pusha
        mov msr_pf, %ecx    # MSR number; rdmsr is opcode 0F 32
        rdmsr
        mov %edx, hi
        mov %eax, lo
        popa
        mov $1, %eax        # terminate process
        mov $0, %ebx        # result status
        int $0x80           # system call
    .data
    .align 8, 0xff
    lo: .long 0
    hi: .long 0
    msr_pf: .long 0x1A0
    Save all of that in a file named rdmsr.s, then run:
    as rdmsr.s -o rdmsr.o
    ld rdmsr.o -o rdmsr
    If you could run that in ring 0, it would work just fine. – Chris Apr 26 '09 at 00:15
  • So your premise is that extra memory which is prefetched is actually not useful?
    Intel discusses this at length: http://software.intel.com/en-us/articles/how-to-choose-between-hardware-and-software-prefetch-on-32-bit-intel-architecture/
    – Chris Apr 26 '09 at 00:30
  • I am not assuming that it is not useful. I just want to measure the amount of useful/useless HW prefetch and identify any regions where the code can be modified to improve the effectiveness of the HW prefetchers. – Carlos Apr 27 '09 at 21:01

4 Answers

You can enable or disable the hardware prefetchers using msr-tools http://www.kernel.org/pub/linux/utils/cpu/msr-tools/.

The following enables the hardware prefetcher (by unsetting bit 9):

[root@... msr-tools-1.2]# ./wrmsr -p 0 0x1a0 0x60628e2089 
[root@... msr-tools-1.2]# ./rdmsr 0x1a0 
60628e2089

The following disables the hardware prefetcher (by setting bit 9):

[root@... msr-tools-1.2]# ./wrmsr -p 0 0x1a0 0x60628e2289 
[root@... msr-tools-1.2]# ./rdmsr 0x1a0 
60628e2289

Programmatically, you can do this as root by opening /dev/cpu/<cpunumber>/msr and using pwrite to write to the MSR "file" at offset 0x1a0.
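
As a minimal sketch of that approach (not a tested utility), the following Python reads IA32_MISC_ENABLE through the msr device file, changes bit 9, and writes the result back. The helper names (`msr_path`, `read_msr`, `write_msr`, `set_prefetch_disable`, `disable_dpl_prefetch`) are made up for illustration. It assumes the msr driver is loaded, the process runs as root, and the CPU really implements bit 9 of MSR 0x1A0 as the Intel table quoted in the question describes.

```python
import os
import struct

IA32_MISC_ENABLE = 0x1A0       # MSR address quoted in the question
DPL_PREFETCH_DISABLE_BIT = 9   # 1 = disable, per the quoted Intel table


def msr_path(cpu):
    """Device node exposed by the Linux msr driver for one logical CPU."""
    return "/dev/cpu/%d/msr" % cpu


def read_msr(path, reg):
    """Read the 64-bit value stored at offset `reg` of the msr device file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)


def write_msr(path, reg, value):
    """Write a 64-bit value at offset `reg` of the msr device file."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.pwrite(fd, struct.pack("<Q", value), reg)
    finally:
        os.close(fd)


def set_prefetch_disable(value, disable):
    """Return `value` with bit 9 set (disable) or cleared (enable)."""
    mask = 1 << DPL_PREFETCH_DISABLE_BIT
    return value | mask if disable else value & ~mask


def disable_dpl_prefetch(cpu):
    """Read-modify-write bit 9 on one CPU; returns the previous value."""
    path = msr_path(cpu)
    old = read_msr(path, IA32_MISC_ENABLE)
    write_msr(path, IA32_MISC_ENABLE, set_prefetch_disable(old, True))
    return old
```

Applied to the values shown above, `set_prefetch_disable(0x60628e2089, True)` gives `0x60628e2289`, and passing `False` reverses it. The `-p 0` in the wrmsr commands selects CPU 0, so a real run would presumably need to repeat `disable_dpl_prefetch` for each CPU under test, and keep the returned value to restore the original setting afterwards.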

Carlos
  • Thank you Carlos! You may need to run "modprobe msr" first. – JohnTortugo Jun 08 '13 at 20:36
  • I got `wrmsr: pwrite: Operation not permitted`. Do you know how to work around it? – St.Antario Feb 02 '20 at 13:26
  • MSR 0x1a0 only works for the original Core 2 processors. From the Nehalem microarchitecture onward it changed to MSR 0x1a4, where setting bits 0-3 to 1 disables the L2 Hardware Prefetcher, L2 Adjacent Cache Line Prefetcher, DCU Hardware Prefetcher, and DCU IP Prefetcher respectively. Just search for `Prefetcher Disable` in Intel SDM vol. 4. – TingQian LI Jun 03 '23 at 23:15

From the Intel reference:
This instruction must be executed at privilege level 0 or in real-address mode; otherwise, a general protection exception #GP(0) will be generated. Specifying a reserved or unimplemented MSR address in ECX will also cause a general protection exception.

...
The CPUID instruction should be used to determine whether MSRs are supported (EDX[5]=1) before using this instruction.

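So your fault might come from executing the instruction outside privilege level 0 (a wrmsr issued from user space raises #GP, which Linux reports as a segfault), from a CPU that doesn't support MSRs, or from using the wrong MSR address.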
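
On Linux, the CPUID feature bit mentioned in the quote (EDX[5]) is reported as the `msr` flag in /proc/cpuinfo, so a quick user-space check is possible without executing CPUID yourself. The following Python is only a sketch of that check; `cpu_supports_msr` is a hypothetical helper name, and it takes the file contents as a string so it can be tried on sample text.

```python
def cpu_supports_msr(cpuinfo_text):
    """Return True if the first 'flags' line of /proc/cpuinfo lists 'msr'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return "msr" in value.split()
    return False  # no 'flags' line found (for example, a non-x86 system)


# Typical use on a live system (not executed here):
#     with open("/proc/cpuinfo") as f:
#         print(cpu_supports_msr(f.read()))
```

A positive result only says the CPU has MSRs at all. It says nothing about whether a particular address such as 0x1A0 is implemented on that model.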

The kernel source contains many examples of MSR use. For a single CPU, arch/i386/kernel/cpu/intel.c demonstrates disabling prefetch for the Xeon in the function:

static void __cpuinit Intel_errata_workarounds(struct cpuinfo_x86 *c)

The rdmsr function arguments are the msr number, a pointer to the low 32 bit word, and a pointer to the high 32 bit word.
The wrmsr function arguments are the msr number, the low 32 bit word value, and the high 32 bit word value.

On multi-core or SMP systems, use the variants that take the CPU number as the first argument:
void rdmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h);
void wrmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h);
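
Because these kernel helpers carry the 64-bit MSR value as two 32-bit halves, a value printed by the user-space rdmsr tool has to be split before it can be compared with `l` and `h`. The sketch below shows only that arithmetic, in Python rather than kernel C; `split_msr` and `join_msr` are illustrative names, not kernel functions.

```python
def split_msr(value):
    """Split a 64-bit MSR value into (low 32 bits, high 32 bits)."""
    return value & 0xFFFFFFFF, (value >> 32) & 0xFFFFFFFF


def join_msr(low, high):
    """Rebuild the 64-bit MSR value from its two 32-bit halves."""
    return ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)


low, high = split_msr(0x60628E2289)  # value from the msr-tools answer
print(hex(low), hex(high))           # 0x628e2289 0x60
```

Bits 9 and 19 from the question both fall in the low word, so for this particular change only the `l` argument would differ between the read and the write.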

Chris
  • It seems that my kernel (2.6.18-92.el5) does not have rdmsr_on_cpu or wrmsr_on_cpu in msr.h. Was this added in 2.6.19? – Carlos Apr 24 '09 at 20:22
  • It was right after 2.6.18 was chosen for Debian; the patch was introduced in January 2007 according to lkml.org: http://lkml.org/lkml/2007/1/18/91 – Chris Apr 28 '09 at 03:22

In 2014 Intel published information about disabling the h/w prefetchers through MSR 0x1A4 for Nehalem, Westmere, Sandy Bridge, Ivy Bridge, Haswell, and Broadwell (and probably newer cores). The link was found by bholanath here:

Disclosure of H/W prefetcher control on some Intel processors - Vish Viswanathan (Intel), September 24, 2014: https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors

This article discloses the MSR setting that can be used to control the various h/w prefetchers that are available on Intel processors based on the following microarchitectures: Nehalem, Westmere, Sandy Bridge, Ivy Bridge, Haswell, and Broadwell.

The above mentioned processors support 4 types of h/w prefetchers for prefetching data. There are 2 prefetchers associated with L1-data cache (also known as DCU: the DCU prefetcher and the DCU IP prefetcher) and 2 prefetchers associated with L2 cache (L2 hardware prefetcher, L2 adjacent cache line prefetcher).

There is a Model Specific Register (MSR) on every core with address of 0x1A4 that can be used to control these 4 prefetchers. Bits 0-3 in this register can be used to either enable or disable these prefetchers. Other bits of this MSR are reserved.

These settings are local to each CPU core and can be changed by root with the help of the Linux msr kernel driver. Intel itself uses them in its MLC tool to measure memory latency on NUMA systems:

For example, Intel Memory Latency Checker tool (http://www.intel.com/software/mlc) modifies the prefetchers through writes to MSR 0x1a4 to measure accurate latencies and restores them to the original state on exit.
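
To make the bit layout concrete, here is a small Python sketch that builds and decodes the low four bits of MSR 0x1A4. The bit order is the one given in TingQian LI's comment earlier in this thread (bit 0 = L2 hardware prefetcher through bit 3 = DCU IP prefetcher, 1 = disabled) and should be checked against the Intel article or the SDM for your CPU model. The names `PREFETCHER_BITS`, `disable_mask`, and `decode` are invented for this example.

```python
MSR_PREFETCH_CONTROL = 0x1A4  # address from the Intel article; constant name is illustrative

# Assumed bit positions, taken from a comment in this thread; 1 means "disabled".
PREFETCHER_BITS = {
    "L2 hardware prefetcher": 0,
    "L2 adjacent cache line prefetcher": 1,
    "DCU prefetcher": 2,
    "DCU IP prefetcher": 3,
}


def disable_mask(names):
    """Return the bits to OR into MSR 0x1A4 to disable the named prefetchers."""
    mask = 0
    for name in names:
        mask |= 1 << PREFETCHER_BITS[name]
    return mask


def decode(value):
    """Map each prefetcher name to True if enabled (its bit is 0)."""
    return {name: ((value >> bit) & 1) == 0
            for name, bit in PREFETCHER_BITS.items()}


print(hex(disable_mask(PREFETCHER_BITS)))  # 0xf: all four disabled
```

Since the remaining bits of the register are reserved, a cautious tool would read the current value, change only bits 0-3, and write it back, restoring the original value on exit as the MLC description above suggests.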

osgx

I am adding an answer here, because the previous ones may not be applicable to all Intel processors.

For my Intel Xeon 5650 (06_2CH family) processor, chapter 35 of the manual specifies that bits 10 to 8 of the IA32_MISC_ENABLE register at address 0x1A0 are reserved. I guess this means I can't toggle the prefetcher on and off through the MSR.

According to an answer from an Intel employee here: "Intel has not disclosed how to disable the prefetchers on processors from Nehalem onward. You'll need to disable the prefetchers using options in the BIOS."

Manuel Selva
  • Reading that post, I see one of the comments linked to this new document which may apply (with a different MSR address) - https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors – Leeor Aug 11 '15 at 06:39