
Is there a reasonable way for a Linux userspace program to enable/disable cache write combining for a memory page that it owns?

The two target systems I care about: Intel Haswell processor on a 3.0 kernel, and Intel Skylake processor on a 4.8 kernel.

I'm tuning a mature, multi-threaded application that uses large buffers to transfer data between a producer and a consumer. Based on profiling, I have reason to believe that the application would benefit from the buffers' pages sometimes using write-combining caching, rather than write-back caching.

I considered instead using non-temporal writes to populate the buffer, but it would require a larger code refactoring than is possible for my current effort.
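For reference, the non-temporal alternative would look roughly like the sketch below. It assumes SSE2 is available and that the destination is 16-byte aligned; `stream_copy` is a hypothetical helper name, not something from my code base.

```c
#include <emmintrin.h>  /* SSE2: _mm_loadu_si128, _mm_stream_si128, _mm_sfence */
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes from src to dst using non-temporal (streaming) stores.
 * Assumption: dst is 16-byte aligned, as _mm_stream_si128 requires.
 * src may be unaligned. Any tail shorter than 16 bytes is copied with
 * an ordinary memcpy.
 */
static void stream_copy(void *dst, const void *src, size_t len)
{
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;
    size_t chunks = len / 16;

    for (size_t i = 0; i < chunks; i++) {
        __m128i v = _mm_loadu_si128(s + i);  /* unaligned load from source */
        _mm_stream_si128(d + i, v);          /* store that bypasses the cache */
    }

    /* Copy the remaining 0..15 bytes through the normal cached path. */
    memcpy((char *)dst + chunks * 16, (const char *)src + chunks * 16, len % 16);

    /* Order the streaming stores before any later stores, e.g. a "ready" flag. */
    _mm_sfence();
}
```

Every place that fills a buffer would have to go through something like this, which is the refactoring I'm trying to avoid.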

This question, this question, and this LWN article discuss the issue, but from the perspective of a device driver. In my case, I'm working with userspace code, running as non-root.

This 2008 paper discusses the different APIs for controlling a page's caching mode. It seems to indicate that a userspace application can obtain write-combining access to a page using mmap (see sections 5.3, 5.4 and 5.6), but the documentation isn't clear (to me, at least) regarding exactly how to use those mechanisms.

Christian Convey
  • 1
    Sections 5.3 and 5.4 of https://www.kernel.org/doc/ols/2008/ols2008v2-pages-135-144.pdf cover access to PCI resources through the `/proc` and `/sys` filesystems; 5.5 and 5.6 cover access to all of the machine's memory through `/dev/mem`. Both require root access, and direct access to `/dev/mem` is unsafe. Try non-temporal writes first, at least to check whether changing the write-combining mode would benefit the code at all. Intel CPUs also have hardware detectors for memory-fill patterns that may change the combining mode; see sections 3.6.10 and 7.4.1 of http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf – osgx Mar 06 '17 at 14:57
  • @osgx So user space PAT/MTRR programming is generally discouraged...? – St.Antario Feb 10 '20 at 22:50

1 Answer


I had a similar requirement recently where I needed to experiment with uncached memory in a cache-heavy multi-threaded application.

I came up with this kernel module, which maps uncached memory into userspace. That's a little different from what you're asking, but maybe you can tweak it to achieve your goal.

Make it call:

  • set_memory_wc() instead of set_memory_uc() and
  • pgprot_writecombine() instead of pgprot_uncached()

and you should get write-combining memory.

At the moment you have to mmap() the module's character device (see the test directory for a demo) and the memory type is fixed, but it shouldn't be too hard to add an ioctl to toggle it.
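On the userspace side, mapping the device might look roughly like the sketch below. This is not the module's actual demo code: `map_device_buffer` is a hypothetical helper, and the device path you pass in depends on how the module registers its character device. The caching attribute of the mapping is decided by the driver's mmap handler, not by anything in this call.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map len bytes from the character device at path into this process.
 * Returns the mapped address, or MAP_FAILED if open() or mmap() fails.
 * The memory type (UC, WC, ...) is chosen by the driver, not by the caller.
 */
static void *map_device_buffer(const char *path, size_t len)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return MAP_FAILED;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The mapping stays valid after the descriptor is closed. */
    close(fd);
    return p;
}
```

The application would then use the returned pointer as its producer/consumer buffer and release it with munmap() when done.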

I haven't tried changing the attributes of existing userspace pages yet; that would make it much nicer to use!

lemonsqueeze