
I'm trying to use the cache as temporary memory, and after using it I do not want any of the modified cache lines to be written back. I learned that I can achieve that with the invd instruction: unlike wbinvd, invd invalidates (flushes) the processor's internal caches without writing them back to main memory.

I wrote a kernel module to check if I can execute the invd instruction.

#include <linux/module.h>     /* Needed by all modules */
#include <linux/kernel.h>     /* Needed for KERN_INFO */
#include <linux/init.h>       /* Needed for the macros */

/* Invalidate all caches WITHOUT writing dirty lines back to memory. */
static int new_invd(void)
{
    asm volatile ("invd" : : : "memory");
    return 1;
}

static int __init hello_start(void)
{
    printk(KERN_INFO "Loading hello module...\n");

    /* Check whether the invd instruction executes. */
    printk(KERN_INFO "running invd: %d\n", new_invd());

    return 0;
}

static void __exit hello_end(void)
{
    printk(KERN_INFO "Goodbye\n");
}

module_init(hello_start);
module_exit(hello_end);

/* Declares the license so that loading the module does not taint the kernel. */
MODULE_LICENSE("GPL");

Upon compiling and inserting the module I get "Segmentation fault (core dumped)", and dmesg shows:

[ 7525.227059] Loading hello module...
[ 7525.227088] general protection fault: 0000 [#1] SMP

I use asm volatile ("invd" : : : "memory"); as mentioned in chromium. I think I'm getting the error because executing invd violates the coherency between main memory and the cache, as pointed out by @Gunther Piez in "How can I do a CPU cache flush in x86 Windows?". However, I'm not sure this is the case.

So, why am I getting this segfault? If it is due to violating cache coherency, how can I fix that? If not, how can I execute invd?

I'm using Linux xxx 4.4.0-200-generic #232-Ubuntu SMP Wed Jan 13 10:18:39 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

cat /proc/cpuinfo shows,

vendor_id   : GenuineIntel
cpu family  : 6
model       : 158
model name  : Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
0andriy
user45698746
  • You can't safely execute `invd` after Linux boots, especially not in an SMP system where other cores might be doing anything (i.e. have some dirty cache lines). Discarding all recent stores is obviously going to break stuff everywhere. – Peter Cordes Feb 10 '21 at 23:42
  • @PeterCordes Is that still true if the other cores are in no-fill mode? – user45698746 Feb 10 '21 at 23:44
  • @PeterCordes, is there any alternative to `invd` that achieves the same goal? – user45698746 Feb 10 '21 at 23:45
  • There is no support for discarding contents of a specific buffer, and `invd` is I think slow anyway. Plus the cost of sleeping all other cores (on CPU models where deep sleep flushes private caches) and somehow making sure even L3 cache is synced, like maybe `wbinvd` (also extremely slow) before you run this temp-buffer thing + `invd` (with interrupts disabled and other cores asleep the whole time). It's almost certainly much better for performance to just let the HW write back your buffer eventually. Keep it small and reuse it and maybe it won't be written back between uses. – Peter Cordes Feb 10 '21 at 23:50
  • Find a CPU that supports CAT (Cache Allocation Technology). There is (was?) even a driver in the Linux kernel for that. https://software.intel.com/content/www/us/en/develop/articles/introduction-to-cache-allocation-technology.html – 0andriy Feb 11 '21 at 17:11

1 Answer

You can't safely execute invd after Linux boots, especially not in an SMP system where other cores might be doing anything (i.e. have some dirty cache lines). Discarding all recent stores is obviously going to break stuff everywhere, just like if cosmic rays flipped a bunch of bits in cache or RAM.

Your inline asm statement is correct if you do want to corrupt your system by running invd.

There is no support for discarding the contents of a specific buffer, and I think invd is slow anyway. On top of that comes the cost of putting all other cores to sleep (on CPU models where deep sleep flushes private caches) and somehow making sure even L3 cache is synced, for example by running wbinvd (also extremely slow) before you run this temp-buffer code and then invd, with interrupts disabled and the other cores asleep the whole time.

It's almost certainly much better for performance to just let the HW write back your buffer eventually. Keep it small and reuse it and maybe it won't be written back between uses.

I don't think there's any way to safely use invd that could be a performance win for anything. Probably not even if you only care about single-core performance. (NT stores and/or clflush could make some stores go to DRAM before an invd so it's barely plausible to use it, but it's probably so expensive that it's not at all worth it.)
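As an illustration of the clflush idea mentioned above, here is a minimal user-space sketch that forces a buffer's cache lines out to memory. The helper name `flush_range` and the 64-byte `CACHE_LINE` constant are assumptions made for this example, not something from the question; `_mm_clflush` and `_mm_mfence` are the SSE2 intrinsics from `<emmintrin.h>`, so this only builds on x86.

```c
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>   /* _mm_clflush, _mm_mfence (SSE2, x86 only) */

#define CACHE_LINE 64    /* assumed cache-line size for this sketch */

/* Hypothetical helper: write back and evict every cache line that
 * overlaps [buf, buf + len). Returns the number of lines flushed. */
static size_t flush_range(const void *buf, size_t len)
{
    uintptr_t p, end;
    size_t lines = 0;

    if (len == 0)
        return 0;

    /* Round the start address down to a cache-line boundary. */
    p   = (uintptr_t)buf & ~(uintptr_t)(CACHE_LINE - 1);
    end = (uintptr_t)buf + len;

    _mm_mfence();                      /* order earlier stores before the flushes */
    for (; p < end; p += CACHE_LINE) {
        _mm_clflush((const void *)p);  /* write back (if dirty) and invalidate */
        lines++;
    }
    _mm_mfence();                      /* wait for the flushes to complete */

    return lines;
}
```

Calling `flush_range(buf, len)` on data that must survive would make those stores reach memory first, so a later invd could only discard everything else. It does nothing about dirty lines owned by the kernel or by other cores, which is why this does not make invd safe on a running system.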

Peter Cordes
  • Thanks. I'm not worried about performance, because my aim was just to validate that whatever I'm running inside the cache doesn't show up in main memory. – user45698746 Feb 11 '21 at 00:04
  • @user45698746: If you're running in no-fill mode for this, then just `memset(0)` over the buffer before re-enabling write-back + fill. – Peter Cordes Feb 11 '21 at 00:12
  • Related: [How to explicitly load a structure into L1d cache? Weird results with INVD with CR0.CD = 1 on isolated core with/without hyperthreading](https://stackoverflow.com/q/66772632) was a followup to this, isolating a core. – Peter Cordes Mar 05 '22 at 01:55
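
The `memset(0)` suggestion in the comments above could look roughly like the sketch below. The helper name `scrub_buffer` is hypothetical. It writes through a `volatile` pointer because a plain `memset` on a buffer that is never read again may be removed by the compiler as a dead store; inside the kernel, `memzero_explicit()` serves the same purpose.

```c
#include <stddef.h>

/* Hypothetical helper: overwrite a scratch buffer with zeros. The
 * volatile pointer keeps the compiler from dropping the stores even
 * if the buffer is never read afterwards. */
static void scrub_buffer(void *buf, size_t len)
{
    volatile unsigned char *p = buf;

    while (len--)
        *p++ = 0;
}
```

Running this over the temporary buffer before re-enabling write-back and fill means that whatever eventually reaches DRAM is zeros rather than the working data.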