CPUID can be used as a serializing instruction as described here and here. What is the minimal/simplest asm syntax to use it in that way in C++?

// Is that enough?
// What to do with registers and memory?
// Is volatile necessary?
asm volatile("CPUID":::);
MSalters
Vincent
  • 3
    Technically, you can't do it at all since inline assembly is a compiler-specific thing that doesn't exist in standard C++. While we could *guess* which compiler you're most likely using, always include that information when asking questions about inline assembly. Also, mentioning the target ISA is usually a good idea, even if it can also be guessed in this case. – Some programmer dude Jan 29 '18 at 13:39
  • 3
    Use a builtin instead of asm. `cpuid` overwrites registers, at the very least you need to list `eax`,`ebx`,`ecx` and `edx` as clobbers. Yes, `volatile` is necessary. – Jester Jan 29 '18 at 13:39
  • 7
    `#include ` and use the `__cpuid()` function. – Kelvin Sherlock Jan 29 '18 at 13:41
  • What do you need it for? – Maxim Egorushkin Jan 29 '18 at 13:46
  • 1
    If you need to serialize `rdtsc`, use `lfence`. It's guaranteed by Intel to work (at least on Intel CPUs). See https://stackoverflow.com/questions/38994549/is-intels-timestamp-reading-asm-code-example-using-two-more-registers-than-are. (However, [on AMD it seems you need `mfence` instead of `lfence`](https://stackoverflow.com/questions/12631856/difference-between-rdtscp-rdtsc-memory-and-cpuid-rdtsc#comment84001167_12634857), so `cpuid` is apparently more portable) – Peter Cordes Jan 29 '18 at 14:48
  • Related: [Using inline assembly with serialization instructions](https://stackoverflow.com/q/48522628) / [Is there a cheaper serializing instruction than cpuid?](https://stackoverflow.com/a/75456027) - only `cpuid` (or the very recent `serialize`) are fully serializing, usable for cross-modifying code. `lfence` or `mfence;lfence` will drain the ROB, or store-buffer+ROB, before later instructions which is all you need if you aren't doing cross-modifying code. – Peter Cordes Feb 20 '23 at 07:40
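
Putting the comments above together (`volatile`, the four registers `cpuid` overwrites, and a `"memory"` clobber against compile-time reordering), a minimal sketch might look like the following. It assumes GCC or Clang extended asm on x86; the helper name `cpuid_serialize` is made up for illustration, and the non-x86 branch is only a compiler barrier so the snippet still builds elsewhere. The registers are declared as operands rather than plain clobbers so that EAX holds a defined leaf (0) on entry.

```cpp
#include <cstdint>

// Hypothetical helper (name not from the question): executes CPUID as a
// serializing instruction and returns EAX, which for leaf 0 is the highest
// supported basic leaf.
static inline std::uint32_t cpuid_serialize() {
#if defined(__x86_64__) || defined(__i386__)
    std::uint32_t eax = 0;   // input: leaf 0, so the request is well-defined
    std::uint32_t ebx;
    std::uint32_t ecx = 0;   // input: sub-leaf 0
    std::uint32_t edx;
    asm volatile("cpuid"
                 : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)  // all four are overwritten
                 :
                 : "memory");  // keep the compiler from moving memory accesses across it
    (void)ebx;
    (void)ecx;
    (void)edx;
    return eax;
#else
    // Assumption: on other targets this is only a compile-time barrier.
    asm volatile("" ::: "memory");
    return 0;
#endif
}
```

Calling `cpuid_serialize()` before and after a region and ignoring the return value is the intended use; the return value is only there so the call has an observable result.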

1 Answer

Is there a reason you're not using the fence operations? If the goal is to serialize a section of code you can do something like

  asm __volatile__ (
      " mfence \n"   // drain the store buffer
      " lfence \n"   // wait for the mfence to retire, draining the ROB
      ::: "memory"); // block compile-time reordering
  // Your code here
  asm __volatile__ (
      " mfence \n"
      " lfence \n"
      ::: "memory");

This gives about as much serialization as cpuid in terms of memory and instruction reordering, but neither mfence nor lfence is officially a Serializing Instruction in Intel's technical terminology.

Software prefetches aren't guaranteed to be ordered wrt. fence instructions, so on paper at least, an earlier prefetcht0 could result in data arriving after the lfence. (But a prefetcht0 after an lfence can't execute until after the lfence finishes, because no instructions after an lfence get sent to execution units until all earlier instructions have retired, or "completed locally" in the wording of Intel's documentation.)

On Intel CPUs, lfence always blocks instruction reordering; on AMD CPUs it does so only when a particular MSR is set. OSes that do Spectre mitigation set that MSR, see: Is LFENCE serializing on AMD processors?
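
To avoid repeating the asm statement, the fence pair from this answer could be wrapped in small helpers. This is only a sketch, assuming GCC or Clang on x86-64; `full_fence` and `run_fenced` are hypothetical names, and on other targets the fence degrades to a compiler barrier so the snippet still compiles.

```cpp
// Hypothetical helper: mfence drains the store buffer, lfence waits for earlier
// instructions to retire, and the "memory" clobber blocks compile-time reordering.
static inline void full_fence() {
#if defined(__x86_64__)
    asm volatile("mfence\n\t"
                 "lfence"
                 ::: "memory");
#else
    // Assumption: elsewhere this is only a compile-time barrier.
    asm volatile("" ::: "memory");
#endif
}

// Hypothetical wrapper: runs a callable that returns a value, with a fence
// before and after it, and passes the result back to the caller.
template <typename F>
auto run_fenced(F&& work) -> decltype(work()) {
    full_fence();
    auto result = work();
    full_fence();
    return result;
}
```

For example, `run_fenced([] { return 2 + 3; })` evaluates the lambda between the two fences. As noted above, this is not a Serializing Instruction in Intel's sense, so it is not a substitute for cpuid when doing cross-modifying code.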

Peter Cordes
Darakian
  • 1
    The intel manual says: _"MFENCE does not serialize the instruction stream."_ – Jester Jan 29 '18 at 14:01
  • "Perform a serializing operation on all load-from-memory and store-to-memory instructions that were issued prior to this instruction. Guarantees that every memory access that precedes, in program order, the memory fence instruction is globally visible before any memory instruction which follows the fence in program order." https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=fence&expand=3424 – Darakian Jan 29 '18 at 14:06
  • Yes, but that only deals with memory operations. _"Non-privileged serializing instructions -- CPUID, IRET, and RSM._" and _"The following instructions are memory-ordering instructions, not serializing instructions. These drain the data memory subsystem. They do not serialize the instruction execution stream. Non-privileged memory-ordering instructions -- SFENCE, LFENCE, and MFENCE."_ – Jester Jan 29 '18 at 14:07
  • 1
    @Jester: `lfence` is now (or will be) officially documented as serializing, following years of that being an implementation detail, but still seeing some Intel docs recommend or at least use `lfence; rdtsc`. At least *some* good came out of Spectre... (I'm not 100% sure it's as strongly serializing as `cpuid`, though). But anyway, you're right that `mfence` isn't serializing on the instruction stream, on paper or in practice. Using both `lfence` and `mfence` back to back doesn't seem useful. – Peter Cordes Jan 29 '18 at 14:23
  • Correction: `lfence` is serializing *on the instruction stream*, without flushing the store buffer. It's sufficient for `lfence; rdtsc`. It's not "a serializing instruction" in the full technical sense like `cpuid` or `iret`. And `mfence` is serializing on AMD according to documentation, and in practice also on some Intel CPUs like Skylake, possibly a microcode update to fix an erratum strengthened it to be much stronger than the Intel manual guarantees: [Are loads and stores the only instructions that gets reordered?](//stackoverflow.com/a/50496379) – Peter Cordes Jan 29 '19 at 23:10
  • And also: [Is LFENCE serializing on AMD processors?](//stackoverflow.com/q/51844886) yes, with Spectre mitigation active the relevant MSR will be set so LFENCE blocks speculative execution like on Intel. – Peter Cordes Jan 29 '19 at 23:11
  • Anyway, you forgot `"memory"` clobbers, so compile-time reordering of code into / out of the serialized block is still possible. – Peter Cordes Jan 29 '19 at 23:12
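
For the `lfence; rdtsc` use case mentioned in the comments, a rough sketch could look like this. It assumes GCC or Clang on x86-64 and a CPU where `lfence` serializes the instruction stream (always on Intel; on AMD only with the MSR setting discussed above). The name `fenced_rdtsc` is invented for illustration, and the non-x86-64 branch simply returns 0.

```cpp
#include <cstdint>

// Hypothetical helper: reads the time-stamp counter after earlier
// instructions have retired.
static inline std::uint64_t fenced_rdtsc() {
#if defined(__x86_64__)
    std::uint32_t lo, hi;
    asm volatile("lfence\n\t"   // wait for earlier instructions to retire
                 "rdtsc"        // EDX:EAX = time-stamp counter
                 : "=a"(lo), "=d"(hi)
                 :
                 : "memory");   // block compile-time reordering around the read
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
#else
    // Assumption: no TSC on this target; placeholder value only.
    return 0;
#endif
}
```

Taking one reading before and one after the code under test and subtracting gives an elapsed count in TSC ticks, which are reference cycles rather than core clock cycles.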