
In my answer to the question "Assembly code to return smallest integer in array instead randomly returns either last or second to last number", I presented an alternative using a cmovcc instruction. I stated there:

The cmov instruction seems to be supported by all AMD64 CPUs.

However, at the time I had not yet found conclusive sources to support that statement, so I am posting this question to ask for them.

ecm

2 Answers

5

Yes, in practice all x86-64 CPUs support cmovcc and it's widely assumed that it's safe to use without checking its CPUID feature bit. I.e. that the long-mode-supported feature bit implies it.


cmovcc was introduced with Intel P6 (Pentium Pro and so on), which predates x86-64, and is supported on all later Intel CPUs except Quark and KNC, which were non-general-purpose designs based on P5; neither of these is x86-64 (see footnote 1). cmovcc is also supported by AMD's first AMD64 CPUs (K8), and by K7 before that, but not K6. Via's x86-64 CPUs also support CMOV. There are no other x86-64 hardware vendors AFAIK, and software emulators all enable CMOV as part of x86-64.

Various other vendors sold CMOV-supporting 32-bit CPUs, including Cyrix 6x86MX/MII, possibly some update of Transmeta Crusoe's binary-translation layer, and Via C3 Nehemiah.

Sources for 32-bit CMOV support on various CPUs: comments on Agner Fog's Stop the Instruction Set War blog post, reactOS compat list, and discussion on a fedora bug.

Footnote 1: Both Quark and KNC have since been discontinued. Quark was a plain 32-bit microcontroller. KNF/KNC powered first-gen Xeon Phi and is its own thing: it is not fully x86-64 compatible, e.g. it has no CMOV or SSE, and for SIMD it supported only the predecessor of AVX512. I assume it had some way to address more than 4GiB of RAM. The next-gen Xeon Phi, KNL/KNM, is truly x86-64 (derived from Silvermont) with cmov and normal AVX + AVX512F, and has also been discontinued.


Compilers for x86-64 all assume that it's safe to use cmov when making 64-bit code.

This is significant because compilers like gcc don't assume some early additions to x86-64 unless you use special options. e.g. lock cmpxchg16b (missing from early AMD) or lahf in long mode (missing from early Intel P4 that were 64-bit capable). The fact that GCC does assume cmov with the default -march=x86-64 indicates that universal support is assumed.

(GCC is normally configured with 32-bit mode codegen assuming Pentium Pro, though, also using cmov but not SSE1. e.g. return a ? b:c; compiles to cmov with gcc -m32 as old as 4.6 on Godbolt. That's definitely not baseline for 32-bit mode, and would fault on P5 Pentium and earlier. GCC is normally configured to target "i686" in 32-bit mode, but truly baseline x86-64 for 64-bit mode because that's still feature-full enough to not be terrible.)
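The ternary mentioned above can be tried with a minimal sketch like the following. The function name `select_int` is made up for this illustration. Whether a given compiler actually emits cmov for it depends on the compiler version and options, so inspect the generated assembly (e.g. `gcc -O2 -S`, with or without `-m32`) rather than assuming it.

```c
/* Hypothetical helper, named for this illustration only.
 * A compiler may turn this ternary into test + cmov instead of a branch,
 * but that is an optimization choice, not something the C source guarantees. */
int select_int(int a, int b, int c)
{
    return a ? b : c;
}
```

Either way the C semantics are identical; only the generated instructions differ.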


I don't know where you'd find official confirmation that it's baseline, though; Intel's manual (https://www.felixcloutier.com/x86/cmovcc) does say this:

The CMOVcc instructions were introduced in P6 family processors; however, these instructions may not be supported by all IA-32 processors. Software can determine if the CMOVcc instructions are supported by checking the processor’s feature information with the CPUID instruction (see “CPUID—CPU Identification” in this chapter).

(The relevant CPUID feature bit is cpuid[EAX=1].EDX.bit15 (see sandpile; AMD CPUs also report it with EAX=8000_0001h). It also indicates support for other P6 features like fcomi and fcmovcc if the x87 FPU is present, i.e. if bit 0 of that same EDX output is set.)
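For code that does want to check the feature bit (e.g. 32-bit code that might run on pre-P6 hardware), here is a minimal sketch. It assumes a GCC or Clang toolchain, whose `<cpuid.h>` provides `__get_cpuid`; the function names `edx_has_cmov`, `edx_has_fcmov` and `cpu_has_cmov` are made up for this illustration.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header (assumed toolchain) */
#endif

/* CPUID leaf 1, EDX bit 15: CMOV feature flag. */
int edx_has_cmov(uint32_t edx)
{
    return (int)((edx >> 15) & 1u);
}

/* fcmovcc / fcomi need both the CMOV bit (15) and the x87 FPU bit (0). */
int edx_has_fcmov(uint32_t edx)
{
    return edx_has_cmov(edx) && (int)(edx & 1u);
}

/* Returns 1 if CMOV is reported, 0 if not,
 * and -1 if CPUID leaf 1 is unavailable or this is not an x86 build. */
int cpu_has_cmov(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return -1;
    return edx_has_cmov(edx);
#else
    return -1;
#endif
}
```

In 64-bit code this check should be redundant, for the reasons given in this answer.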

I think that IA-32 wording implies that no IA-32e processors (Intel's name for x86-64) lack it, only some IA-32 processors. But it's not a very clear statement and I might be over-interpreting based on the fact that I know it's true in practice.


Another answer on this question points out SSE2. In practice all CPUs that support SSE2 also support cmov, but cmov isn't "part of SSE2". They have separate CPUID feature bits. (And both are baseline for x86-64 so 64-bit code doesn't need to check feature bits.)
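To illustrate that these are independent flags, here is a small sketch that decodes both from a CPUID leaf 1 EDX value: CMOV is bit 15 and SSE2 is bit 26. The mask and function names are made up for this example, and the EDX value is passed in rather than read from hardware.

```c
#include <stdint.h>

/* Bit positions in CPUID leaf 1 EDX; the macro names are local to this sketch. */
#define EDX_MASK_CMOV (UINT32_C(1) << 15)
#define EDX_MASK_SSE2 (UINT32_C(1) << 26)

/* Nonzero if the SSE2 flag is set, regardless of the CMOV flag. */
int edx_has_sse2(uint32_t edx)
{
    return (edx & EDX_MASK_SSE2) != 0;
}

/* Nonzero only if both flags are set, as they are on every x86-64 CPU. */
int edx_has_cmov_and_sse2(uint32_t edx)
{
    const uint32_t both = EDX_MASK_CMOV | EDX_MASK_SSE2;
    return (edx & both) == both;
}
```

A hypothetical EDX value with bit 26 set but bit 15 clear would pass the first check and fail the second, which is the "SSE2 without cmov" CPU that nobody builds.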

Nothing would stop someone from building a CPU with SSE2 but not cmov ... except the fact that nobody would buy it because it couldn't run normal binaries. Many modern compilers use CMOV in 32-bit mode even when they don't use SSE1 by default. (This might seem a bit silly; the number of PPro / PII CPUs still in use is probably not much higher than the number of P5 Pentium and compatible CPUs. But the semi-modern AMD Geode has CMOV without SSE1. https://bugzilla.redhat.com/show_bug.cgi?id=538268#c9)

Peter Cordes
  • A quick web search turned up https://c9x.me/x86/html/file_module_x86_id_45.html which does list a `cmov` feature bit for `cpuid` function 1, `edx` bit 15. – ecm Mar 19 '20 at 16:02
  • @ecm: yup, found it on sandpile.org, too. Updated. I incorrectly assumed that since the vol.2 manual didn't mention the CPUID feature bit that it was suggesting some other way of detecting it, e.g. via vendor = intel and fam=6 or 15. – Peter Cordes Mar 19 '20 at 16:21
  • note there is an assumed arch as part of the build of the gcc binary so saying default without -march doesn't mean much, since it depends on the distro or source of the binary. instead if your default arch is x then it is assumed to be supported. – old_timer Mar 19 '20 at 17:23
  • @old_timer: Right, I'm talking about gcc's `-march=x86-64` which is baseline x86-64 including no optional features, AFAIK. It's the default in most x86-64 GCC configs, including most mainstream Linux distros. Updated, thanks. – Peter Cordes Mar 19 '20 at 17:38
  • Yes I agree it is probably the default for most if not all places one is going to look, but technically the default is configured at compile time, that's what the comment is about...The update covers it, thanks. – old_timer Mar 19 '20 at 17:50
  • Via processors support cmov starting with C3 Nehemiah according to both the datasheets and cpuid dumps. I don't think there are any other processors from other vendors that support cmov AFAIK. – Hadi Brais Mar 19 '20 at 19:12
  • @HadiBrais: https://reactos.org/wiki/Supported_Hardware/CPU mentions Cyrix 6x86MX/MII support for CMOV and MMX. I knew to search for cyrix cmov based on discussion in https://bugzilla.redhat.com/show_bug.cgi?id=538268#c9. Oh also I found https://www.agner.org/optimize/blog/read.php?i=82&v=t which says Transmeta Crusoe supported CMOV at some point, according to a commenter in the "stop the instruction-set war" thread. – Peter Cordes Mar 19 '20 at 19:20
  • Good to know. Also [Intel Quark](https://en.wikipedia.org/wiki/Intel_Quark) doesn't support CMOV according to the manual. Note that saying that "all later CPUs" support CMOV (with respect to Pentium Pro?) is not accurate. – Hadi Brais Mar 19 '20 at 19:25
  • @HadiBrais: Oh hmm, good point; updated. New designs derived from P5 exist, like Quark and KNC. https://software.intel.com/en-us/forums/intel-isa-extensions/topic/277630 says KNC omits CMOV, and even omits AVX. (Unlike KNL where you can use AVX to access the low 256 / 128 of ZMM registers, even though it doesn't have AVX512VL). – Peter Cordes Mar 19 '20 at 19:56
0

Here on Stack Overflow I found an answer to "Generating CMOV instructions using Microsoft compilers" which addresses my question in two places. The first is this:

However, the documentation does confirm that, as one might expect, enabling SSE or SSE2 code generation implicitly enables the use of conditional-move instructions and anything else that was introduced before SSE:

In addition to using the SSE and SSE2 instructions, the compiler also uses other instructions that are present on the processor revisions that support SSE and SSE2. An example is the CMOV instruction that first appeared on the Pentium Pro revision of the Intel processors.

And this:

No special compiler flags or other considerations are necessary here, since all processors that support 64-bit mode support conditional moves.

Regarding the first quote: if SSE2 support implies cmov support, then the 64-bit ISA extension also implies cmov, since it is well known that x86-64 always supports SSE2, as stated in the Wikipedia article:

The AMD64 architecture supports the IA-32 as a compatibility mode and includes the SSE2 in its specification.

ecm