14

Sometimes GCC generates this instruction when compiling with -march=atom. Does each and every Intel Atom CPU support MOVBE?

What other processors support this instruction? I can't seem to find this information on Intel's website. Please help.

hippietrail
Gart
  • [Athlon X4 845](http://www.cpu-world.com/CPUs/Bulldozer/AMD-Athlon%20X4%20845.html) a.k.a Bulldozer has it too. – jww Jan 24 '17 at 06:48
  • @jww: Athlon X4 845 is [Excavator](https://en.wikipedia.org/wiki/Excavator_(microarchitecture)) (final generation Bulldozer-family), not Bulldozer. Steamroller and earlier don't have MOVBE. http://instlatx64.atw.hu/ has CPUID listings and instruction microbenchmarks that show only Carrizo (Excavator) CPUs have it, not earlier in that family. – Peter Cordes Aug 24 '19 at 21:23
  • @Peter - I bought a bulldozer machine that has it. It is sitting in my basement. – jww Aug 24 '19 at 21:28
  • @jww: I highly doubt that, unless it's supported but not reported by CPUID. `0x1E98220B & (1<<22) = 0`. (That's `CPUID.01H:ECX.MOVBE[bit 22]` from the CPUID dump on a Bulldozer, specifically http://users.atw.hu/instlatx64/AuthenticAMD0600F12_K15_Zambezi6C_CPUID.txt). I'm more inclined to trust CPUID dumps from instlat than your memory about your basement. – Peter Cordes Aug 24 '19 at 21:38

3 Answers

11

This instruction was originally unique to the Intel® Atom™ processor.

From Intel's side:

The Intel® Compilers 11.0 allow you to target the Intel® Atom™ processor using the /QxSSE3_ATOM or -xSSE3_ATOM compiler options. These options enable the generation of the movbe instruction which is unique to the Intel® Atom™ processor.

In other microarchitectures (http://instlatx64.atw.hu/ with uop info from https://agner.org/optimize/):

  • Mainstream Intel: Haswell and later. Including Haswell Xeon (Ex-xxxx v3).
    Decodes as 2 or 3 uops, about the same as bswap + load or store.
  • Mainstream AMD: Excavator, and Ryzen-family. Steamroller and earlier don't have it.
    Decodes efficiently to a single uop.

Non-mainstream CPUs:

  • Legacy in-order Intel Atom: all
  • Intel Silvermont-family out-of-order Atom: all. Decodes efficiently to a single uop.
  • AMD Jaguar. Decodes efficiently to a single uop.

  • Intel Xeon Phi: Knights Landing (based on Silvermont) and later. (Maybe not on Knights Corner.)
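
To make the "about the same as bswap + load or store" comparison concrete, here is a minimal C sketch of the operation MOVBE performs. The helper names `load_be32` and `store_be32` are hypothetical, and the code assumes a little-endian host (as on x86) and the GCC/Clang builtin `__builtin_bswap32`. When MOVBE is enabled (for example with `-mmovbe` or a `-march` that implies it), the compiler may fold each helper into a single `movbe`; otherwise expect a plain `mov` plus `bswap`.

```c
#include <stdint.h>
#include <string.h>

/* Load a 32-bit big-endian value from memory.
 * Assumes a little-endian host, so a byte swap is always needed. */
static uint32_t load_be32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);      /* alignment-safe load */
    return __builtin_bswap32(v);  /* load + bswap: the MOVBE load pattern */
}

/* Store a 32-bit value to memory in big-endian byte order.
 * Same little-endian-host assumption as above. */
static void store_be32(void *p, uint32_t v)
{
    v = __builtin_bswap32(v);     /* bswap + store: the MOVBE store pattern */
    memcpy(p, &v, sizeof v);
}
```

Comparing the output of `gcc -O2 -S` with and without `-mmovbe` is a quick way to see whether your compiler version actually emits the instruction for these helpers.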

Peter Cordes
GJ.
5

It appears that all Atom processors support MOVBE; at any rate, the first and least capable (the Atom 230) does. (See e.g. http://www.linuxquestions.org/questions/linux-hardware-18/proc-cpuinfo-output-816192/ for evidence.) I don't believe any non-Atom Intel processors support MOVBE; at any rate, recent Core i7 processors appear not to (see e.g. http://www.techsupportforum.com/forums/f108/i7-running-on-3-of-8-threads-522063.html and search for "movbe" for evidence).

You can detect MOVBE support at runtime using CPUID.
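
As a rough sketch of that runtime check: MOVBE is reported in CPUID leaf 1, ECX bit 22. The code below assumes GCC or Clang, whose `<cpuid.h>` header provides `__get_cpuid`; the function names `ecx_reports_movbe` and `cpu_has_movbe` are made up for illustration. On non-x86 targets the sketch simply reports no support.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header (toolchain assumption) */
#endif

/* CPUID.01H:ECX bit 22 is the MOVBE feature flag. */
#define MOVBE_ECX_BIT 22u

/* Pure bit test; also usable on a raw ECX value copied from a CPUID dump. */
static int ecx_reports_movbe(uint32_t ecx)
{
    return (int)((ecx >> MOVBE_ECX_BIT) & 1u);
}

/* Returns 1 if the running CPU reports MOVBE, 0 otherwise. */
static int cpu_has_movbe(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                 /* leaf 1 not available */
    return ecx_reports_movbe(ecx);
#else
    return 0;                     /* not an x86 CPU */
#endif
}
```

A program would call `cpu_has_movbe()` once at startup and pick a MOVBE or a bswap-based code path accordingly. On Linux, the same flag shows up as `movbe` in `/proc/cpuinfo`.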

Gareth McCaughan
  • Interesting. [Agner Fog's instruction tables](https://agner.org/optimize/) only list it for Silvermont and later, but that `/proc/cpuinfo` output confirms that in-order Atom 230 had it too. https://ark.intel.com/content/www/us/en/ark/products/35635/intel-atom-processor-230-512k-cache-1-60-ghz-533-mhz-fsb.html. Apparently Agner didn't add it to his tests until then. – Peter Cordes Aug 24 '19 at 21:11
2

Based on /proc/cpuinfo, the new Xeon E3 XXXX v3 also supports MOVBE.

Source:

http://openbenchmarking.org/s/Intel%20Xeon%20E3-1230%20v3

Andre de Miranda