I was reading through the Go source code, as one does, and as I was reading the fastrand() function, which for my machine would be in the asm_amd64.s file, I came across this snippet:

    XORL    $0x88888eef, DX
    CMOVLMI BX, DX
    MOVL    DX, m_fastrand(AX)

For the life of me, I cannot figure out what CMOVLMI is supposed to be doing. A search for it reveals that only Go seems to know anything about it; I can find plenty of CMOVxx opcodes defined in the AMD x86-64 reference, and the Wikipedia page has a long history of conditional move instructions, but this one doesn't appear anywhere on that list.

Where is CMOVLMI defined? Is it unique to Go's internal assembler?

Peter Cordes
Elf Sternberg
  • It's probably CMOVS with 32-bit operands. That is `CMOVS EDX, EBX` in standard Intel assembly. I don't know if Go bothers to document their bizarre assembly language anywhere. – Ross Ridge Dec 09 '16 at 18:04
  • What does it disassemble to if you make a binary and use a normal disassembler for either NASM or GAS syntax? (e.g. `objdump -drwC foo.o` to get AT&T syntax, which is closest to what it looks like Go uses). L is a valid condition-code ("less than"), and it's also used as an operand-size suffix in AT&T syntax (and in this Go asm). Confusingly, Go appears to always use AX, not EAX or RAX, even with 32-bit operand size. MI doesn't make any sense to me (since I don't know Go asm), but LM and MI aren't valid condition codes. http://www.felixcloutier.com/x86/CMOVcc.html – Peter Cordes Dec 09 '16 at 18:05
  • @PeterCordes I'm guessing the L suffix means "long" like the other instructions and the MI suffix means "minus". – Ross Ridge Dec 09 '16 at 18:12
  • What's the point of yet another x86 assembly syntax? Any good reason? – Ped7g Dec 09 '16 at 18:17
  • @Ped7g Go invented its own assembly syntax which they use across all platforms. Presumably it was done to make writing the compiler easier, though it comes at the cost of having to port the assembler too. – Ross Ridge Dec 09 '16 at 18:26
  • @RossRidge But there's already one generic assembly syntax, over almost all platforms... C++. Doesn't make any sense to me. Whatever, it's their problem. – Ped7g Dec 09 '16 at 19:10
  • @Ped7g the video [Rob Pike - The Design of the Go Assembler](https://www.youtube.com/watch?v=KINIAgRpkDA) explains some of the motivation behind the assembly syntax – Mark Dec 12 '16 at 20:54
  • @Mark hm, so they dumbed down C for a false feel of being at machine code level. May work well for their needs, but the claim of being universal assembly is just laughable, the difference for example between x86 and MIPS is much more than syntax of `add`. On x86 you have for example rich flag register functionality (none on MIPS). They may look same, if you compare C++ produced output, because C++ never exploited the CPU fully. The only point is that they can produce new architecture pseudo-assembler more quickly. But if you want to use that CPU fully, you still need proper Assembler later. – Ped7g Dec 13 '16 at 00:49

1 Answer

The Go assemblers are derived from the Plan 9 assemblers with few changes. The design concept of the Plan 9 assemblers is that they should have common syntax and naming conventions across all architectures. While this makes assembly code more consistent within the Go toolchain, it can at times make such code very confusing to read for people more familiar with conventional assemblers.

The instruction in question, CMOVLMI BX, DX, demonstrates some of the peculiar design choices of the Go assembler. The mnemonic CMOVLMI has to be read like an ARM mnemonic, where CMOV is the operation, L is the operand size (long word, 32 bits), and MI is the condition under which it is executed (minus, i.e. sign flag set). The operand sizes follow the established DEC conventions, where B, W, L, Q, and O stand for byte, word, long word, quad word, and octa word respectively. The condition codes follow M68k conventions; here is a handy translation table:

Go syntax  Intel syntax  read
---------  ------------  ----
OS         o             Overflow Set
OC         no            Overflow Clear
CS, LO     b, c, nae     Carry Set / LOwer
CC, HS     nb, nc, ae    Carry Clear / Higher or Same
EQ         e, z          EQual
NE         ne, nz        Not Equal
LS         be, na        Lower or Same
HI         nbe, a        Higher
MI         s             MInus
PL         ns            PLus
PS         p, pe         Parity Set
PC         np, po        Parity Clear
LT         l, nge        Less Than
GE         nl, ge        Greater or Equal
LE         le, ng        Less or Equal
GT         nle, g        Greater Than

The mnemonics LO and HS are swapped for targets where carry is the inverse of borrow, like ARM. For jump instructions, the Intel syntax variants are recognised as alternative mnemonics to ease the transition. This is, however, not the case for other instructions.
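
To make the reading rule concrete, here is a minimal sketch in Go that splits a mnemonic of the form CMOV + size letter + condition and looks the condition up in the table above. The names splitCMOV, condToIntel, and sizeBits are hypothetical helpers made up for this illustration; they are not part of the Go toolchain, and only one Intel spelling per condition is kept.

```go
package main

import (
	"fmt"
	"strings"
)

// condToIntel maps Go condition suffixes to one Intel spelling each,
// following the translation table (x86 meaning of LO and HS).
var condToIntel = map[string]string{
	"OS": "o", "OC": "no", "CS": "b", "LO": "b", "CC": "ae", "HS": "ae",
	"EQ": "e", "NE": "ne", "LS": "be", "HI": "a", "MI": "s", "PL": "ns",
	"PS": "p", "PC": "np", "LT": "l", "GE": "ge", "LE": "le", "GT": "g",
}

// sizeBits maps the DEC-style operand-size letters to widths in bits.
var sizeBits = map[byte]int{'B': 8, 'W': 16, 'L': 32, 'Q': 64, 'O': 128}

// splitCMOV (hypothetical helper) splits a mnemonic such as "CMOVLMI"
// into its operand width and the corresponding Intel mnemonic.
func splitCMOV(m string) (int, string, error) {
	// Every condition suffix in the table is exactly two letters long.
	if !strings.HasPrefix(m, "CMOV") || len(m) != len("CMOV")+3 {
		return 0, "", fmt.Errorf("not a CMOV<size><cond> mnemonic: %q", m)
	}
	bits, ok := sizeBits[m[4]]
	if !ok {
		return 0, "", fmt.Errorf("unknown operand size %q", m[4])
	}
	cc, ok := condToIntel[m[5:]]
	if !ok {
		return 0, "", fmt.Errorf("unknown condition %q", m[5:])
	}
	return bits, "cmov" + cc, nil
}

func main() {
	bits, intel, err := splitCMOV("CMOVLMI")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("CMOVLMI -> %s with %d-bit operands\n", intel, bits)
}
```

The size table simply mirrors the five letters listed above; the sketch does not check whether a given width is actually encodable for a conditional move.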

Additionally, the Go assembler does not distinguish general-purpose register sizes by giving the different sizes different names (except for AL, BL, CL, and DL, which are supported for consistency with AH, BH, CH, and DH). The register BX can refer to any of bl, bx, ebx, and rbx, depending on the instruction's operand size.

Lastly, operand ordering follows AT&T conventions, i.e. source, then destination.

The instruction thus corresponds to the Intel instruction

    cmovs edx, ebx
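
The register naming and operand ordering rules can be sketched in Go as well. This is only an illustration under stated assumptions: intelRegs, intelReg, and toIntel are hypothetical names, only the four registers AX, BX, CX, and DX are covered, and the Intel mnemonic and operand width are passed in by hand rather than derived from the Go mnemonic.

```go
package main

import "fmt"

// intelRegs lists, for each Go register name, the Intel names at
// 8, 16, 32, and 64 bits. Only four registers are covered here.
var intelRegs = map[string][4]string{
	"AX": {"al", "ax", "eax", "rax"},
	"BX": {"bl", "bx", "ebx", "rbx"},
	"CX": {"cl", "cx", "ecx", "rcx"},
	"DX": {"dl", "dx", "edx", "rdx"},
}

// intelReg (hypothetical helper) picks the Intel register name that
// matches the operand width implied by the instruction.
func intelReg(goReg string, bits int) (string, error) {
	names, ok := intelRegs[goReg]
	if !ok {
		return "", fmt.Errorf("unknown register %q", goReg)
	}
	switch bits {
	case 8:
		return names[0], nil
	case 16:
		return names[1], nil
	case 32:
		return names[2], nil
	case 64:
		return names[3], nil
	}
	return "", fmt.Errorf("unsupported width %d", bits)
}

// toIntel (hypothetical helper) renders a two-operand instruction given
// in Go order (source, destination) in Intel order (destination, source).
func toIntel(intelOp string, bits int, src, dst string) (string, error) {
	s, err := intelReg(src, bits)
	if err != nil {
		return "", err
	}
	d, err := intelReg(dst, bits)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%s %s, %s", intelOp, d, s), nil
}

func main() {
	// CMOVLMI BX, DX: 32-bit operands, source BX, destination DX.
	out, err := toIntel("cmovs", 32, "BX", "DX")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // prints "cmovs edx, ebx"
}
```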

To compare the different representations, the objdump utility shipped with the Go toolchain supports a -gnu flag. This dumps instructions in GNU syntax in addition to Plan 9 syntax, making it easy to compare the two.

fuz