FYI, to strip out the "noise" in gcc asm output (.cfi directives, unused labels, etc.), see How to remove "noise" from GCC/clang assembly output?. TL;DR: just put your code on http://gcc.godbolt.org/
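If you'd rather filter locally, here is a rough Python sketch of the idea. It is not how godbolt does it. It assumes "noise" means .cfi_* directives plus GCC's .L-prefixed local labels that nothing refers to, and the function name strip_noise is made up for this example.

```python
import re

# Assumption: "noise" = .cfi_* directives plus .L-prefixed local labels
# that no remaining line refers to. Real compiler output has more kinds.
CFI_DIRECTIVE = re.compile(r"^\s*\.cfi_\w+")
LOCAL_LABEL_DEF = re.compile(r"^(\.L\w+):\s*$")
LOCAL_LABEL_USE = re.compile(r"\.L\w+")


def strip_noise(asm_text):
    """Return asm_text without .cfi directives and unreferenced .L labels."""
    lines = [ln for ln in asm_text.splitlines() if not CFI_DIRECTIVE.match(ln)]

    # Collect every local label that is mentioned somewhere other than
    # on its own definition line.
    used = set()
    for ln in lines:
        if not LOCAL_LABEL_DEF.match(ln):
            used.update(LOCAL_LABEL_USE.findall(ln))

    kept = []
    for ln in lines:
        m = LOCAL_LABEL_DEF.match(ln)
        if m and m.group(1) not in used:
            continue  # label definition that nothing jumps to
        kept.append(ln)
    return "\n".join(kept)
```

Usage would be something like gcc -S -o- foo.c piped through a script that calls strip_noise(sys.stdin.read()). Other directives (.file, .section, and so on) are left alone here.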
In the Intel software developer's manual, the mnemonics are represented all uppercase (which means they suggest using it?).
I think they use all-uppercase as an alternative to code-formatting, so they can write stuff like "Performs a bitwise AND operation" without confusion (because "bitwise and operation" is also a valid English phrase). They could have gone with "performs a bitwise and operation" with the "and" set in a code font, but they didn't.
Their use of all-caps for mnemonics and register names is not a suggestion that it's the right way to code. Intel's tutorials / articles often have chunks of asm code (like this one about detecting AVX). In Listing 1 of that, they use lowercase mnemonics and register names for everything except XGETBV, which I think they do to highlight it, because that's the new instruction that they're teaching you about. Their intrinsics guide also uses lower-case mnemonics in the field showing which instruction an intrinsic maps to (compiler optimization can choose different insns...).
Mnemonics and register names are not case-sensitive. Most people prefer lower-case. Good style also includes indenting your operands to a consistent column, and the same for comments, so your code doesn't look all ragged. And always indent your code more than labels, so you can quickly see labels (branch targets). For example, this bit of the source for a golfed (smallest code-size) Adler-32 in x86-64 machine code illustrates what I think is reasonably good style:
; optimized for code-size, not speed
slow_small_addler32:
    xor     eax, eax             ; scratch reg for loading bytes
    cdq                          ; edx: high=0
    lea     edi, [rdx+1]         ; edi: low=1
    ;jrcxz  .end                 ; We don't handle len=0.  unlike rep, loop only checks rcx after decrementing
.byteloop:
    lodsb                        ; upper 24b of eax stays zeroed (no partial-register stall on Intel P6/SnB-family CPUs, thanks to the xor-zeroing)
    add     edi, eax             ; low += zero_extend(buf[i])
    add     edx, edi             ; high += low
    loop    .byteloop            ; only use LOOP for small code-size, not speed
.end:
    ;; exit when ecx = 0, eax = last byte of buf
    ;; lodsb at this point would load the terminating 0 byte, conveniently leaving eax=0
    ... more code
See also this NASM style guide, linked from the x86 tag wiki.
Is there an option with gcc that gives me uppercase mnemonics?
No. You could upcase instruction mnemonics and register names with sed or something if you wanted to. (But you'd have to be careful not to munge symbol names, which are case-sensitive.)
The pattern-matching in the scripts that power http://gcc.godbolt.org/ could maybe be useful for this, but I haven't looked much at the js source.
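As a rough illustration of the upcasing idea (in Python rather than sed), here's a minimal sketch. It assumes AT&T syntax as gcc emits by default, where every register has a % prefix and so can't be mistaken for a symbol name. The function name upcase_asm is made up for this example, and it doesn't try to handle everything (e.g. in "rep stosb" only the prefix gets upcased, and a %name inside a trailing comment would be upcased too).

```python
import re

# Assumption: AT&T syntax (GCC's default), so registers always start with '%'.
REGISTER = re.compile(r"%[a-z][a-z0-9]*")

# Optional "label:" prefix, then the mnemonic as the first word of the
# statement. Directives start with '.', so they never match; a bare
# "label:" line doesn't match either, because of the lookahead.
STATEMENT = re.compile(r"^(\s*(?:[.\w$]+:\s*)?)([a-z][a-z0-9]*)(?=\s|$)(.*)$")


def upcase_asm(asm_text):
    """Upper-case mnemonics and %registers, leaving symbols and directives alone."""
    out = []
    for line in asm_text.splitlines():
        m = STATEMENT.match(line)
        if m:
            prefix, mnemonic, rest = m.groups()
            # Only '%'-prefixed tokens are touched, so symbol names survive.
            rest = REGISTER.sub(lambda r: r.group(0).upper(), rest)
            line = prefix + mnemonic.upper() + rest
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    print(upcase_asm("main:\n\tmovl\tcounter(%rip), %eax\n\tret"))
```

With Intel-syntax output (-masm=intel) this approach wouldn't be enough, since registers have no prefix there and you'd need an explicit list of register names.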