Assembler macros are purely text substitutions. If you don't use a macro, its contents don't have to be valid. And if it is used, it's only assembled at the place where it's used. (It's not like an inline function, it's like a C preprocessor macro).
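A minimal sketch of that behaviour with GAS macros (both macro names and the `demo` label are made up for illustration, they are not from the file): the body of `UNUSED_JUNK` is never assembled because nothing invokes it, while `INC_TWICE` is expanded and assembled only at its single use site.

```asm
.intel_syntax noprefix
.text

/* Hypothetical macro that is never invoked: GAS only records its body text */
.macro UNUSED_JUNK
    this line is not a valid instruction
.endm

/* Hypothetical macro that is invoked once below */
.macro INC_TWICE reg
    inc \reg
    inc \reg
.endm

.globl demo
demo:
    INC_TWICE eax        # expands right here into two inc eax instructions
    ret
```

Adding a line that invokes `UNUSED_JUNK` should make the assembler report an error at that invocation, which is the same pattern as a C preprocessor macro that only breaks once it is used.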
The original file uses `.intel_syntax noprefix` at the top, but then is full of insane code like `mov %ebx, [%ebx + %eax*4]` and `movb %al,[%esi+%edi]` that still decorates register names with `%` despite `noprefix`, and more importantly still uses AT&T-style operand-size suffixes. It's a mutant hybrid of Intel and AT&T syntax; no wonder some assemblers reject it.
See https://stackoverflow.com/tags/intel-syntax/info vs. https://stackoverflow.com/tags/att/info
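For comparison, here is a sketch of the two hybrid instructions quoted above written consistently in each syntax. It assumes the hybrid lines use Intel operand order (destination first), which is what `.intel_syntax` implies.

```asm
.text
    /* Intel syntax: destination first, bare register names, no size suffix */
    .intel_syntax noprefix
    mov  ebx, [ebx + eax*4]
    mov  al, [esi + edi]

    /* The same two instructions in AT&T syntax: source first, % on registers */
    .att_syntax prefix
    movl (%ebx,%eax,4), %ebx
    movb (%esi,%edi), %al
```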
On my Linux desktop, the original file assembles just fine with GNU Binutils `as`, which I invoke via `gcc -m32 -c 6502asm_x86.S`. (I'm on Linux, so this is real GCC: `gcc --version` says `gcc (GCC) 9.1.0 Copyright (C) 2019 Free Software Foundation, Inc.` etc. It uses `as`, and `as --version` says "GNU assembler (GNU Binutils) 2.32".)
I suspect you're on a Mac with Apple Clang. Your "cc (4.8.4)" looks more like a gcc version number, but GCC doesn't contain an assembler. It always uses an external one. And on a Mac, that may still be Clang/LLVM, not GNU Binutils.
On my Linux desktop, clang 8.0.1 rejects this file. It's much stricter about not accepting AT&T-isms in Intel mode, and doesn't support `.intel_syntax prefix` at all, only `.intel_syntax noprefix` or `.att_syntax prefix`. After removing all the `%` characters in the file, `clang -m32 -c 6502asm_x86.S` gives the same error messages you showed:
    6502asm_x86.S:121:5: error: invalid instruction mnemonic 'movw'
        movw di, [ebp+4] # di = r6 = PC
        ^~~~
Fixing this mess:
If possible, use `as` (aka `gas`) from GNU Binutils. But I don't know whether it supports Mach-O object files, so that might not be an option for you. (Update: apparently you're on Linux trying to use an Android toolchain. That's also clang, but it's probably creating ELF objects, so you could probably just run `as` manually.)
To actually fix the source, remove all the operand-size suffixes, too, and let the register operand(s) imply the size.
That file does correctly use GAS `.intel_syntax` operand-size overrides in cases like `mov dword ptr [ebp+20], 0`, where neither operand is a register so it needs the `dword ptr`.
But you can't just remove the last character of every mnemonic: some instructions already omit it. (It looks like that file does so for dword operand-size, but redundantly specifies it for every instruction using byte or word operand-size.)
There are a few instructions that can still use (and sometimes need) a size suffix in Intel syntax, for example `pushw` with an immediate. Some assemblers like NASM use `push word 123`, but GAS `.intel_syntax noprefix` uses `pushw 123`. If there's a register or memory operand, though, that can imply the size: e.g. `push di` is a word push, and `pop word ptr [ecx]` is a word pop. You also have suffixes on "string" instructions like `movsb/w/d`, `lodsb/w/d`, and so on.
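The size rules above can be sketched like this for 32-bit code under GAS `.intel_syntax noprefix`. The label and the particular operands are arbitrary choices for illustration, and the block is meant to be assembled and inspected, not executed (the pushes and pops are not balanced).

```asm
.intel_syntax noprefix
.text
size_examples:                         /* hypothetical label */
    mov   di, [ebp+4]                  # register operand implies word size
    mov   [ebp+9], ch                  # register operand implies byte size
    mov   dword ptr [ebp+20], 0        # no register operand: size must be stated
    push  di                           # word push, implied by the 16-bit register
    pop   word ptr [ecx]               # word pop, stated on the memory operand
    pushw 123                          # an immediate has no size, so the suffix stays
    movsb                              # string instructions keep their size letter
    lodsw
```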
For example, this part of the file:

    do_interrupt:
        PUSHWORD di # push(cpu->pc)
        movzx eax, byte ptr [ebp+10]
        or eax, 0x20 # uint8_t temp = cpu->p | 0x20;
        PUSH_BYTE al # push(temp);
        popw ax
        movw di, [esi+eax] # cpu->pc=*(uint16_t*)&(cpu->mem[0xfffe]);
        or byte ptr [ebp+10], 4 # cpu->p |= FLAG_I;
        movw [ebp+4],di # Remove when C-only
        movb [ebp+9],ch # Remove when C-only
        pop eax
        add eax,7 # c += 7;
        push eax

becomes

    do_interrupt:
        PUSHWORD di # push(cpu->pc)
        movzx eax, byte ptr [ebp+10]
        or eax, 0x20 # uint8_t temp = cpu->p | 0x20;
        PUSH_BYTE al # push(temp);
        pop ax
        mov di, [esi+eax] # cpu->pc=*(uint16_t*)&(cpu->mem[0xfffe]);
        or byte ptr [ebp+10], 4 # cpu->p |= FLAG_I;
        mov [ebp+4],di # Remove when C-only
        mov [ebp+9],ch # Remove when C-only
        # pop eax; add eax,7 ; push eax # optimize into one instruction:
        add dword ptr [esp], 7 # c += 7;
        # or address it relative to EBP if we know where ESP is relative to EBP

Obviously you'll need to look at the macro defs too.
This doesn't look like the most efficient code ever; could do more in registers. But that's beside the point. I only saw one small peephole optimization of pop/add/push into a memory-destination add, didn't try to optimize the rest.
There's other obvious stuff like

    movb %dl, [%ebp+7] # dl = r8 = X
    movb %dh, [%ebp+8] # dh = r9 = Y

which could be a single word load into DX = DH:DL (x86 is little-endian and has very efficient unaligned loads, if this happens to be unaligned).
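A sketch of that merged load, assuming (as the comments in the file indicate) that X lives at `[ebp+7]` and Y at `[ebp+8]`:

```asm
.intel_syntax noprefix
    mov dx, [ebp+7]      # dl = X (byte at ebp+7), dh = Y (byte at ebp+8)
```

The low byte of the word comes from the lower address, so DL and DH end up with the same values as after the two separate byte loads.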
So I wouldn't recommend using this code as an example to learn x86!