Is ARM’s RISC instruction set a subset of x86? If so, why can’t x86 run ARM software natively then?

Question

From my limited understanding of instruction sets, ARM is a RISC architecture, meaning that there are significantly fewer/simpler instructions than x86 based processors. If this is the case, I would expect ARM’s instruction set to be a subset of x86’s instructions, since I’ve also heard “x86 instructions can do all that ARM can do and more.”

If this really is the case, shouldn’t x86 be able to run ARM software natively, since x86 has all the instructions necessary?

Just look at the instruction sets. Pretty obvious they are not related other than each has the same basic functionality of processors that came before and after both x86 and arm. — old_timer, Aug 12 '20 at 21:29
I once attempted a static binary translation between two instruction sets (6502 to ARM). I kind of succeeded, but it was the wrong path: translating to a high-level language that can have the dead code removed was the much better path for performance, and emulation was the best path for ease of development, maintenance, etc., provided your emulator runs fast enough to emulate the older/different system. I mention this just in case you are thinking it is a simple matter of replacing one instruction with another to emulate some other system. — old_timer, Aug 12 '20 at 21:32
The R in RISC does not mean fewer, just less complicated; I would expect there to be the same or more overall instructions in a RISC instruction set than in a CISC one. They are reduced in that, for example, if you want to do an add, you do it with registers or maybe an immediate if it fits in the instruction; memory operands are not allowed. If you want to use a memory operand, you load it into a register with a separate instruction that just loads registers from memory; to save it back out, you use a store instruction that just stores from registers to memory. Reduced complexity. — old_timer, Aug 12 '20 at 21:38

Peter Cordes · Answer 1 · 2021-01-04T23:49:31.460

No, ARM and x86 have completely different machine-code formats, and aren't compatible at an asm source level either. Not at all.

1,2,3 is a smaller set than 11,12,13,14,15,16,17, but it's not a subset, so your reasoning by analogy doesn't hold up. Doing the same kinds of things in different ways means compiling high-level code for ARM is similar to compiling for x86, not that they're compatible.
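To make the analogy concrete, here is a minimal Python sketch using the same two sets of numbers; the variable names are only for illustration.

```python
# The two sets from the analogy: one is smaller, but they share no members.
small = {1, 2, 3}
large = set(range(11, 18))  # {11, 12, ..., 17}

is_smaller = len(small) < len(large)
is_subset = small.issubset(large)

print(is_smaller)  # True
print(is_subset)   # False
```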

x86 uses instructions that vary in length from 1 to 15 bytes. ARM uses fixed-length (4-byte) instructions, or in Thumb mode either 2-byte or 4-byte instructions. And even if you happened to have a 2-byte x86 add eax, ecx vs. a Thumb mode ARM 2-byte adds r0, r1, the encoding would be different.
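As a rough illustration of how different the encodings are, the sketch below builds the machine-code bytes for a register-to-register add in x86, Thumb, and ARM mode from their bit fields. The helper names are made up for this example, and only the one instruction form named in each docstring is handled; it is not a general assembler.

```python
# x86 register numbers as used in the ModRM byte (a few 32-bit registers).
X86_REG = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3}

def x86_add_reg_reg(dst, src):
    """Encode x86 `add dst, src` using opcode 0x01 (ADD r/m32, r32)."""
    # mod=0b11 means register-direct; reg field = source, r/m field = destination.
    modrm = (0b11 << 6) | (X86_REG[src] << 3) | X86_REG[dst]
    return bytes([0x01, modrm])

def thumb_adds(rd, rm):
    """Encode 16-bit Thumb `adds rd, rd, rm` (low registers r0-r7 only)."""
    # Fields: 0001100 | Rm (3 bits) | Rn (3 bits, here equal to rd) | Rd (3 bits).
    halfword = (0b0001100 << 9) | (rm << 6) | (rd << 3) | rd
    return halfword.to_bytes(2, "little")

def arm_add(rd, rn, rm):
    """Encode 32-bit ARM-mode `add rd, rn, rm` (always condition, no shift)."""
    word = (0xE << 28) | (0b0100 << 21) | (rn << 16) | (rd << 12) | rm
    return word.to_bytes(4, "little")

print(x86_add_reg_reg("eax", "ecx").hex())  # 01c8
print(thumb_adds(0, 1).hex())               # 4018
print(arm_add(0, 0, 1).hex())               # 010080e0
```

The x86 and Thumb results are both 2 bytes long, yet they have nothing in common: the fields sit in different positions and the opcode bits mean different things to each decoder.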

Also, think of RISC as more like Reduced Instruction-Set Complexity - Each instruction has to be simple enough to go through the pipeline and execute in one execution unit (e.g. either send it to the ALU, or a load or store unit), but there can be a lot of different possible instructions. Of course an ALU for multiply or divide can't finish in a single cycle so the execution unit might be pipelined.

And even then, ARM is not very RISCy; it deviates from RISC philosophy in ways that increase code density and performance, e.g. push {r4, r5, lr, pc} does 4 pushes, encoding which registers to push in a bitmap. This has to do a variable number of internal operations in the store unit. Similarly, a pop of multiple registers has to write a variable number of registers, as well as doing a variable number of loads. So it doesn't pipeline as easily (and was dropped for AArch64 in favour of load/store pair), but it's still not too bad, especially for the early simple ARM pipelines that the ISA was designed around.
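A small sketch of the register-list bitmap idea, assuming the 32-bit ARM-mode multi-register push encoding (STMDB sp!, {...} with the always condition, base word 0xE92D0000). The function names are hypothetical, chosen for this example.

```python
# Map register names to their numbers; sp, lr and pc are aliases for r13-r15.
REG_NUM = {f"r{i}": i for i in range(16)}
REG_NUM.update({"sp": 13, "lr": 14, "pc": 15})

def reg_list_bitmap(regs):
    """Return a 16-bit mask with bit n set when register n is in the list."""
    bitmap = 0
    for name in regs:
        bitmap |= 1 << REG_NUM[name]
    return bitmap

def encode_arm_push(regs):
    """Encode ARM-mode `push {regs}` in its store-multiple form."""
    return 0xE92D0000 | reg_list_bitmap(regs)

def store_count(regs):
    """Number of individual stores this one instruction implies."""
    return bin(reg_list_bitmap(regs)).count("1")

regs = ["r4", "r5", "lr", "pc"]
print(hex(encode_arm_push(regs)))  # 0xe92dc030
print(store_count(regs))           # 4
```

The instruction stays one fixed-size word regardless of how many registers are listed, but the number of stores it triggers depends on how many bits are set in the low half of that word.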

Also, with NEON SIMD and various other instruction-set extensions, and compact Thumb2 encoding, ARM has a lot of instructions.

Also related: Could a processor be made that supports multiple ISAs? (ex: ARM + x86) for a more detailed exploration of why you couldn't just slap ARM and x86 front-ends in front of a generic back-end and get good performance from each. e.g. different memory-ordering rules, and different quirks of FLAGS handling, so you'd have to handle worst of both worlds in the back-end in every case where it mattered.

add r0,r1,r2 takes a number of internal operations as well; push isn't special. Multiply is even worse than add or push in some implementations, as it takes many states, which means the multiply has been broken up into parts to save chip real estate (nothing special about ARM on that front either; multiply and divide always have this tradeoff when designing a CPU). Normal logic implementation, nothing magic: ARM, MIPS, RISC-V, etc. — old_timer, Aug 12 '20 at 21:22
RISC definitely doesn't mean one hardware step; it does imply a pipeline, which is multiple steps per instruction. Pipeline depths vary per implementation of any CPU, RISC or CISC, as does what you do at each of those steps in the pipe. Another tradeoff. MIPS has its own design decisions with its instruction set choices. — old_timer, Aug 12 '20 at 21:25
@old_timer: Perhaps "one hardware step" wasn't the best way to word what I meant. I *meant* to imply that internally the instruction only has to go through one execution unit, either an ALU or a load or store unit. Of course the instruction has to still go through the pipeline. Edited to clarify. — Peter Cordes, Aug 12 '20 at 21:40

score 2 · Accepted Answer · answered Aug 12 '20 at 20:32

No, it's not. The ARM instruction set may be more limited compared to the x86 instruction set, but this has nothing to do with the architecture of the processors. The ARM instruction set is not a subset of x86 instructions. They are encoded differently and the processor executes them in a different way. The registers are not the same, and even the way the instruction pointer works is not the same.

So: the instructions are different on the two architectures, and just because ARM has fewer instructions does not mean that the instructions an ARM processor supports are a subset of the instructions that Intel supports. Also, there is more to compatibility than the set of instructions, such as how they are encoded. You can neither run ARM software on x86 nor assemble ARM assembly for x86.

score 2 · Answer 3 · answered Aug 12 '20 at 21:18

They are in no way related any more than x86 and mips, mips and risc, pdp11 and x86, etc.

Processors are very dumb and do elementary operations: reads and writes (loads/stores), some ALU functions (add, subtract, etc.), and some logical operations (xor, or, and, etc.).

CISC made sense at the time: memory was relatively much more expensive, so instructions with more steps made sense too, and at that point they didn't have a whole lot of experience with processors like we do now. As things have evolved, RISC makes a lot more sense: higher overall performance because of less overhead (when compared apples to apples, which we are still waiting on), less power, less logic to do the same tasks, etc. Larger binaries are likely, but since not every part of a CISC instruction was really needed, there is a tradeoff. x86 instructions are basically built from 8-bit units while ARM instructions are 32 or 16 bits, so right there ARM binaries will be larger. That is a tradeoff in size, but with the performance benefit of alignment.

It is kind of like saying that because a semi truck has an engine and tires and a seat then a honda civic is a subset of a semi truck because it has similar parts and can drive on the same roads.

The pdp11 could run Unix, and then much later so could the x86; does that mean the x86 is a subset of the pdp11? Nope. Derived from the pdp11? Nope. True, Intel eventually acquired parts of DEC's chip business, but that's about it.
