Microcode terminology: are there names for different "styles" of microcode?

Question

I've been looking at microcode and wondered about terminology.

The "classic" use of microcode is to replace the processor control logic with microcode that generates the processor control signals. But some systems go much further and implement low-level parts of the operating system in microcode, most famously the Xerox Alto, but also systems like the Datapoint 6600 and, to a lesser extent, the IBM 360. In these systems, executing instructions is just one task for the microcode, rather than the point of the microcode. Is there a word for this style of microcode? "Microprogrammed" almost fits, but is used for microcode programming in general.

The second dimension I'm wondering about: in some systems the microarchitecture is pretty much the same as the programmer-level architecture, maybe with a few extra internal registers; the 68000 is an example. But in other systems, the visible architecture is essentially unrecognizable in the microarchitecture. For example, the different IBM 360 models have completely different microarchitectures but identical programmer-level architectures. So my second question: is there a term for systems where the microarchitecture is completely different from the visible architecture?

(I know about vertical vs. horizontal microcode, but this is different. Also, the examples I use are old, but this isn't a retrocomputing question.)

Also relevant: the modern meaning of "microcode", where most instructions decode to a single internal operation and execute directly, but a few instructions are microcoded. Especially modern x86, where e.g. a memory-destination add isn't truly microcoded, but decodes as separate load/add and store uops. But Intel's implementation of `idiv` is like 10 uops from the microcode sequencer. [What is a microcoded instruction?](//stackoverflow.com/q/40366643). And `rep movs` loops internally. And of course instructions like `syscall` are microcoded as many uops. — Peter Cordes, Jul 15 '19 at 02:22
@PeterCordes What about https://en.wikipedia.org/wiki/AES_instruction_set#x86_architecture_processors That would be an example of what he is talking about, right? — Jerry Jeremiah, Jul 15 '19 at 02:53
@JerryJeremiah: `PCLMULQDQ` was microcoded in early CPUs that supported it (e.g. 18 uops on Sandybridge: https://agner.org/optimize/ instruction tables), just to get compatibility out there and still gain some speedup. Once software using it started to become more widespread, and transistor budgets climbed, Intel added more specialized hardware that could run it as 3 uops on Haswell (which is <=4 so not "microcoded" in that sense), probably with a 64-bit wide execution unit that needed to run it in 2 halves. Broadwell decodes it to a single uop, so they widened the execution unit. — Peter Cordes, Jul 15 '19 at 02:59
@JerryJeremiah: The AESENC / AESDEC instructions have never been heavily microcoded; they only perform one round of AES per instruction so it's left up to software to loop over rounds, not to microcode. It would defeat the performance purpose if they were heavily microcoded. The key-setup instructions don't have to be fast and *are* microcoded. e.g. Sandybridge AESENC/DEC is 2 uops, AESKEYGENASSIST is 11. Haswell/Skylake AESENC/DEC is 1 uop while AESKEYGENASSIST is still 10 or 13 uops. Skylake improved the latency to benefit code that doesn't interleave work on multiple streams. — Peter Cordes, Jul 15 '19 at 03:04
@JerryJeremiah: If you mean it as "visible architecture different from internal architecture", again no. That's just the implementation of single instructions. The registers they operate on are still the architectural x86 SIMD registers. (Which are renamed onto a physical register file internally (or onto the ROB in P6-family), but that's done by fixed hardware not microcode to be able to rename 4 registers per clock.) Emulating a single complex instruction with microcode that may use tmp regs is like what the OP is talking about for 68000. That ucode can be replaced by fixed-function HW. — Peter Cordes, Jul 15 '19 at 03:11
@PeterCordes Thank you. I have wondered about that since I saw that they added those instructions. — Jerry Jeremiah, Jul 15 '19 at 03:16

Olsonist · Answer 1 · 2020-04-24T05:21:29.040

Maurice Wilkes' original microcode paper doesn't mention horizontal vs vertical. But according to this taxonomy:

- a horizontal microinstruction controls multiple resources in one cycle
- a vertical microinstruction controls a single resource
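
As a rough illustration of that distinction, here is a minimal Python sketch. The signal names and the toy datapath are hypothetical and not taken from any real processor, and the split into vertical steps ignores the temporary latches a real design would need to carry values between cycles.

```python
from dataclasses import dataclass

# Hypothetical control signals for a toy datapath (illustration only).
SIGNALS = ("reg_a_out", "reg_b_out", "alu_add", "reg_c_in")


@dataclass(frozen=True)
class HorizontalMicroinstruction:
    """Conceptually one bit per control signal; several may be asserted per cycle."""
    asserted: frozenset


@dataclass(frozen=True)
class VerticalMicroinstruction:
    """Names exactly one resource to drive in a cycle."""
    signal: str


def to_vertical(h):
    """Split one horizontal microinstruction into single-resource steps."""
    return [VerticalMicroinstruction(s) for s in SIGNALS if s in h.asserted]


# "C = A + B" as a single horizontal microinstruction...
add_c = HorizontalMicroinstruction(frozenset(SIGNALS))
# ...and as a sequence of vertical microinstructions, one resource each.
steps = to_vertical(add_c)
print(len(steps), [s.signal for s in steps])
```

The horizontal form does the work in one wide microinstruction; the vertical form needs one narrow microinstruction per resource.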

There are other microcode properties, such as whether the microstore is writable; these don't change the microinstruction encoding.

Horizontal vs vertical microcode is a spectrum rather than a dichotomy. A strictly horizontal microinstruction would consist solely of control bits and fields. Such a pure horizontal microinstruction for any real architecture would be very wide since there are a lot of functions to control in a complex processor. Moreover, these control bits would be quite sparse. The resulting microstore would be large and expensive and not necessarily fast.

Instead, modern microarchitectures like the P6 have opcodes. An opcode decoder is a combinational circuit which takes opcode bits and emits control values. This costs some gate delay but provides significant width compression, allowing a much smaller microstore. A vertical microarchitecture simply takes this to an extreme: each opcode controls a single resource.
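
To make the width compression concrete, here is a small Python sketch of an encoded field and its decoder. The list of ALU operations is a made-up example, not the P6's actual micro-op encoding; the point is only that N mutually exclusive control lines can be stored as roughly log2(N) bits and expanded again by a decoder.

```python
# Hypothetical, mutually exclusive ALU operations (illustration only).
ALU_OPS = ["nop", "add", "sub", "and", "or", "xor", "shl", "shr"]


def horizontal_width(n_signals):
    """Fully horizontal encoding: one microstore bit per control line."""
    return n_signals


def encoded_width(n_choices):
    """Encoded field: enough bits to number every mutually exclusive choice."""
    return max(1, (n_choices - 1).bit_length())


def decode(opcode):
    """Model of a combinational decoder: opcode field -> one-hot control lines."""
    return [1 if i == opcode else 0 for i in range(len(ALU_OPS))]


print(horizontal_width(len(ALU_OPS)), "bits unencoded")
print(encoded_width(len(ALU_OPS)), "bits encoded")
print(decode(ALU_OPS.index("add")))
```

Under these assumptions the eight control lines shrink to a three-bit field, at the cost of the decoder's gate delay on every microinstruction.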

Writing complex instructions and low-level OS components in microcode was actually efficient in the 60s, and that led to CISC ISAs. However, when VLSI, caches and superscalar designs came along, this design decision was revisited, which gave rise to RISC ISAs. But again, this historical progression of ISAs doesn't change the taxonomy of microcode.
