
I have heard the term "single cycle CPU" and was trying to understand what it actually means. Is there a clear, agreed definition and a consensus on what it means?

Some homebrew "single cycle CPUs" I've come across seem to use both the rising and the falling edges of the clock to complete a single instruction. Typically, the rising edge handles fetch/decode and the falling edge handles execute.

However, in my reading I came across a reasonable point made here:
https://zipcpu.com/blog/2017/08/21/rules-for-newbies.html

"Do not transition on any negative (falling) edges. Falling edge clocks should be considered a violation of the one clock principle, as they act like separate clocks."

This rings true to me.

Needing both the rising and falling edges (or the high and low phases) is effectively the same as needing the rising edges of two cycles of a single clock running twice as fast; and that would make it a "two cycle" CPU, wouldn't it?

So is it honest to state that a design is a "single cycle CPU" when both the rising and falling edges are actively used for state change?

It would seem that a true single cycle CPU must perform all state-changing operations on a single clock edge of a single clock cycle.

I can imagine such a thing is possible provided the data storage is all synchronous. If we have a synchronous system that has settled, then on the next clock edge we can clock the results into a synchronous data store and simultaneously clock the program counter on to the next address.
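To make that concrete, here is a toy Python sketch (not HDL; the register names and the one-instruction "ISA" are invented purely for illustration) of the settle-then-commit idea: between edges the next state is a pure function of the current state, and on the edge every register, including the PC, commits simultaneously.

```python
# Toy model (not HDL) of a fully synchronous, single-edge design.
# Between edges, combinational logic "settles" as a pure function of the
# current state; on the edge, ALL registers update at once from those
# settled values, like edge-triggered flip-flops sharing one clock.

def rising_edge(state, program):
    """One clock edge: compute next state from the OLD state, commit all at once."""
    pc = state["pc"]
    op, dst, a, b = program[pc]                       # fetch + decode (combinational)
    result = state["regs"][a] + state["regs"][b] if op == "add" else 0
    # Commit: the result register and the PC advance on the same edge.
    return {
        "pc": pc + 1,
        "regs": {**state["regs"], dst: result},
    }

state = {"pc": 0, "regs": {"r0": 1, "r1": 2, "r2": 0}}
program = [("add", "r2", "r0", "r1")]
state = rising_edge(state, program)
print(state["pc"], state["regs"]["r2"])  # 1 3
```

The key point the model makes is that nothing reads a half-updated state: the whole next state is built from the old state, so one edge suffices.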

But if the target data store is, for example, async RAM, then surely the control lines would be changing while the data is being stored, leading to unintended behaviour.

Am I wrong? Are there any examples of a "single cycle CPU" that include async storage in the mix?

It would seem that using async RAM in one's design means one must use at least two logical clock cycles to achieve the state change.

Of course, with some more complexity one could perhaps have a CPU that uses a single edge where instructions use solely synchronous components, but relies on an extra cycle when storing to async components; but then that still wouldn't be a single cycle CPU, rather a mostly-single-cycle CPU.

So no CPU that writes to async RAM (or another async component) can honestly be considered a single cycle CPU, because the entire instruction cannot be carried out on a single clock edge. The RAM write needs two edges (i.e. falling and rising), and this breaks the single clock principle.

So is there a commonly accepted definition of a single cycle CPU, and are we applying the term consistently?

What's the story?

(Also posted in my Hackaday log https://hackaday.io/project/166922-spam-1-8-bit-cpu/log/181036-single-cycle-cpu-confusion and on a private Hackaday group)

=====

Update: Looking at simple MIPS implementations, it seems the models use synchronous memory and so can probably operate off a single edge (and maybe they do), therefore warranting the category "single cycle". And perhaps FPGA memory is always synchronous; I don't know about that.

But is the term used inconsistently elsewhere, i.e. by most homebrew TTL computers out there?

Or am I just plain wrong?

====

Update :

Some may have misunderstood my point.

Numerous homebrew TTL CPUs claim "single cycle CPU" status (for the purposes of this discussion I'm not interested in more complex beasts that do pipelining or the like).

By "single cycle" these CPUs typically mean that they do something like advancing the PC on one edge of the clock and then using the opposing edge to update flip-flops with the result. Or they will use the other phase of the clock to update async components like latches and SRAM.

However, the ZipCPU reference I provided suggests that using the opposing clock edge is akin to using a second clock cycle, or even a second clock. BTW, Ben Eater in his videos even compares the inverted clock that he uses to update his SRAM to a second clock.

My objection to the use of "single cycle CPU" for such CPUs (basically most/all homebrew TTL CPUs I've seen, as they all work that way) is that I agree with ZipCPU: using the opposing edge (or phase) of the clock for the commit is effectively the same as using a second clock, and this makes a mockery of the "single cycle" claim.

If the use of the opposing edge is effectively the same as using a single edge of two clock cycles, then I think that makes use of the term questionable. So I take ZipCPU's point to heart and tighten the term to mean use of a single edge.

On the other hand, it seems perfectly possible to build a CPU that uses only synchronous components (i.e. edge-triggered flip-flops) and only a single edge, where on each edge we clock whatever is on the bus into whatever device is selected for write and at the same moment advance the PC. Between one edge and the next same-direction edge, settling occurs.

In this manner we end up with CPI=1 and use of only a single edge, which is distinctly different from the common TTL CPU pattern of using both edges of the clock.
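The ZipCPU-style equivalence I'm leaning on can be sketched numerically. This toy Python model (idealized clocks, integer picosecond timestamps; the helper is invented for illustration) just shows that the commit instants of a "both edges" design at frequency f coincide exactly with the rising edges of a clock at 2f:

```python
# Toy model: acting on BOTH edges of one clock is equivalent to acting
# on only the rising edges of a clock running twice as fast.

def edges(freq_hz, n_edges):
    """First n_edges transitions of an ideal clock, as (time_ps, direction)."""
    half_period_ps = round(1e12 / (2 * freq_hz))  # integer ps avoids float drift
    return [(k * half_period_ps, "rising" if k % 2 == 0 else "falling")
            for k in range(n_edges)]

# "Single cycle" TTL-style CPU: commits on every edge of a 1 MHz clock.
both_edge_commits = [t for t, d in edges(1_000_000, 6)]

# Strict single-edge CPU: commits only on rising edges of a 2 MHz clock.
single_edge_commits = [t for t, d in edges(2_000_000, 12) if d == "rising"]

print(both_edge_commits == single_edge_commits)  # True: the commit instants coincide
```

Which is exactly why I'd say the both-edges design is a two-cycle CPU wearing a half-speed clock.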

BTW, my impression of FPGAs (which I'm not referring to here) is that the storage elements are all synchronous flip-flops. I don't know for sure, but that's what my reading suggests. If that's true, then a trivial FPGA-based CPU probably has CPI=1 and uses only, say, the positive edge, so it might well meet my narrow definition of "single cycle CPU". My reading also suggests that various MIPS implementations (educational efforts, probably) meet my definition.

johnlon
  • The logic *starts* on a clock edge, but ripples forward through gate-delays over time at a speed that depends on electrical details of each gate, and wire delays. And finishes before the next clock cycle starts, even in the worst case (the critical path). Your phrasing of "*must perform all state changing operations on a single clock edge of a single clock cycle.*" doesn't account for gate-delay, but your very next paragraph does talk about it ("settling") so clearly I'm just nit-picking. Perhaps "start" instead of "perform"? I'm not an expert on logic terminology so maybe that's fine. – Peter Cordes Jan 12 '21 at 03:20
  • Anyway, the only way I could see to get any kind of guaranteed stable interval without a clock edge is to control gate delays, like intentionally use a known long chain of gate delays to trigger the start of a memory-write, with that chain being longer than the critical path of the data inputs. But that seems super-flaky because gate-delay isn't constant, and one side of the chip running hotter than the other could change relative timing. I don't know if this is an answer, but yeah your concern seems valid. – Peter Cordes Jan 12 '21 at 03:25
  • NB I said a "single clock edge", not "without a clock edge". Using gate delays to recreate a write pulse for an async component like SRAM is no better than having a separate clock to do that, or using more than one clock cycle to achieve the write. These are examples of why I don't accept the use of the term "single cycle CPU" for the homebrew CPUs out there; it seems inevitable that if one uses async components (as opposed to flip-flops and sync memory) then there is no way to do useful work within a single clock cycle using only a single edge. – johnlon Jan 12 '21 at 17:29
  • Right, of course you'd just use another clock edge in a real-world design, unless you were intentionally trying to maintain the philosophical purity of being a "single cycle" CPU design but still drive a signal onto another clocked bus. I'm not super familiar with what makes DRAM "async", though; couldn't you start the RAM-write (by asserting some pin or sending a memory-clock edge) whenever the data was ready, and end the write on the next proper CPU clock? As long as you keep the gate delays leading to RAM low enough, you can make sure the memory "clock" interval is long enough. – Peter Cordes Jan 12 '21 at 17:40
  • I am building an 8-bit TTL CPU at the moment and will, for the sake of argument, probably follow it up with a trivial CPU that does CPI=1 on a single edge. This will mean no use of SRAM (or latches), of course severely limiting its capability, but it would be a vehicle to make the point about the loose use of "single cycle" where these are really multi-cycle in sheep's clothing. - https://hackaday.io/project/166922-spam-1-8-bit-cpu - https://github.com/Johnlon/spam-1/ – johnlon Jan 12 '21 at 17:55
  • Ok, but the primary point of your question seems to be terminology, not actually building stuff. Pipelined designs sometimes use the other clock edge to get something done in a half cycle in one of the pipeline stages, but we still say its a 100MHz CPU, not 200, for example. e.g. MIPS I (R2000) handled branches this way: [How does MIPS I handle branching on the previous ALU instruction without stalling?](https://stackoverflow.com/a/58601958). – Peter Cordes Jan 12 '21 at 18:02
  • Only if you always fully use both clock edges (like P4 / netburst's double-pumped ALUs) does that part of the CPU get described at twice the frequency. That would require defining a "single cycle" CPU as using two half-cycles. If you're willing to allow that, then all is fine. Otherwise yeah, using more than a single clock edge would mean that "single cycle" comes with an asterisk. – Peter Cordes Jan 12 '21 at 18:04
    A single cycle CPU can fully perform any instruction from fetch to commit in a single clock cycle. It's fine if the total work is divided in two half cycles; the CPU is still single-cycle. So the total work does not have to be carried out on a single edge because a cycle contains two edges, not just one, and it's OK to partition the work on the two edges. Now if you double the frequency for the same single cycle CPU, then the definition of what a cycle is changes and it'd be no longer a single cycle CPU. – Hadi Brais Jan 13 '21 at 03:55
  • Its just semantics: without further qualification, the term "single cycle" is ambiguous as to whether one edge or both edges (transitions of the clock) are used. – Erik Eidt Jan 13 '21 at 18:11

2 Answers


This seems mostly a question of definitions and terminology, more so than of how to actually build simple CPUs.

If you insist on that strict definition of "single cycle CPU", meaning to truly use only one clock edge to set everything in motion for that instruction, then yes, that would exclude real-world toy/hobby CPUs that use a 2nd clock edge to give a consistent interval for memory access.

But they certainly still fulfil the spirit of a single-cycle CPU, which is that every instruction runs in 1 clock cycle, with no pipelining and no multi-cycle microcode.

A whole clock cycle does have 2 clock edges, and it's normal for real-world (non-single-cycle) CPUs to use the "other" edge for internal timing in some cases, but we still talk about their frequency in whole cycles, not the edge frequency. The exception is cases like DDR memory, where we do talk about the transfer rate being twice the memory clock frequency; what sets DDR apart is that it always uses both edges, and for approximately equal work, not just some extra timing/synchronization within a clock cycle.


Now could you build a CPU that keeps a store value on a memory bus for some minimum time, without using a clock edge? Maybe.

Perhaps make sure the critical path leading to store-data is short enough that the data is always ready. And possibly propagate some "data-ready" signal along with your computations (or just from the longest critical path of any instruction), and a couple of gate delays after the data is on the bus, flip the memory clock. (And on the next CPU clock edge, flip it back.) If your memory doesn't mind its clock not having a uniform duty cycle, this might be fine as long as each half of the memory clock is long enough.

For loading from memory, you can maybe do something similar by initiating a memory load cycle some gate-delays after the CPU clock edge that starts this "cycle" of your single-cycle CPU. This might even involve building a long chain of gate delays intentionally with inverters dedicated to that purpose. Or perhaps even an analog RC time delay, but either way that's obviously worse than just using the other edge of the main clock, and you'd only ever do this as an exercise in single-cycle dogmatic purity. (It can also be flaky because gate-delay isn't constant, depending on voltage and temperature, so one side of the chip running hotter than the other could change relative timing.)
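To put rough numbers on that flakiness (all figures here are hypothetical, chosen only to illustrate the arithmetic): the time the store data is stable before the delayed "write" fires is just the delay-chain time minus the data critical path, and the two can derate differently with temperature and voltage.

```python
# Illustrative numbers only: why a gate-delay-chain write pulse is flaky.
# Margin = chain delay - data critical path; both scale with temperature
# and voltage, and not necessarily by the same factor.

T_PER_GATE_NOMINAL_NS = 10.0   # hypothetical nominal delay per gate

def write_margin(chain_gates, critical_path_gates, chain_derate, path_derate,
                 t_gate=T_PER_GATE_NOMINAL_NS):
    """Time (ns) the store data is stable before the delayed write fires."""
    chain = chain_gates * t_gate * chain_derate
    path = critical_path_gates * t_gate * path_derate
    return chain - path

# Nominal corner: a 12-gate chain comfortably covers a 10-gate critical path.
print(write_margin(12, 10, 1.0, 1.0))    # 20.0 ns of margin

# Chain side of the chip runs fast, data-path side runs hot and slow:
print(write_margin(12, 10, 0.85, 1.05))  # negative: write fires before data is ready
```

A few percent of relative drift is enough to flip the sign of the margin, which is the flakiness being described above.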

Peter Cordes
  • IDK if this is a useful answer, but I didn't want your bounty to end without you getting something out of it. So I revised my comments into this answer. Again, IDK if that's the kind of answer you wanted. But TL:DR: don't worry about this. – Peter Cordes Jan 18 '21 at 21:14

The definition says that a single cycle CPU executes just one instruction per cycle. From this you can conclude that there are other kinds of CPUs that take more or fewer cycles per instruction; see concepts like the multi-cycle processor and the pipelined processor (instruction pipelining). "Pipelining attempts to keep every part of the processor busy with some instruction by dividing incoming instructions into a series of sequential steps," according to Wikipedia. I don't know exactly how it works; maybe it just uses the available registers differently (maybe ECX is used in place of EAX, or maybe it works some other way), but one thing is certainly true: the number of registers keeps increasing, so maybe that is one of the main purposes. Source: https://en.wikipedia.org/wiki/Instruction_pipelining

I think the answer to the question "is a single cycle CPU possible if asynchronous components are used" depends on the controller that controls both the CPU and the RAM with opcodes. I found interesting information about this here: http://people.cs.pitt.edu/~cho/cs1541/current/handouts/lect-pipe_4up.pdf https://ibb.co/tKy6sR2

CONCLUSION: In my opinion, if we consider the term "single cycle CPU", it should denote the simplest possible construction. The term "asynchronous" implies something more complex than "synchronous", so the two terms are not equivalent. It's a bit like asking "can a basic data type be considered a structure?". The word "single" suggests the simplest possible design and "asynchronous" implies a modification, hence more complexity, so I think it's not possible. But maybe "are used" can be qualified as "are used at the time": if some switch or controller can turn off the asynchronous mode, it could make the design the simplest possible. Generally, though, I just think it's not possible.

SmilingMouse
  • The OP is using "asynchronous" to talk about RAM that has a clocked interface, as opposed to just putting an address in and data coming out after some gate delays (like SRAM). IDK if asynchronous is even the right term for it, but your analogy of "Can a basic data type be considered as a structure?" does not seem useful at all. It's not about making the whole CPU unclocked. – Peter Cordes Jan 18 '21 at 21:17
  • Also, the entire first half about the existence of pipelined and microcoded CPUs is not relevant. It's like answering a question about designing motorcycles by pointing out that cars and trains exist. But if you're curious, see [Modern Microprocessors A 90-Minute Guide!](http://www.lighterra.com/papers/modernmicroprocessors/), and https://www.realworldtech.com/sandy-bridge/ for a deep-dive into a complex modern pipeline (superscalar / out-of-order, with a large physical register file to rename the architectural registers onto) – Peter Cordes Jan 18 '21 at 21:19
  • Wow, cool - I only have a basic understanding of advanced structures; I added this answer because I wanted to start a discussion about CPU architecture - it's very interesting to me. I'm also looking for a relevant topic about mapping hardware with unknown behaviour - like mapping the behaviour of microcontrollers with the name erased. Generally, one big question: how do we know the documentation given by Intel is 100% true about all CPU behaviours? Maybe the meaning of opcodes is different from what's described - I just need someone to talk about it, e.g. irrational opcodes like mov edi, edi – SmilingMouse Jan 18 '21 at 21:33
  • `mov edi,edi` is a perfectly normal way to zero-extend EDI into RDI. But for actual searching for undocumented opcodes, see [Breaking the x86 instruction set (conference presentation vid)](https://www.youtube.com/watch?v=KrksBdWcZgQ) / https://news.ycombinator.com/item?id=15209632. Basically, put bytes at the end of an unmapped page and see whether you get a `#UD` illegal instruction fault or a #PF page fault from trying to fetch into the next page after it executes. Or if those bytes are only part of a longer instruction. – Peter Cordes Jan 18 '21 at 23:21