
Essentially, I want to execute a "NOP" in assembly. I know tricks like (void)0 exist, but from what I understand, these compile to literally nothing. What I want is something that will waste exactly one clock cycle when compiled. Is there a standard way to do such a thing?

Tim Morris
  • No, there is nothing for this. There is nothing in the C standard, of course. But even with extensions and assembly language, there is no good way to do this, even within a single architecture. On x86, one processor might “execute” one no-op per cycle, another might execute four, and another might remove a no-op from its instruction stream before it reaches the execute phase. A decrement-and-branch might be more resistant: since it depends on its own counter result, most processors will be constrained from executing it more than once per cycle, but even that could fail. – Eric Postpischil Aug 19 '19 at 00:41
  • I'm curious, what's the purpose of doing this (spending one clock cycle)? Maybe some kind of sensitive cryptography where you're trying to prevent timing attacks? But even in that case, it would probably make more sense to look for an instruction that can occupy a specified number of cycles, or really, one that would just be equivalent to another instruction. – David Z Aug 19 '19 at 00:45
  • @DavidZ I'm bit banging and I want to have a unit to precisely control delays between data out and CLK up as much as I can. – Tim Morris Aug 19 '19 at 00:47
  • @TimMorris I think the most stable instruction to use as a reference is the NOP. You could measure how many NOPs are needed to get the right delay between data out and CLK. – andresantacruz Aug 19 '19 at 00:51
  • @dedecos: No-op instructions are not generally guaranteed to take a certain number of CPU cycles. See my note above. Some Intel processors scan their incoming instruction streams and remove some no-ops, but only some, and only in certain conditions. How they dispatch and execute any others depends on the other instructions in flight. The same sequence of instructions may take different numbers of CPU cycles depending on what happened just before it started. You might get 4 out of 6 no-ops executing one time and 6 out of 6 another time. – Eric Postpischil Aug 19 '19 at 00:55
  • Hmm very interesting @EricPostpischil I did not know that. – andresantacruz Aug 19 '19 at 00:56
  • Why don't you directly write the assembly NOP in your C program? Take a look [here](https://stackoverflow.com/questions/61341/is-there-a-way-to-insert-assembly-code-into-c) on how to embed assembly code directly in your C program. – Edward Aung Aug 19 '19 at 00:41
  • A no-op instruction is generally not guaranteed to consume one CPU cycle. – Eric Postpischil Aug 19 '19 at 00:42
  • Nifty. Even with Eric's comment, I am sure I can make use of this. – Tim Morris Aug 19 '19 at 01:03
  • @TimMorris: Modern CPUs run at variable speed. The actual speed depends on things like temperature and thermal throttling, what other logical processors in the same core are doing, what other cores in the chip are doing (e.g. "turbo boost"), and what the OS decides to do (e.g. downclocking the CPU when only low-priority threads want CPU time). In addition, `NOP` is typically discarded at the front end/decode stage, so it is never executed and costs no cycles. All of this means that if you rely on `NOP` for time delays, your code is guaranteed to be broken and wrong. You must use an actual timer. – Brendan Aug 19 '19 at 02:11
  • @TimMorris: Also, at any time hardware can send a "system management interrupt" (causing the CPU to stop executing your code and switch to "system management mode" for however long the firmware feels like). It is impossible for an OS to prevent this, predict when it will happen, or know how long it will take. In practice this means that time delays of less than a few microseconds are impossible for any software to guarantee. – Brendan Aug 19 '19 at 02:17
  • @TimMorris: In other words, the standard way to do what you want is to realize it's impossible. The closest sane alternative (for modern CPUs, not old ones) is a delay loop that polls `rdtsc` until the CPU's time stamp counter reaches an expiry time (but even that is error-prone and needs considerable caution). – Brendan Aug 19 '19 at 02:20
  • @EricPostpischil: The purpose was stated in one of the OP's comments above ("I'm bit banging and I want to have a unit to precisely control delays between data out and CLK up as much as I can."). – Brendan Aug 24 '19 at 12:56
  • @Brendan: Ah, right you are, sorry. – Eric Postpischil Aug 24 '19 at 13:12

1 Answer


Depending on your compiler, you can write a naked function that executes a NOP from an inline asm block.

In VC++, that would be something like this:

__declspec(naked) void foo()
{
    __asm nop
    __asm ret   // no epilogue is generated, so return explicitly
}

As a side note, you can also use the _emit pseudoinstruction. On IA-32, that would look something like this:

__asm
{
    _emit 0x90; // 0x90 is the one-byte encoding of NOP on IA-32
}

An alternative in GCC and Clang would be:

asm volatile("nop");
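To use the same call in code built by either compiler family, the statement above can be wrapped in a macro. The following is a minimal sketch, not a standard facility: `CPU_NOP` and `nop_repeat` are hypothetical names chosen for illustration. It assumes the `__nop()` intrinsic from `<intrin.h>` on MSVC and the `__asm__` spelling on GCC/Clang, which also works in strict ISO modes such as `-std=c99` where the plain `asm` keyword is unavailable.

```c
/* Hypothetical helper: emit one NOP instruction at the call site. */
#if defined(_MSC_VER)
#include <intrin.h>
#define CPU_NOP() __nop()                        /* MSVC intrinsic */
#else
#define CPU_NOP() __asm__ __volatile__("nop")    /* GCC/Clang inline asm */
#endif

/* Hypothetical helper: emit `count` NOPs from a loop.
   The loop's own decrement and branch also take time, so the total
   delay is not simply `count` cycles. */
static inline void nop_repeat(unsigned count)
{
    while (count--) {
        CPU_NOP();
    }
}
```

The `volatile` qualifier only stops the compiler from deleting the instruction. It says nothing about how many cycles the CPU spends on it.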

Remember that naked functions have no prologue or epilogue, so the function body must supply its own return instruction.

As Eric Postpischil rightly notes in the comments, there is no guarantee that a NOP instruction takes exactly one cycle to execute.
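For delays that need to track elapsed time rather than instruction count, Brendan's suggestion in the comments of polling the time stamp counter can be sketched roughly as follows. This is an illustration under assumptions, not a guaranteed-accurate delay: `read_ticks` and `delay_ticks` are hypothetical names, the x86 path assumes GCC or Clang with `<x86intrin.h>`, and the non-x86 fallback to `clock_gettime` is an addition so the sketch builds elsewhere. TSC ticks are not core clock cycles, so converting them to time needs a separate calibration step that is not shown.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>   /* __rdtsc() on GCC/Clang */

/* Read the x86 time stamp counter. */
static inline uint64_t read_ticks(void)
{
    return __rdtsc();
}
#else
#include <time.h>

/* Assumed fallback for non-x86 targets: nanoseconds from a monotonic clock. */
static inline uint64_t read_ticks(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
#endif

/* Busy-wait until at least `ticks` counter ticks have elapsed.
   Interrupts, preemption, or an SMI can make the wait longer, never shorter. */
static inline void delay_ticks(uint64_t ticks)
{
    const uint64_t start = read_ticks();
    while (read_ticks() - start < ticks) {
        /* spin; unsigned subtraction stays correct across counter wraparound */
    }
}
```

A loop like this gives a lower bound on the delay only, which is usually what bit banging needs: the setup time between data out and CLK must be at least some minimum.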

S.S. Anne
andresantacruz