Is there a portable way of generating a nop opcode in C/C++?

Question

In assembly, there is the nop opcode. Let's say I wanted to have that opcode in the generated assembly using C/C++ syntax alone. Is there any portable way of accomplishing that?

After reading this, I tried compiling `(void)0;` with the latest MSVC (15.5.2), but it produced no nop instruction, with or without optimizations enabled.

Since inline assembly is not part of the ISO C standard and its support in C++ is conditional, I would like to avoid that.

In Microsoft Visual Studio, I know I can use the __nop() function defined in the header <intrin.h> as documented here to successfully create a single nop in the generated assembly.

`nop` opcodes are platform-specific, and aren't even guaranteed to exist on every processor with a C++ compiler, so how could there be a portable way to generate it? — Ben Voigt, Jan 14 '18 at 02:05
The only 'portable' way would be to have a lot of `#if`s with various ways of doing it. — HolyBlackCat, Jan 14 '18 at 02:06
@HolyBlackCat Can you list the `#if`s for the more common platforms? — Nighteen, Jan 14 '18 at 02:07
there's no "nop" at the high level and typically you don't need to care about that. Compilers will automatically introduce NOPs for alignment or pipeline execution if necessary — phuclv, Jan 14 '18 at 02:36
It's not supported by the standard, but some OS kernels declare a delay loop with a volatile index variable as a way of telling the compiler not to optimize the loop away. — Davislor, Jan 14 '18 at 03:23
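
Picking up HolyBlackCat's suggestion, a minimal sketch of such an `#if` chain might look like the following. The names `EMIT_NOP` and `pass_through_with_nop` are hypothetical. The MSVC branch uses the `__nop()` intrinsic from `<intrin.h>` mentioned in the question; the GCC/Clang branch relies on GNU-style inline assembly and assumes the target assembler accepts the mnemonic `nop`, which holds on x86 and ARM among others but is not guaranteed everywhere. None of this is standard C or C++.

```cpp
// Hypothetical helper macro: EMIT_NOP is not a standard or library name.
#if defined(_MSC_VER)
    #include <intrin.h>
    // MSVC intrinsic mentioned in the question.
    #define EMIT_NOP() __nop()
#elif defined(__GNUC__) || defined(__clang__)
    // GNU-style inline assembly (a compiler extension, not ISO C++).
    // "volatile" asks the compiler not to delete the statement.
    // Assumption: the target assembler understands the mnemonic "nop".
    #define EMIT_NOP() __asm__ __volatile__("nop")
#else
    #error "No known way to emit a nop with this compiler"
#endif

// Example use: returns its argument unchanged, with a nop requested
// in between on the compilers handled above.
int pass_through_with_nop(int value)
{
    EMIT_NOP();
    return value;
}
```

Every additional compiler or architecture needs its own branch, which is why this is a per-toolchain workaround rather than a portable solution.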

score 6 · Accepted Answer · answered Jan 14 '18 at 02:08


No, there is no portable way.

While compilers generally provide quite a few intrinsics for more or less exotic situations, and often also some way to write assembler-code inline, neither of those is standardised.

And any non-protected source-level-construct which clearly corresponds to "no-operation" won't stand a chance against the most basic optimization.
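
As a rough illustration of the difference between an unprotected and a "protected" construct, consider the sketch below. The function name `spin` is hypothetical. It relies on the fact that accesses to a `volatile` object count as observable behavior, so the compiler has to keep them; note that what survives is a loop of loads and stores, not a nop instruction.

```cpp
// Hypothetical helper: busy-waits for `iterations` loop trips and
// returns the number of trips performed.
unsigned spin(unsigned iterations)
{
    (void)0;  // no observable behavior: typically compiles to nothing

    // Each read and write of a volatile object is observable behavior,
    // so the optimizer may not remove the loop. The generated code
    // contains loads and stores, though, not nop instructions.
    volatile unsigned i = 0;
    while (i < iterations)
        i = i + 1;  // explicit read, then write

    return i;
}
```

This is the same idea as the delay loop with a volatile index mentioned in the comments on the question.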


Deduplicator


Good point about compiler optimizations, as whatever the OP is planning to do probably depends on the instruction remaining in the compiled executable. – Silvio Mayolo Jan 14 '18 at 02:19

score 2 · Answer 2 · answered Jan 14 '18 at 03:20

No, and it doesn't make sense to have a high-level nop operation that maps to an assembly nop. Why? Because of the as-if rule, which states that the compiler is "required to emulate (only) the observable behavior". By definition, a nop has no observable behavior on the "abstract machine" defined in the standard.

Moving on to more practical aspects, the assembly instructions generated from a C++ source file don't have a one-to-one relation to the C++ statements. Most often a C++ statement is made up of multiple assembly instructions, multiple C++ statements are replaced by one assembly instruction, or some C++ statements have no assembly counterpart at all due to algorithm transformations or dead code elimination. On top of that, instructions are rearranged and algorithms transformed all over the place.

For instance, one of the most radical algorithm transformations I can think of a compiler doing is converting a recursive function into a simple iterative loop:

auto sum(int* v, int len)
{
    if (len == 0)
        return 0;

    return v[0] + sum(v + 1, len - 1);
}

A simple recursive function to compute the sum of elements of a vector.

This is what clang generates with -O1:

sum(int*, int): # @sum(int*, int)
  xor eax, eax
  test esi, esi
  je .LBB0_2
.LBB0_1: # =>This Inner Loop Header: Depth=1
  add eax, dword ptr [rdi]
  add rdi, 4
  add esi, -1
  jne .LBB0_1
.LBB0_2:
  ret

A clever algorithm transformation from recursion to a loop. Where would a nop C++ instruction fit in the assembly?
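
For readers less used to assembly, the -O1 output above corresponds roughly to the hand-written loop below. This is only a sketch of the equivalent logic; `sum_iterative` is a hypothetical name, not something the compiler produces.

```cpp
// Hand-written C++ equivalent of the -O1 loop shown above.
int sum_iterative(const int* v, int len)
{
    int total = 0;          // xor eax, eax
    while (len != 0) {      // test esi, esi / jne .LBB0_1
        total += *v;        // add eax, dword ptr [rdi]
        ++v;                // add rdi, 4
        --len;              // add esi, -1
    }
    return total;
}
```

Nothing in this loop corresponds to a particular statement of the recursive source, which makes it hard to say where a source-level nop would belong.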

How about in the -O3 generated assembly?:

sum(int*, int): # @sum(int*, int)
  test esi, esi
  je .LBB0_1
  lea edx, [rsi - 1]
  add rdx, 1
  xor eax, eax
  cmp rdx, 8
  jae .LBB0_4
  mov rcx, rdi
  jmp .LBB0_7
.LBB0_1:
  xor eax, eax
  ret
.LBB0_4:
  mov r8d, esi
  and r8d, 7
  sub rdx, r8
  sub esi, edx
  lea rcx, [rdi + 4*rdx]
  add rdi, 16
  pxor xmm0, xmm0
  pxor xmm1, xmm1
.LBB0_5: # =>This Inner Loop Header: Depth=1
  movdqu xmm2, xmmword ptr [rdi - 16]
  paddd xmm0, xmm2
  movdqu xmm2, xmmword ptr [rdi]
  paddd xmm1, xmm2
  add rdi, 32
  add rdx, -8
  jne .LBB0_5
  paddd xmm1, xmm0
  pshufd xmm0, xmm1, 78 # xmm0 = xmm1[2,3,0,1]
  paddd xmm0, xmm1
  pshufd xmm1, xmm0, 229 # xmm1 = xmm0[1,1,2,3]
  paddd xmm1, xmm0
  movd eax, xmm1
  test r8d, r8d
  je .LBB0_8
.LBB0_7: # =>This Inner Loop Header: Depth=1
  add eax, dword ptr [rcx]
  add rcx, 4
  add esi, -1
  jne .LBB0_7
.LBB0_8:
  ret

Here the compiler does some loop unrolling and vectorisation. The loop is now split into a header that works out how many elements fit into whole groups of eight, a main body where those groups are added simultaneously using vectorisation, and a tail that handles the remaining elements that could not fill a group. And ... the funny thing ... the original C++ source didn't even have a loop. So if you had a nop in the source code, where would you put it in the assembly? It has no observable effect, so there is no reasoning the compiler can use to figure out where to put it in this heavily transformed code.

And even if you thought of some clever rules and managed to specify where the C++ nop would map to in the assembly, how useful would that be? An assembly nop serves purposes that do not pertain to the program's algorithm but to implementation details of the architecture, like read-after-write (RAW) dependencies and so on. With C++ you don't model architecture details (at least not in standard C++) but architecture-agnostic algorithms, so you can't have an instruction that relates exclusively to architecture details.

The thing is that C++ does have a nop instruction. It is the empty statement `;`. And it can be useful at the level of C++ syntax and semantics:

// find the end of the string:

for (const char* end = str; *end != '\0'; ++end)
    ; // <-- empty statement. A nop instruction.

But because of the aforementioned rule and reasoning, it doesn't make sense to generate an assembly nop instruction for it.

score 0 · Answer 3 · answered Jan 14 '18 at 02:33

No. The majority of modern compilers will automatically go through your code and rearrange it in an attempt to optimize for your selected architecture. While rearranging your code, the compiler will eliminate anything that has no later dependency (in Visual Studio, you can often see the IDE underlining variables that are not used at a later point; those are eliminated at compile time). If you want to keep a nop, you will have to stick with a specific compiler.
