Among the arithmetic operations, addition and subtraction have low complexity while division has high complexity. Do all the relational operations have the same complexity, or does it differ for the <= and >= operations?
-
What complexity? All of these operations (including division) *might* be implemented as a single CPU instruction. – Eugene Sh. Mar 07 '18 at 17:31
-
Do you mean "precedence" instead of "complexity"? – nwp Mar 07 '18 at 17:32
-
related/dupe: https://stackoverflow.com/questions/12135518/is-faster-than – NathanOliver Mar 07 '18 at 17:33
-
For primitive (built-in) types like `int` or `double`, you can normally expect these to all be equal in speed. For user-defined types, it'll depend on how they're overloaded. – Jerry Coffin Mar 07 '18 at 17:35
-
@EugeneSh.: the significant difference is not in number of CPU instructions, but in number of CPU clock cycles, where divide, mod and to a less extent multiplication are often much, much slower (for e.g. `int`, `double`) than comparisons, addition, subtraction, and bitwise operations (all the latter often taking one clock cycle). – Tony Delroy Mar 07 '18 at 17:35
-
@TonyDelroy addition and subtraction are slower than bitwise operations, but you put them together, so it is still not clear what OP and you mean by complexity. – Slava Mar 07 '18 at 17:38
-
@Slava: *"addition and subtraction are slower than bitwise operations"* - there's not much point gainsaying me (or anyone) without evidence or references. Mine: [here](http://www.agner.org/optimize/instruction_tables.pdf) – Tony Delroy Mar 07 '18 at 17:41
-
@TonyDelroy if you ever deal with FPGA you would clearly know why summation is slower. Your reference is irrelevant here, nobody said this question about Intel CPU or compatible ones. – Slava Mar 07 '18 at 17:45
-
On many CPUs, compare is implemented very similarly to subtract, and would have the same complexity. – Bo Persson Mar 07 '18 at 17:55
-
@Slava: I just used x86 as a counter example to your blanket claim that "addition and subtraction are slower". My own claim - by which I stand - was that they all are *"often taking one clock cycle"* - note "often". That's true on *many* CPUs. I not only understand why the *circuitry* for addition/summation is more complicated (FPGAs or not), but also - more importantly - the reasons why clock speeds on CPUs aren't increased to try to exploit the faster performance of the simpler logic needed for bitwise operations vs `+` and `-`. – Tony Delroy Mar 07 '18 at 17:56
-
Almost every compiler is going to turn `(x < 100)` and `(x <= 99)` into exactly the same assembly code. So no, there's no difference in complexity in that regard. That said, modern optimizing compilers will do much more complex optimizations, many of which are obviously compiler-specific. – Dan Korn Mar 07 '18 at 18:08
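For illustration, a minimal pair you could feed to a compiler yourself (the function names here are invented); with optimization enabled, mainstream compilers are expected to emit identical code for both:

```cpp
// Hypothetical pair for inspection; with optimization on, mainstream
// compilers typically emit the same code for both (e.g. cmp + setle on x86).
bool below_100(int x)  { return x < 100; }
bool at_most_99(int x) { return x <= 99; }
```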
-
@Slava: He doesn't specify the CPU involved, but he does specify C and C++, which (at least mostly) restricts you to CPUs. Even though we all know that addition (for example) uses a carry-chain, so in theory it should be slower than a bit-wise operation, on essentially any and all reasonably modern CPUs, addition and subtraction are single-cycle operations. – Jerry Coffin Mar 07 '18 at 18:14
-
@JerryCoffin you know that for all modern CPUs, including embedded? wow. How about a CPU implemented in an FPGA? – Slava Mar 07 '18 at 18:42
-
@Slava: I'll repeat: [tag:c++] and [tag:c++14]. So even if we assume there's some 8-bit microcontroller that needs two cycles for an addition, it's irrelevant to the question at hand. Likewise something I decide to implement in an FPGA, for which no C++ compiler exists. – Jerry Coffin Mar 07 '18 at 19:25
-
@JerryCoffin why so, are C++ and C++14 restricted from being used on 8-bit microcontrollers? Plus, compiled C++ or C++14 code can be executed on a software simulator of some CPU - is that allowed? – Slava Mar 07 '18 at 19:28
-
@Slava: If you want to write (for example) an LLVM back-end to target a 6502, feel free--but right now, I'm reasonably certain it just doesn't exist (and the chances of somebody investing enough to create one are minuscule). – Jerry Coffin Mar 07 '18 at 19:31
-
@JerryCoffin I do not have to write back-end for LLVM - cfront and existence of C compiler is enough. – Slava Mar 07 '18 at 19:33
-
@Slava: No, it's not. cfront development stopped long before C++14, or even C++98. What it accepts no longer qualifies as C++ at all, not to mention being even close to C++14. – Jerry Coffin Mar 07 '18 at 19:38
-
@JerryCoffin how about Comeau? It definitely supported C++98 and some C++11 – Slava Mar 07 '18 at 19:45
-
@Slava: it did. But it didn't support any arbitrary C compiler as the backend--if you wanted to support some 8-bit CPU, you needed to get Comeau to do that port for you (and it's been quite a while since I saw anything from Greg Comeau to indicate that he was actively engaged in doing such things any more). Theoretically, somebody else could do the same: license the EDG front end, and use it to target a C compiler of their choice. Then you have an even harder step though: find an *actual* 8-bit CPU that executes bitwise operators faster than addition/subtraction. – Jerry Coffin Mar 07 '18 at 19:49
-
Even if I can't say with *absolute* authority that it couldn't possibly exist, I can say that I've used most of the common ones (Intel, Motorola, MOS Tech, NEC, National Semi, Dallas Semi, Microchip, Atmel, etc.) and I'm pretty sure it's not true of any of them. Bottom line: yes, it's difficult to prove a negative--but in this case, there's a huge preponderance of evidence indicating that it just doesn't exist. – Jerry Coffin Mar 07 '18 at 19:58
2 Answers
Do all the relational operations have the same complexity, or does it differ for <= and >= operations?
The performance of these (for built-in integral and floating-point types) is not determined by the C++ language Standard, but by the CPU instructions the specific compiler implementation emits, which is of course limited by what that CPU offers. You could research the compiler(s) you're interested in (e.g. `g++ -S program.cc` will produce `program.s` with assembly), then research the CPU models you're interested in and their performance. For x86-family processors you'll find lists of instruction timings easily online (e.g. here), and - as with most CPUs - all the comparison operations tend to take one CPU cycle. It's vaguely possible that some more complicated instruction like a conditional move or jump might support a more limited set of conditions on some CPU model, but that's unlikely to be a practical issue for you.
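As a concrete sketch of that workflow (the file and function names below are invented for illustration):

```cpp
// compare.cc -- compile with: g++ -O2 -S compare.cc, then read compare.s
bool le(double a, double b) { return a <= b; }  // on x86-64, expect one comisd + a set/branch
bool ge(double a, double b) { return a >= b; }  // same instructions, operands swapped
```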

-
In addition to this, these operators can apply to classes and structs, which means they're not always predictable in terms of performance. – tadman Mar 07 '18 at 17:58
Computational complexity is discussed in at least two contexts, theoretical and practical.
In a theoretical context, the input to a program is a string of n bits, and we study how long a theoretical computer must take to produce an answer as a function of n. In such a computer, the time it takes to compare two binary numerals is O(n), because you may have to compare each of the bits of the two numbers up to the end before you find a difference. This is true regardless of whether the operation is less than, less than or equal to, greater than, greater than or equal to, equal to, or not equal to. As soon as you find the most significant bit that differs between the two numbers, you can determine any of those order relations immediately. It is finding the most significant bit that differs (or determining all bits are the same) that takes the varying amount of time, and that is O(n).
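As a sketch of that model (the representation and function name here are illustrative, not from the question), the same O(n) scan answers every order relation at once:

```cpp
#include <cstddef>
#include <vector>

// Compare two unsigned binary numerals stored MSB-first (assumed the
// same length n). Returns -1, 0, or +1; every order relation (<, <=,
// >, >=, ==, !=) follows immediately from this single result.
int compare_bits(const std::vector<bool>& a, const std::vector<bool>& b) {
    for (std::size_t i = 0; i < a.size(); ++i)  // O(n): scan until bits differ
        if (a[i] != b[i])
            return a[i] ? 1 : -1;               // first differing bit decides the order
    return 0;                                   // all n bits equal
}
```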
In a practical context, computers have fixed-size words, and we are concerned with how long practical programs take given that they stay within the bounds of those fixed-size words. In common current general-purpose processors, comparing two scalar numbers (numbers represented in one word or less, such as integers of 64 bits or narrower, or 64-bit floating-point values) typically has a fixed latency, often a single CPU cycle, sometimes a few cycles. Effectively, the complexity is O(1). Sometimes the latency may be fixed for normal cases but take longer for special cases such as subnormals or NaNs. The latency generally does not depend on which relation is being evaluated. In fact, many processors have a single general comparison instruction that returns all the ordering information, and a second instruction is then used to branch based on whether the order is less than, greater than, equal to, or unordered (NaNs are not ordered).
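A small sketch of the unordered case mentioned above, assuming IEEE 754 doubles as on common hardware:

```cpp
#include <cmath>
#include <iostream>

int main() {
    double nan = std::nan("");
    // Every ordering comparison involving a NaN is false; only != is true.
    std::cout << (nan <  1.0) << ' '    // 0
              << (nan <= 1.0) << ' '    // 0
              << (nan >= 1.0) << ' '    // 0
              << (nan == nan) << ' '    // 0
              << (nan != nan) << '\n';  // 1
}
```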
If a compiler or library implements an arbitrary size number, such as multi-precision floating-point or big integers, then it will behave more like the theoretical problem: Numbers with more digits will take longer to compare. In normal representations of numbers, all comparisons will have the same complexity. However, there are special representations in which the comparisons may differ. For example, we could represent an integer by storing its residues modulo selected divisors instead of storing bits representing specific magnitudes. It would be easy to compare two such integers for equality (Are all their residues the same?), but it could be difficult to compare them for order (which one is larger is a complicated function of their residues).
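As a sketch of that residue idea (the moduli, struct, and function names are chosen arbitrarily for illustration):

```cpp
#include <array>

// An integer (assumed nonnegative and < 3*5*7 = 105) stored only as its
// residues modulo 3, 5, and 7 -- a residue number system.
struct Residues {
    std::array<int, 3> r;  // { x % 3, x % 5, x % 7 }
};

Residues encode(int x) { return { { x % 3, x % 5, x % 7 } }; }

// Equality is cheap: just compare the residues componentwise.
bool equal(const Residues& a, const Residues& b) { return a.r == b.r; }

// Order is not: 4 -> {1,4,4} and 11 -> {2,1,4}, and no componentwise rule
// says which is larger; recovering magnitude needs the Chinese Remainder
// Theorem, which is far more work than the equality test above.
```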
