
FIRSTLY, this question is NOT about for loop performance. That is just the background.

So, I somehow found out that when using Java, counting a for loop down is much faster than counting it up. I mean, for(int i=0; i < n; i++) is way slower than for(int i=n-1; i >= 0; i--). (Yes, I do know that premature optimization is the root of all evil, but I just wanted to find out the reason.) So, this led me to think that the difference is because of how cmp is implemented at the machine level. One of the ways I could think of to write a compare function natively is like this:

public int compare(int a, int b) {
  int diff = a - b;   // may wrap around for extreme inputs
  if (diff == 0) {
    return 0;         // a == b
  }
  // the most significant (sign) bit of diff is 1 iff diff is negative
  return ((diff >>> 31) == 1) ? -1 : 1;
}

And I can check the MSD bit by right-shifting the number by the machine word size minus one (31 bits for a 32-bit int) and seeing whether it's a 1 or a 0. BUT, even for this, I'd need ==, which brings me back to the same problem. So, my question is: at the assembly or machine level, how are <, > and == implemented using just bitwise operations and maybe jmp sequences?

Anoop Dixith
  • What benchmarks did you use to conclude that counting down was faster than counting up? – Louis Wasserman Jan 24 '15 at 08:25
  • possible duplicate of [Is it faster to count down than it is to count up?](http://stackoverflow.com/questions/2823043/is-it-faster-to-count-down-than-it-is-to-count-up) – Hoopje Jan 24 '15 at 08:34
  • Assuming that your benchmarks are correct; the difference isn't really in the comparison (which is just a subtraction that throws away the result), but more likely due to the fact that many processors have conditional jump instructions for the case where the result of a previous operation became zero or negative. – Michael Jan 24 '15 at 08:36
  • @LouisWasserman No, that's not the question at all. That was the prelude. Question is about implementation of cmp. It's really frustrating me that I couldn't figure out an iterative compare function! – Anoop Dixith Jan 24 '15 at 08:36
  • @Michael I agree. But when you say "which is just a subtraction that throws away the result", how does that result (difference) get returned? Basically, my question is - how does one implement int compare(int a, int b) without using <, > or == (because we don't have a 'compare' yet to use <, >, ==). Am I making it clear? – Anoop Dixith Jan 24 '15 at 08:44
  • The difference doesn't get returned. What gets "returned" is typically that one or more flags in the processor's status register get updated. If you want to know more about the actual comparison at the hardware level you could look at the answers for http://cs.stackexchange.com/questions/7074/how-does-the-computer-determine-whether-a-number-is-smaller-or-greater-than-anot – Michael Jan 24 '15 at 08:46
  • Usually, you subtract x-y and then look in the status register; zero flag = equal, negative flag = y > x, otherwise x > y – drRobertz Jan 24 '15 at 08:51
  • @Michael Thanks for that link. I got what I wanted from the discussion in that thread. Essentially this part: Since the numbers are binary, the digits are either 0 or 1 and the boolean function for equality of any two digits a and b can be expressed as `x = a*b + ~a*~b` – Anoop Dixith Jan 24 '15 at 08:52
  • @Srivatskrishnan you could do that, but it would be less useful than a comparison based on subtraction, since the comparison based on subtraction can also output additional flags that can be used to determine <, <=, > and >=. Comparison based on the horizontal AND over `~a ^ b` (which you wrote in a roundabout way) can only give you == and !=. Binary subtraction can be implemented using a circuit of logarithmic depth, see [Kogge–Stone adder](http://en.wikipedia.org/wiki/Kogge%E2%80%93Stone_adder) for example (`a - b = ~(~a + b)`, so subtraction and addition are essentially the same thing) – harold Jan 24 '15 at 09:47
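
Putting together the subtraction and XNOR ideas from the comments above, here is a sketch of compare built only from bitwise operations, subtraction and an unsigned shift, with no <, > or == anywhere (my own construction along the lines of Hacker's Delight, not code from any particular library):

// 1 if a < b (signed), else 0. When the signs of a and b differ,
// the (a & ~b) term decides; when they agree, a - b cannot overflow
// and its sign bit is correct.
static int lessThan(int a, int b) {
    return ((a & ~b) | (~(a ^ b) & (a - b))) >>> 31;
}

// 1 if a != b, else 0: a ^ b is zero exactly when a == b,
// and (x | -x) has its sign bit set exactly when x is nonzero.
static int notEqual(int a, int b) {
    int x = a ^ b;
    return (x | -x) >>> 31;
}

// -1, 0 or 1 for a < b, a == b, a > b respectively
static int compare(int a, int b) {
    return lessThan(b, a) - lessThan(a, b);
}

As the answers below note, though, real hardware doesn't need such a sequence: the subtractor produces the sign and zero information as flags for free.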

2 Answers


for(int i=0; i < n; i++) is way slower than for(int i=n-1; i >= 0; i--)

No, it's not. It's only slower before it gets optimized by the JIT compiler. Write a Caliper or JMH benchmark to see this (I did some time ago); otherwise you'll most probably get complete nonsense (yes, Java benchmarking is really hard).
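
For example, a minimal JMH sketch would look roughly like this (it assumes JMH is on the classpath; the class and method names are mine):

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Thread)
public class LoopDirectionBenchmark {
    int n = 1_000_000;

    @Benchmark
    public void countUp(Blackhole bh) {
        for (int i = 0; i < n; i++) bh.consume(i);     // counting up
    }

    @Benchmark
    public void countDown(Blackhole bh) {
        for (int i = n - 1; i >= 0; i--) bh.consume(i); // counting down
    }
}

The Blackhole keeps the JIT from eliminating the otherwise dead loops entirely, which is exactly the kind of trap that makes naive Java benchmarks meaningless.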

So, this led me to think that the difference is because of how cmp is implemented in machine-level language.

There's no cmp in the latter loop. The end of the loop looks just like this (in Java-like syntax):

i--;
if (!zero_flag) goto loop_start;

which is where the performance win comes from.
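
For contrast, the counting-up loop has to compare against n on every iteration, which costs an extra instruction (same Java-like pseudo-assembly; loop_start is a placeholder label):

i++;
if (i < n) goto loop_start; // needs a separate cmp i, n first to set the flags

The counting-down version gets its flags for free from the decrement itself, so the cmp disappears.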

One of the ways I could think of regarding writing compare function natively is like this

No, there's nothing like this. On most CPUs there's a cmp instruction that works just like sub, except that it writes the difference nowhere. It only sets the flags (zero, carry, negative, ...), and those flags are exactly what the following conditional jump uses.
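
As a toy model only (a Java sketch of the idea, not how any CPU is implemented; the field and method names are mine):

static boolean zeroFlag; // ZF: set when the last result was zero
static boolean signFlag; // SF: set when the last result was negative

static void cmp(int a, int b) {
    int diff = a - b;       // computed exactly like sub...
    zeroFlag = (diff == 0); // ...but only the flags survive
    signFlag = (diff < 0);  // (the overflow flag is omitted in this sketch)
}

A conditional jump afterwards just tests these flags; "jump if equal", for instance, fires exactly when the zero flag is set.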

All meaningful flag combinations are implemented, i.e., each of the conditionals a < b, a <= b, a == b, etc. needs only a single conditional jump after the cmp; on x86, for example, je, jne, jl, jle, jg and jge cover ==, !=, <, <=, > and >= for signed comparisons.

how are <, > and == implemented using just bitwise operations and maybe jmp sequences?

Not at all. There's no bit-fiddling there; the flags fall out of the subtraction hardware directly.

maaartinus

At machine level, knowing whether a result is zero, positive or negative is a "free" operation. After almost every arithmetic operation, the processor reflects this in its "flags", on which branching instructions are available. Therefore, you often try to arrange your loops to avoid a compare instruction entirely (even though it is fast) and rely on the fact that the loop is finished when a certain register reaches zero through a DECrement operation. An optimizing compiler should do that for you, however, so it is a little strange that this should be noticeable.
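
A sketch of the loop shape this describes (process and items are placeholder names):

// counting down to zero lets the decrement's flags feed the branch
// directly; there is no separate compare against n in the loop
for (int i = n - 1; i >= 0; i--) {
    process(items[i]);
}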

Dan Byström
  • An optimizing compiler will do that for you, *if* the direction the loop runs has no impact on the loop result, e.g., there are no loop-carried data dependences. (OP didn't show his actual loop.) – Ira Baxter Jan 25 '15 at 11:43