6

There is this post, which has recently received some remarkable bunch of upvotes, asking about the + operator in C.
It shows the following implementation:

// replaces the + operator
int add(int x, int y) {
    while(x) {
        int t = (x & y) <<1;
        y ^= x;
        x = t;
    }
    return y;
}

Coincidentally, I wrote an implementation myself too (an algorithm book exercise) and came up with that:

uint32_t bit_add(uint16_t a, uint16_t b) {
    uint32_t carry = ((uint32_t) a & b) << 1;
    uint16_t add = a ^ b;

    return carry ^ add;
}

I tested it a couple of times and it seems to work. Thing is, it is significantly faster than the implementation from the referenced post, lacking any jumps on an x86.

Is my implementation correct or is there something wrong I am not aware of?
I cannot imagine my code to be faster than the code of a post so often seen and reviewed.

Community
  • 1
  • 1
cadaniluk
  • 15,027
  • 2
  • 39
  • 67
  • I didn't check, but it might be correct. Don't assume people write the most efficient code possible all the time; After all these answers were mostly created for toy problems or for demonstration purposes, not for actual use (+ is still faster). – Cubic Feb 29 '16 at 13:12
  • The two examples are differents, the first operates one bit per instruction in loop, in yours all bits are affected. The compiler may also have optimized your code. – Mathieu Feb 29 '16 at 13:13
  • Try adding 3 and 7, it outputs 2. – Kenney Feb 29 '16 at 13:15
  • 4
    This algorithm is the **software implementation of the hardware adder logic**, where the `xor` is used to add single bits and the `and` to extract carry to add over next bit. As such absolutely useless and much more slower that the hardware `add` instruction. Your code fails because, respect the original solution, the carry shift is not correctly performed. See https://en.wikipedia.org/wiki/Adder_(electronics) – Frankie_C Feb 29 '16 at 13:18
  • @purplepsycho: No, the first one also operates on all bits concurrently. The loop isn't for the bits, but to to work out the case when the applying the carry shift creates new carry shifts. – Erich Kitzmueller Feb 29 '16 at 13:23
  • Why don't you just test it? `for(uint16_t i = 0; i <= 65535; ++i) { for(uint16_t j = 0; j <= 65535; ++j) { assert(bit_add(i, j) == (i + j)); } }` – nhgrif Feb 29 '16 at 13:25
  • @Frankie_C Just to make sure we're on the same page: it's just an exercise for writing (actually just pseudocode) for an algorithm. I am fully aware it's slower. – cadaniluk Feb 29 '16 at 13:25
  • @nhgrif Well, to my embarassment, I didn't think of that in the first place. ^^ I just wanted a proof and testing didn't come to my mind. – cadaniluk Feb 29 '16 at 13:27
  • @cad Absolutely. But I added the comment for the sake of completeness, and for those who just read it over. And If I'm not wrong in the original post is said that the CPU internally doesn't work this way, that is wrong. This is exactly the transform of hardware design in a sequential logic, for my past memories, and as also reported from the wikipedia link. – Frankie_C Feb 29 '16 at 13:30
  • 1
    The original code invokes undefined behaviour for certain values of `x` and `y`. – too honest for this site Feb 29 '16 at 13:34
  • The first counterexample is `1+3` Each bit that you add has three inputs: the two bits and carry. The carry input depends on the carry result of the previous bits addition. You can't add without either a loop or as many steps as there are bits in the words you're adding. It is possible to optimize the loop so that the additions are not done one bit at a time, but it must be there. – Art Feb 29 '16 at 14:23

1 Answers1

6

Your function doesn't work.

A simple counterexample is 127 + 1.

This is easy to see. Number 127 has all lest significant 7 bits sets to 1. Anding it with the number 1, and shifting it one to the left, will give the value 2. From then on, using the operator xor, no combination of values we have available, can produce a value that is larger than 127.

2501
  • 25,460
  • 4
  • 47
  • 87
  • Crap, you're right. Could you explain why? I'm already about to start debugging but an explanation would make for an even finer answer, IMO... – cadaniluk Feb 29 '16 at 13:15
  • @cad you compute your `add` and `carry` correctly but in the `return` statement you made a mistake! that is where you need the **Loop** – Lrrr Feb 29 '16 at 13:17