92

I was browsing some C++ code, and found something like this:

(a + (b & 255)) & 255

The double AND annoyed me, so I thought of:

(a + b) & 255

(a and b are 32-bit unsigned integers)

I quickly wrote a test script (JS) to confirm my theory:

for (var i = 0; i < 100; i++) {
    var a = Math.ceil(Math.random() * 0xFFFF),
        b = Math.ceil(Math.random() * 0xFFFF);

    var expr1 = (a + (b & 255)) & 255,
        expr2 = (a + b) & 255;

    if (expr1 != expr2) {
        console.log("Numbers " + a + " and " + b + " mismatch!");
        break;
    }
}

While the script confirmed my hypothesis (both operations are equal), I still don't trust it, because 1) random and 2) I'm not a mathematician, I have no idea what am I doing.

Also, sorry for the Lisp-y title. Feel free to edit it.

arhak
Martin
  • You seem to have already shown that they are the same. Are you just looking for a mathematical proof of why they are the same? – DBPriGuy Nov 22 '16 at 21:13
  • .. and bitwise operators have no place in floating point arithmetic. What you have shown is not C. – Weather Vane Nov 22 '16 at 21:13
  • @DBPriGuy Yes - I'd like to be sure that my assumption is true, mathematically, before using it in code. – Martin Nov 22 '16 at 21:14
  • What language is that script? Does `Math.random()` return an integer or a double on [0,1)? I don't think your script (best that I can tell) reflects the problem that you posed at all. – Brick Nov 22 '16 at 21:15
  • @WeatherVane How do you suggest I do integer math in JS for my tests, then? I thought [`Math.ceil` returns an integer](https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_objects/Math/ceil) – Martin Nov 22 '16 at 21:16
  • @Brick JavaScript. – Martin Nov 22 '16 at 21:16
  • @WeatherVane Yes I have: I explicitly said `(a and b are 32-bit unsigned integers)` – Martin Nov 22 '16 at 21:18
  • Also, as an aside, if this is JavaScript, why is this question tagged with `c` and `c++`, but not `javascript`? – eddiem Nov 22 '16 at 21:18
  • @Martin ok will delete that. – Weather Vane Nov 22 '16 at 21:18
  • @eddiem Because my original code is C/C++ - but I wrote the test in JS, because it's faster and easier than doing it in a separate `.cpp` file and compiling. – Martin Nov 22 '16 at 21:18
  • What is C/C++ code? They are different languages. – Weather Vane Nov 22 '16 at 21:19
  • @WeatherVane It is mostly C, compiled with a C++ compiler. I think that's irrelevant, I'll edit it out. – Martin Nov 22 '16 at 21:19
  • You cannot reproduce the behavior that you're trying to test in JS. That's why everyone is on you about the language choice. JS is not strongly typed and the answer depends critically on the type of the variables in C/C++. The JS is complete nonsense given the question that you've asked. – Brick Nov 22 '16 at 21:21
  • @WeatherVane While they're different in many ways, isn't the treatment of bit operations on unsigned integers essentially the same for C and C++? Is there anything in his question that would depend on which language he's using? – Barmar Nov 22 '16 at 22:30
  • @Barmar the OP is using `var a` and `Math.ceil` and `Math.random` which are not C functions. Are you saying there should be no language tags at all? – Weather Vane Nov 22 '16 at 22:33
  • @WeatherVane That's essentially pseudo-code, using the Javascript function names. His question is about the behavior of `&` and `+` on unsigned integers in C and C++. – Barmar Nov 22 '16 at 22:36
  • Barmar, then please restore the C tag, add a few more. What exactly is your troll about? – Weather Vane Nov 22 '16 at 22:38
  • Your JS code is not really helpful. It needs to be `Math.floor(Math.random()*0x100000000)` to get the desired uint32 range. But using JS isn't useful anyway, as its bitwise `&` operator only works with *signed* integers (and casts anything to that), so it works quite differently from the C/C++ operator. – Bergi Nov 23 '16 at 00:11
  • Keep in mind that "I wrote a test program and got the answer I expected for all possible inputs" isn't actually a guarantee that something behaves how you expect. Undefined behavior can be nasty like that; only giving unexpected results after you're done convincing yourself that your code is right. – Nov 23 '16 at 01:13
  • Have you actually checked the speed and assembly output to determine whether your variation improves performance? – Jack Aidley Nov 23 '16 at 11:54
  • @Hurkyl Though I could only upvote your comment once, I pushed the trackpad button *really hard*. – HostileFork says dont trust SE Nov 23 '16 at 12:29
  • @Hurkyl That's exactly why I came to ask instead of just blindly doing it and screwing something up. – Martin Nov 23 '16 at 14:40
  • @JackAidley It's not about performance (the code is not performance critical), but rather, about correctness. – Martin Nov 23 '16 at 14:41
  • How many 32-bit integers are there? 2^32 = 4294967296. Actually, since you have two variables, it is the square of that. How many integers have you tried with your code? 100, which is essentially 0% of the total amount of combinations. Indeed, you shouldn't trust your results, the sample size is too small. Sure, you don't need to test all of them (it would take forever), but your sample size must be larger, or at least it should include integers that might trigger corner cases (e.g. integers that might overflow). – Denilson Sá Maia Nov 23 '16 at 19:55

9 Answers

78

They are the same. Here's a proof:

First note the identity (A + B) mod C = (A mod C + B mod C) mod C

Let's restate the problem by regarding a & 255 as standing in for a % 256. This is true since a is unsigned.

So (a + (b & 255)) & 255 is (a + (b % 256)) % 256

This is the same as (a % 256 + b % 256 % 256) % 256 (I've applied the identity stated above: note that mod and % are equivalent for unsigned types.)

This simplifies to (a % 256 + b % 256) % 256 which becomes (a + b) % 256 (reapplying the identity). You can then put the bitwise operator back to give

(a + b) & 255

completing the proof.

Bathsheba
  • Very elegant proof. – Enzo Ferber Nov 22 '16 at 21:27
  • It is mathematical proof, ignoring the possibility of overflow. Consider `A=0xFFFFFFFF, B=1, C=3`. The first identity does not hold. (Overflow is not going to be a problem for unsigned arithmetic, but it is a bit different thing.) – AlexD Nov 22 '16 at 21:39
  • Actually, `(a + (b & 255)) & 255` is the same as `(a + (b % 256)) % N % 256`, where `N` is one larger than the maximum unsigned value. (the latter formula is meant to be interpreted as arithmetic of mathematical integers) – Nov 23 '16 at 01:15
  • @AlexD The trick of using AND as modulo only works at all when C is a power of 2. And you will find that for valid values of C, the identity holds even in the case of an overflow. – nitro2k01 Nov 23 '16 at 03:04
  • Mathematical proofs such as this are not appropriate for proving the behaviour of integers on computer architectures. – Jack Aidley Nov 23 '16 at 11:52
  • @JackAidley: They are appropriate *when done correctly* (which this one is not, due to neglecting to consider overflow). – Nov 23 '16 at 13:09
  • The devil is in the detail; even with unsigned arithmetic, I found a corner case where the Standard allows for it to fail. Your hypotheses do not apply to C arithmetic. – chqrlie Nov 23 '16 at 14:37
  • @Hurkyl Overflow doesn't need to be considered in this case. `a` and `b` will never be larger than `0xFFFF`. So `a + b` will never be larger than `0x1FFFE`. OP stated that these integers are unsigned 32-bit integers, so overflow is impossible. – Shaz Nov 23 '16 at 15:03
  • @Shaz: That's true of the test script, but not part of the question asked. – Nov 23 '16 at 15:05
  • @Hurkyl: There's a reason I said "such as this" :) – Jack Aidley Nov 23 '16 at 15:06
  • @Shaz: the script in JS is irrelevant to the question: the OP saw the expression in some C++ code... the types are 32 bit unsigned, but no assumption can be made about the values. My answer is very much language-lawyer oriented; in real life, the addition cannot overflow, but the math proof does not apply. – chqrlie Nov 23 '16 at 15:25
  • -1 because the proof is wrong. It could be written correctly by taking into account that `a+b` is computed modulo 2^32. – R.. GitHub STOP HELPING ICE Nov 24 '16 at 02:09
21

In positional addition, subtraction and multiplication of unsigned numbers to produce unsigned results, more significant digits of the input don't affect less-significant digits of the result. This applies to binary arithmetic as much as it does to decimal arithmetic. It also applies to "twos complement" signed arithmetic, but not to sign-magnitude signed arithmetic.

However, we have to be careful when taking rules from binary arithmetic and applying them to C (I believe C++ has the same rules as C on this stuff, but I'm not 100% sure), because C arithmetic has some arcane rules that can trip us up. Unsigned arithmetic in C follows simple binary wraparound rules, but signed arithmetic overflow is undefined behaviour. Worse, under some circumstances C will automatically "promote" an unsigned type to (signed) int.

Undefined behaviour in C can be especially insidious. A dumb compiler (or a compiler on a low optimisation level) is likely to do what you expect based on your understanding of binary arithmetic, while an optimising compiler may break your code in strange ways.


So getting back to the formula in the question, the equivalence depends on the operand types.

If they are unsigned integers whose size is greater than or equal to the size of int, then the overflow behaviour of the addition operator is well-defined as simple binary wraparound. Whether or not we mask off the high 24 bits of one operand before the addition operation has no impact on the low bits of the result.

If they are unsigned integers whose size is less than int, then they will be promoted to (signed) int. Overflow of signed integers is undefined behaviour, but at least on every platform I have encountered, the difference in size between different integer types is large enough that a single addition of two promoted values will not cause overflow. So again we can fall back on the simple binary arithmetic argument to deem the statements equivalent.

If they are signed integers whose size is less than int, then again overflow can't happen, and on twos-complement implementations we can rely on the standard binary arithmetic argument to say they are equivalent. On sign-magnitude or ones-complement implementations they would not be equivalent.

OTOH if a and b were signed integers whose size was greater than or equal to the size of int then even on twos complement implementations there are cases where one statement would be well-defined while the other would be undefined behaviour.

plugwash
21

Lemma: a & 255 == a % 256 for unsigned a.

Unsigned a can be rewritten as m * 0x100 + b for some unsigned m, b with 0 <= b <= 0xff, 0 <= m <= 0xffffff. It follows from both definitions that a & 255 == b == a % 256.

Additionally, we need:

  • the distributive property: (a + b) mod n = [(a mod n) + (b mod n)] mod n
  • the definition of unsigned addition, mathematically: (a + b) ==> (a + b) % (2 ^ 32)

Thus:

(a + (b & 255)) & 255 = ((a + (b & 255)) % (2^32)) & 255      // def'n of addition
                      = ((a + (b % 256)) % (2^32)) % 256      // lemma
                      = (a + (b % 256)) % 256                 // because 256 divides (2^32)
                      = ((a % 256) + (b % 256 % 256)) % 256   // Distributive
                      = ((a % 256) + (b % 256)) % 256         // a mod n mod n = a mod n
                      = (a + b) % 256                         // Distributive again
                      = (a + b) & 255                         // lemma

So yes, it is true. For 32-bit unsigned integers.


What about other integer types?

  • For 64-bit unsigned integers, all of the above applies just as well, just substituting 2^64 for 2^32.
  • For 8- and 16-bit unsigned integers, addition involves promotion to int. This int will definitely neither overflow nor be negative in any of these operations, so they all remain valid.
  • For signed integers, if either a+b or a+(b&255) overflow, it's undefined behavior. So the equality can't hold — there are cases where (a+b)&255 is undefined behavior but (a+(b&255))&255 isn't.
Bathsheba
Barry
17

Yes, (a + b) & 255 is fine.

Remember addition in school? You add numbers digit by digit, and add a carry value to the next column of digits. There is no way for a later (more significant) column of digits to influence an already processed column. Because of this, it does not make a difference if you zero-out the digits only in the result, or also first in an argument.


The above is not always true, though: the C++ standard allows an implementation that would break this.

Such a Deathstation 9000 :-) would have to use a 33-bit int, if the OP meant unsigned short with "32-bit unsigned integers". If unsigned int was meant, the DS9K would have to use a 32-bit int, and a 32-bit unsigned int with a padding bit. (The unsigned integers are required to have the same size as their signed counterparts as per §3.9.1/3, and padding bits are allowed in §3.9.1/1.) Other combinations of sizes and padding bits would work too.

As far as I can tell, this is the only way to break it, because:

  • The integer representation must use a "purely binary" encoding scheme (§3.9.1/7 and the footnote); all bits except padding bits and the sign bit must contribute a value of 2^n
  • int promotion is allowed only if int can represent all the values of the source type (§4.5/1), so int must have at least 32 bits contributing to the value, plus a sign bit.
  • the int can not have more value bits (not counting the sign bit) than 32, because else an addition can not overflow.
alain
  • There are many other operations besides addition where garbage in the high bits doesn't affect the result in the low bits you're interested in. See [this Q&A about 2's complement](http://stackoverflow.com/questions/34377711/which-2s-complement-integer-operations-can-be-used-without-zeroing-high-bits-in), which uses x86 asm as the use-case, but also applies to unsigned binary integers in any situation. – Peter Cordes Nov 23 '16 at 04:10
  • While it is of course everybody's right to downvote anonymously, I always appreciate a comment as an opportunity to learn. – alain Nov 23 '16 at 09:29
  • This is by far the easiest answer / argument to understand, IMO. The carry/borrow in addition/subtraction propagates only from low bits to high bits (right to left) in binary, the same as in decimal. IDK why someone would downvote this. – Peter Cordes Nov 23 '16 at 10:11
  • @PeterCordes Thanks! Since two other answers have been downvoted too, I suspect the downvoter would like to read something about int promotion in these answers too, like in plugwash's answer, which is the only upvoted and non-downvoted answer. But this is all speculation; that's why a comment would be nice. – alain Nov 23 '16 at 10:21
  • I concur. IMHO this answer is the best so far in its simplicity. Have an upvote! But do note that it's vulnerable to overflow on a platform where the maximum value of the type of `a + b` is *not* one less than a multiple of 256. Unlikely, but theoretically possible (I think). Perhaps that's why someone downvoted, but downvoting on this level of pedantry is reprehensible. – Bathsheba Nov 23 '16 at 10:22
  • @Bathsheba: oh, you mean if `a` and `b` are `unsigned short` and get promoted to `int`, where `int` is weird? But yeah, some mention of how this ties back to being safe *in C* might be a good edit to the answer. – Peter Cordes Nov 23 '16 at 10:23
  • No, I mean something like UINT_MAX = 65537, for example. I've never come across a system where 1 + the maximum unsigned is not a multiple of 256 (you'd "waste" bits, surely?) but I don't think the standard disallows it. – Bathsheba Nov 23 '16 at 10:24
  • @Bathsheba yes, this could be a reason. (IIRC only `char` types have a guarantee that every possible object representation must be a valid, distinct value, which would make a weird `INT_MAX` impossible, if this guarantee was made for `int` too) – alain Nov 23 '16 at 10:30
  • @Bathsheba: CHAR_BIT is not required to be 8. But unsigned types in C and C++ are required to behave as normal base-2 binary integers of some bit width. I think that requires that UINT_MAX is `2^N-1`. (N may not even be required to be a multiple of CHAR_BIT, I forget, but I'm pretty sure the standard requires that wraparound happens modulo some power of 2.) I think the only way you can get weirdness is via promotion to a signed type that's wide enough to hold `a` or `b` but not wide enough to hold `a+b` in all cases. – Peter Cordes Nov 23 '16 at 10:35
  • @PeterCordes: So long as N is an integer, 1 + the maximum unsigned will be a multiple of 256, so this answer stands. **Very good point about the modulo behaviour.** I think that nails it. – Bathsheba Nov 23 '16 at 10:36
  • @Bathsheba: Why are you bringing 256 into it? This answer works (in binary) if `unsigned` is a 19-bit type, or 123 bit, or anything >= 8 (and the standard requires >= 16, so we're fine). Zeroing all the high bits, above `[7:0]`, doesn't affect the contents of bits `[7:0]`, regardless of how many high bits there are. That is what `&255` or `%256` does, regardless of anything. – Peter Cordes Nov 23 '16 at 10:39
  • I was worried about a + b overflowing past a weird maximum, which effectively changes the bit pattern of the last 8 bits. But you've reminded me that's not possible. (There appears to be a counter-example in the comments to my answer, but I haven't taken the time to analyse it yet.) – Bathsheba Nov 23 '16 at 10:40
  • @Bathsheba: yes, fortunately C-as-portable-assembly-language really does mostly work for unsigned types. Not even a purposely-hostile C implementation can break this. It's only signed types where things are horrible for truly portable bit-hacks in C, and a Deathstation 9000 can really break your code. – Peter Cordes Nov 23 '16 at 10:45
  • Ahh, work just interfered with my SO activity; I'll try to find the relevant standard quotes and update my answer if I have time. Thanks for your interesting comments, Peter Cordes and Bathsheba. – alain Nov 23 '16 at 11:32
  • @alain: see my answer, the Standard is full of surprises. – chqrlie Nov 23 '16 at 14:38
14

You already have the smart answer: unsigned arithmetic is modulo arithmetic and therefore the results will hold, you can prove it mathematically...


One cool thing about computers, though, is that computers are fast. Indeed, they are so fast that enumerating all valid combinations of 32 bits is possible in a reasonable amount of time (don't try with 64 bits).

So, in your case, I personally like to just throw it at a computer; it takes me less time to convince myself that the program is correct than it takes to convince myself than the mathematical proof is correct and that I didn't oversee a detail in the specification1:

#include <cstdint>
#include <iostream>

int main() {
    std::uint64_t const MAX = std::uint64_t(1) << 32;
    for (std::uint64_t i = 0; i < MAX; ++i) {
        for (std::uint64_t j = 0; j < MAX; ++j) {
            std::uint32_t const a = static_cast<std::uint32_t>(i);
            std::uint32_t const b = static_cast<std::uint32_t>(j);

            auto const champion = (a + (b & 255)) & 255;
            auto const challenger = (a + b) & 255;

            if (champion == challenger) { continue; }

            std::cout << "a: " << a << ", b: " << b << ", champion: " << champion << ", challenger: " << challenger << "\n";
            return 1;
        }
    }

    std::cout << "Equality holds\n";
    return 0;
}

This enumerates through all possible values of a and b in the 32-bits space and checks whether the equality holds, or not. If it does not, it prints the case which didn't work, which you can use as a sanity check.

And, according to Clang: Equality holds.

Furthermore, given that the arithmetic rules are bit-width agnostic (above int bit-width), this equality will hold for any unsigned integer type of 32 bits or more, including 64 bits and 128 bits.

Note: How can a computer enumerate all 64-bit patterns in a reasonable time frame? It cannot. The loops were optimized out. Otherwise we would all have died before execution terminated.


I initially only proved it for 16-bit unsigned integers; unfortunately C++ is an insane language where small integers (with bitwidths smaller than int) are first converted to int.

#include <cstdint>
#include <iostream>

int main() {
    unsigned const MAX = 65536;
    for (unsigned i = 0; i < MAX; ++i) {
        for (unsigned j = 0; j < MAX; ++j) {
            std::uint16_t const a = static_cast<std::uint16_t>(i);
            std::uint16_t const b = static_cast<std::uint16_t>(j);

            auto const champion = (a + (b & 255)) & 255;
            auto const challenger = (a + b) & 255;

            if (champion == challenger) { continue; }

            std::cout << "a: " << a << ", b: " << b << ", champion: "
                      << champion << ", challenger: " << challenger << "\n";
            return 1;
        }
    }

    std::cout << "Equality holds\n";
    return 0;
}

And once again, according to Clang: Equality holds.

Well, there you go :)


1 Of course, if a program ever inadvertently triggers Undefined Behavior, it would not prove much.

Matthieu M.
  • you say it is easy to do with 32-bit values but actually use 16-bit... :D – Willi Mentzel Nov 23 '16 at 12:42
  • @WilliMentzel: I am afraid you misconstrued the statement. I said it is easy to check all valid combinations of 32 bits. Checking all combinations of all valid values of 2x16-bits is checking all valid combinations of 32 bits; if I were using 32-bit integers, I would be checking all valid combinations of 64 bits, and unless optimized out the program would not stop before the computer's death (or reboot). – Matthieu M. Nov 23 '16 at 12:46
  • I see, but then your program does not prove anything, since the question is about unsigned 32-bit integers for a and b each :( – Willi Mentzel Nov 23 '16 at 12:52
  • @WilliMentzel: That's an interesting remark. I initially wanted to say that if it works with 16 bits then it will work the same with 32 bits, 64 bits and 128 bits because the Standard does not have specific behavior for different bit-widths... however I remembered that it actually does for bit-widths smaller than that of `int`: small integers are first converted to `int` (a weird rule). So I actually have to do the demonstration with 32 bits (and afterwards it extends to 64 bits, 128 bits, ...). – Matthieu M. Nov 23 '16 at 14:18
  • Since you cannot evaluate all (4294967296 - 1) * (4294967296 - 1) possible outcomes, you reduce somehow? In my opinion MAX should be (4294967296 - 1) if you go that way, but it will never finish within our lifetime like you said... so, after all, we cannot show the equality in an experiment, at least not in one like you describe. – Willi Mentzel Nov 23 '16 at 15:32
  • @WilliMentzel: The idea was indeed to reduce it to a similar (but smaller) problem to make the computation finish in a reasonable time. However, it turns out that the compiler is so clever that it optimizes everything out, and therefore the program runs (instantly) even with 32-bit unsigned integers. – Matthieu M. Nov 23 '16 at 15:36
  • this is really interesting: the compiler sees that champion and challenger are equal, and therefore has no reason to go through more than one iteration (if at all)... nice :) since the compiler knows what it is doing, one could accept your program as a proof, but with MAX = 4294967296 - 1, which should not matter, but just in case :D +1, although you could shorten the answer drastically ;) – Willi Mentzel Nov 23 '16 at 15:41
  • Testing this on one 2's complement implementation doesn't prove that it's portable to sign-magnitude or one's-complement with Deathstation 9000 type widths. e.g. a narrow unsigned type could promote to a 17-bit `int` which can represent every possible `uint16_t`, but where `a+b` can overflow. That's only a problem for unsigned types narrower than `int`; [C requires that `unsigned` types are binary integers, so wraparound happens modulo a power of 2](http://stackoverflow.com/questions/40751662/is-a-b-255-255-the-same-as-a-b-255/40751779?noredirect=1#comment68748486_40751897) – Peter Cordes Nov 24 '16 at 00:19
  • @PeterCordes: True, the good thing is that you can easily test it on whatever architecture/implementation you wish and see whether it works for you (or not). – Matthieu M. Nov 24 '16 at 07:42
  • Sure, I guess if you put this test into the test-suite for your software package, people wondering why it's broken when compiled for a new architecture can find the answer quickly by seeing that this test failed. That's a lot less satisfactory than writing something that's actually strictly portable, so only worth it when you can't find a portable way to still write readable code that compiles to efficient asm on your platform. But tests don't always catch undefined behaviour that happens to work in the test but not with different context for the optimizer. – Peter Cordes Nov 24 '16 at 09:38
  • @PeterCordes: but at least if you have a test, it's still easier to find out (especially with instrumented builds, `-fsanitize=undefined` coming to mind) than if you have a sketch of a mathematical proof on a napkin in the trash :) Also... portability is one thing, but sometimes there's no point. And let's face it, these days, architectures with non-8-bit bytes, non-1/2/4/8-byte integers and non-2-complement representation are *exotic*. – Matthieu M. Nov 24 '16 at 09:46
  • Agreed about C being too portable for its own good. It would be *really* nice if they'd standardize on 2's complement, arithmetic right-shifts for signed, and a way to do signed arithmetic with wrapping semantics instead of undefined-behaviour semantics, for those cases when you *want* wrapping. Then C could once again be useful as a portable assembler, instead of a minefield thanks to modern optimizing compilers that make it unsafe to leave any undefined behaviour (at least for your target platform. Undefined behaviour only on Deathstation 9000 implementations is ok, as you point out). – Peter Cordes Nov 24 '16 at 10:23
  • If C doesn't do that, I hope Rust or something similar fills that void, since it has wrapping and overflow-checked functions for all the built-in integer types. (With the default semantics being C-like: compiler is allowed to assume no overflow unless you use `a.wrapping_add(b)`.) It also has nice stuff like saturating-add, popcount, and rotate for builtin types, which compile to whatever is best on the target platform. vs. C where you have to write something and hope compilers optimize it back to one instruction, because C can't express the operation you want, or use a non-portable intrinsic. – Peter Cordes Nov 24 '16 at 10:29
  • @PeterCordes: Agreed, Rust is such a breath of fresh air here. And yes [2-complement is guaranteed](https://doc.rust-lang.org/book/primitive-types.html#signed-and-unsigned). – Matthieu M. Nov 24 '16 at 10:46
5

The quick answer is: both expressions are equivalent

  • since a and b are 32-bit unsigned integers, the result is the same even in case of overflow. Unsigned arithmetic guarantees this: a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.

The long answer is: there are no known platforms where these expressions would differ, but the Standard does not guarantee it, because of the rules of integral promotion.

  • If the type of a and b (unsigned 32 bit integers) has a higher rank than int, the computation is performed as unsigned, modulo 2^32, and it yields the same defined result for both expressions for all values of a and b.

  • Conversely, If the type of a and b is smaller than int, both are promoted to int and the computation is performed using signed arithmetic, where overflow invokes undefined behavior.

    • If int has at least 33 value bits, neither of the above expressions can overflow, so the result is perfectly defined and has the same value for both expressions.

    • If int has exactly 32 value bits, the computation can overflow for both expressions, for example values a=0xFFFFFFFF and b=1 would cause an overflow in both expressions. In order to avoid this, you would need to write ((a & 255) + (b & 255)) & 255.

  • The good news is there are no such platforms1.


1 More precisely, no such real platform exists, but one could configure a DS9K to exhibit such behavior and still conform to the C Standard.

chqrlie
  • Your 2nd sub-bullet requires (1) `a` is smaller than `int` (2) `int` has 32 value bits (3) `a=0xFFFFFFFF`. Those can't all be true. – Barry Nov 23 '16 at 17:40
  • @Barry: The one case that seems to meet the requirements is 33-bit `int`, where there are 32 value bits and one sign bit. – Ben Voigt Jan 08 '18 at 01:01
2

Identical assuming no overflow. Neither version is truly immune to overflow, but the double-AND version is more resistant to it. I am not aware of a system where an overflow in this case is a problem, but I can see the author doing this in case there is one.

Loren Pechtel
  • The OP specified: *(a and b are 32-bit unsigned integers)*. Unless `int` is 33 bits wide, the result is the same **even** in case of overflow. Unsigned arithmetic guarantees this: *a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.* – chqrlie Nov 23 '16 at 11:48
2

Yes you can prove it with arithmetic, but there is a more intuitive answer.

When adding, every bit only influences those more significant than itself; never those less significant.

Therefore, whatever you do to the higher bits before the addition won't change the result, as long as you only keep bits less significant than the lowest bit modified.

Francesco Dondi
0

The proof is trivial and left as an exercise for the reader

But to actually legitimize this as an answer, your first line of code says: take the last 8 bits of b** (all higher bits of b set to zero), add this to a, and then take only the last 8 bits of the result, setting all higher bits to zero.

The second line says add a and b and take the last 8 bits with all higher bits zero.

Only the last 8 bits are significant in the result. Therefore only the last 8 bits are significant in the input(s).

** last 8 bits = 8 LSB

Also it is interesting to note that the output would be equivalent to

char a = something;
char b = something;
return (unsigned int)(a + b);

As above, only the 8 LSB are significant, but the result is an unsigned int with all other bits zero. The a + b will overflow, producing the expected result.

FreelanceConsultant