11

This is a complete rewrite of the question. Hopefully it is clearer now.

I want to implement in C a function that performs addition of signed ints with wrapping in case of overflow.

I want to target mainly the x86-64 architecture, but of course the more portable the implementation is the better. I'm also concerned mostly about producing decent assembly code through gcc, clang, icc, and whatever is used on Windows.

The goal is twofold:

  1. write correct C code that doesn't fall into the undefined behavior blackhole;
  2. write code that gets compiled to decent machine code.

By decent machine code I mean a single leal or a single addl instruction on machines which natively support the operation.

I'm able to satisfy either of the two requisites, but not both.

Attempt 1

The first implementation that comes to mind is

int add_wrap(int x, int y) {
    return (unsigned) x + (unsigned) y;
}

This seems to work with gcc, clang and icc. However, as far as I know, the C standard doesn't specify the cast from unsigned int to signed int, leaving freedom to the implementations (see also here).

Otherwise, if the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

I believe most (all?) major compilers do the expected conversion from unsigned to int, meaning that they take the correct representative modulus 2^N, where N is the number of bits, but it's not mandated by the standard so it cannot be relied upon (stupid C standard hits again). Also, while this is the simplest thing to do on two's complement machines, it is impossible on ones' complement machines, because there is a class which is not representable: 2^(N/2).

Attempt 2

According to the clang docs, one can use __builtin_add_overflow like this

int add_wrap(int x, int y) {
    int res;
    __builtin_add_overflow(x, y, &res);
    return res;
}

and this should do the trick with clang, because the docs clearly say

If possible, the result will be equal to mathematically-correct result and the builtin will return 0. Otherwise, the builtin will return 1 and the result will be equal to the unique value that is equivalent to the mathematically-correct result modulo two raised to the k power, where k is the number of bits in the result type.

The problem is that in the GCC docs they say

These built-in functions promote the first two operands into infinite precision signed type and perform addition on those promoted operands. The result is then cast to the type the third pointer argument points to and stored there.

As far as I know, casting from long int to int is implementation specific, so I don't see any guarantee that this will result in the wrapping behavior.

As you can see [here][godbolt], GCC will also generate the expected code, but I wanted to be sure that this is not by chance ans is indeed part of the specification of __builtin_add_overflow.

icc also seems to produce something reasonable.

This produces decent assembly, but relies on intrinsics, so it's not really standard compliant C.

Attempt 3

Follow the suggestions of those pedantic guys from SEI CERT C Coding Standard.

In their CERT INT32-C recommendation they explain how to check in advance for potential overflow. Here is what comes out following their advice:

#include <limits.h>

int add_wrap(int x, int y) {
    if ((x > 0) && (y > INT_MAX - x))
        return (x + INT_MIN) + (y + INT_MIN);
    else if ((x < 0) && (y < INT_MIN - x))
        return (x - INT_MIN) + (y - INT_MIN);
    else
        return x + y;
}

The code performs the correct checks and compiles to leal with gcc, but not with clang or icc.

The whole CERT INT32-C recommendation is complete garbage, because it tries to transform C into a "safe" language by forcing the programmers to perform checks that should be part of the definition of the language in the first place. And in doing so it forces also the programmer to write code which the compiler can no longer optimize, so what is the reason to use C anymore?!

Edit

The contrast is between compatibility and decency of the assembly generated.

For instance, with both gcc and clang the two following functions which are supposed to do the same get compiled to different assembly. f is bad in both cases, g is good in both cases (addl+jo or addl+cmovnol). I don't know if jo is better than cmovnol, but the function g is consistently better than f.

#include <limits.h>

signed int f(signed int si_a, signed int si_b) {
  signed int sum;
  if (((si_b > 0) && (si_a > (INT_MAX - si_b))) ||
      ((si_b < 0) && (si_a < (INT_MIN - si_b)))) {
    return 0;
  } else {
    return si_a + si_b;
  }
}

signed int g(signed int si_a, signed int si_b) {
  signed int sum;
  if (__builtin_add_overflow(si_a, si_b, &sum)) {
    return 0;
  } else {
    return sum;
  }
}
Federico
  • 582
  • 3
  • 12
  • *As far as I know, casting from `long int` to `int` is implementation specific* The documentation you quote does not say addition is done in `long int`; it says *infinite precision signed type*. – trent Dec 12 '19 at 16:57
  • @trentcl Yes, which is even more ambiguous. What I meant is that since `long int` to `int` is unspecified in the standard, I don't know what to think of _infinite precision_ to `int`... – Federico Dec 12 '19 at 17:03
  • You're right, it does not say that wrapping arithmetic is used, which I would assume means it is not guaranteed. Clang targets LLVM, which does normally wrap, so presumably it's easy for them to make this extra guarantee. – trent Dec 12 '19 at 17:12
  • Your frustrated commentary about "garbage" doesn't help. C is designed to do things simple and fast and translate as directly to machine code as possible -- most additions are done with values qhere overflow is not an issue. However, if you want to do additions with stricter requirements, say for safety critical software, there is no way to get around more complicated code, whether using C or using assembly directly. – Dúthomhas Jan 20 '22 at 00:14
  • On 32-bit platforms, the conversion from `long int` to `int` is a no-op (also Win64, IIRC). On 64-bit platforms other than Win64 it is not a no-op. – Jonathan Leffler Jan 20 '22 at 00:31
  • This question appears to provide an answer to itself, and then rant about how the answer is garbage... "The whole CERT INT32-C recommendation is complete garbage, because it tries to transform C into a "safe" language by forcing the programmers to perform checks that should be part of the definition of the language in the first place. And in doing so it forces also the programmer to write code which the compiler can no longer optimize, so what is the reason to use C anymore?!" Correct, removing undefined behaviour can also affect the program. I suppose you'd rather your program be unsafe ‍♂️ – autistic Jul 23 '22 at 23:14

4 Answers4

2

A bit like @Andrew's answer without the memcpy().

Use a union to negate the need for memcpy(). With C2x, we are sure that int is 2's compliment.

int add_wrap(int x, int y) {
  union {
    unsigned un;
    int in;
  } u = {.un = (unsigned) x + (unsigned) y};
  return u.in;
}

For those who like 1-liners, use a compound literal.

int add_wrap2(int x, int y) {
  return ( union { unsigned un; int in; }) {.un = (unsigned) x + (unsigned) y}.in;
}
chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
1

It seems ridiculous, but I think that the recommended method is to use memcpy. Apparently all modern compilers optimize the memcpy away and it ends up doing just what you're hoping in the first place -- preserving the bit pattern from the unsigned addition.

int a;
int b;

unsigned u = (unsigned)a + b;
int result;
memcpy(&result, &u, sizeof(result));

On x86 clang with optimization, this is a single instruction if the destination is a register.

Andrew
  • 609
  • 7
  • 14
0

I'm not so sure because of the rules for casting from unsigned to signed

You exactly quoted the rules. If you convert from a unsigned value to a signed one, then the result is implementation-defined or a signal is raised. In simple words, what will happen is described by your compiler.

For example the gcc9.2.0 compiler has the following to in it's documentation about implementation defined behavior of integers:

The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3).

For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.

Community
  • 1
  • 1
KamilCuk
  • 120,984
  • 8
  • 59
  • 111
  • `INT_MAX - y` is a signed arithmetic operation that overflows when `y` is less than 0. You need an `if` to get the semantics right. – trent Dec 12 '19 at 17:15
  • Oooooch, right. So a little refactoring is needed, I see. – KamilCuk Dec 12 '19 at 17:15
  • This was exactly the point of my question. I can spit out a tangled mess of `if`-`else` statements that performs all the checks, but it is horrible and error prone. I wanted to know whether there is a safer way, possibly relying on intrinsics. – Federico Dec 12 '19 at 17:21
  • `possibly relying on intrinsics` But, but, but [intrinsic](https://en.wikipedia.org/wiki/Intrinsic_function) themselves are compiler specific. So you are tied to a specific compiler. So, well, I don't understand, then your question makes no sense. I understand you seem to be searching for a standard compliant solution that is not standard... The standard compliant solution (the tangled mess of if-else) it _is the_ solution. If your architecture is twos complement, just add them. – KamilCuk Dec 12 '19 at 17:43
  • @KamilCuk I edited the question. I included the if-else-magic mess, which is not trivial to get right (as your answer demonstrates). Unfortunately clang is not smart enough to compile it to a single `leal`. A standard compliant solution could also be to examine the input `int`s bit by bit and perform addition in software. This is clearly not what I'm looking for. I would like a solution which is both correct (doesn't rely on UB) and gets compiled to decent code (at least by gcc, clang, icc and whatever those poor folks use under windows) – Federico Dec 12 '19 at 17:50
  • The `leal` instruction is architecture specific. – KamilCuk Dec 12 '19 at 17:52
  • Then please specify the machine you are interested in, the compilers you are interested in, the environments you are interested in, in your question. As it is now, I guess you are asking: "How do I write standard compliant code that wrapes signed integer addition that compiles to a single (or define a condition) instruction on specified compilers and architectures?". A such question I find not worth answering - I would just `#ifdef __GNUC__ return x + y` handle each compiler. Or write a assertion that the platform is twos complement. – KamilCuk Dec 12 '19 at 17:59
  • `#ifdef __GNUC__ return x + y` is undefined behavior. If you then do `add_wrap(x, 1) > x`, gcc will simplify it to `1`. I don't want that. – Federico Dec 12 '19 at 18:13
  • The recommendations presented [here](https://wiki.sei.cmu.edu/confluence/display/c/INT32-C.+Ensure+that+operations+on+signed+integers+do+not+result+in+overflow) are pretty much garbage. The compilers are not able to simplify them to a single `add`+`jo`. Hence my question: is this the only way? Or is there a standard compliant way to write code which is not garbage? – Federico Dec 12 '19 at 18:15
0

I had to do something similar; however, I was working with known width types from stdint.h and needed to handle wrapping 32-bit signed integer operations. The implementation below works because stdint types are required to be 2's complement. I was trying to emulate the behaviour in Java, so I had some Java code generate a bunch of test cases and have tested on clang, gcc and MSVC.

inline int32_t add_wrap_i32(int32_t a, int32_t b)
{
    const int64_t a_widened = a;
    const int64_t b_widened = b;
    const int64_t sum = a_widened + b_widened;

    return (int32_t)(sum & INT64_C(0xFFFFFFFF));
}

inline int32_t sub_wrap_i32(int32_t a, int32_t b)
{
    const int64_t a_widened = a;
    const int64_t b_widened = b;
    const int64_t difference = a_widened - b_widened;

    return (int32_t)(difference & INT64_C(0xFFFFFFFF));
}

inline int32_t mul_wrap_i32(int32_t a, int32_t b)
{
    const int64_t a_widened = a;
    const int64_t b_widened = b;
    const int64_t product = a_widened * b_widened;

    return (int32_t)(product & INT64_C(0xFFFFFFFF));
}
Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
Michael Barker
  • 14,153
  • 4
  • 48
  • 55