
I want to fix an overflow problem. I store some data in int. The stored values themselves do not overflow, but intermediate results of the calculation may.

For example, I need to store the diagonal of a square whose side length is 50000, so the diagonal is 70710. Both the side and the diagonal are far smaller than INT_MAX, but during the calculation, a*a+b*b in sqrt(a*a+b*b) will overflow.

I want to follow the "just use int" rule, so I may need to cast each variable every time:

int f=(long)a+(long)b*(long)c/(long)d-(long)e;

but adding (long) every time hurts readability, so I tested which operations may overflow and which are converted automatically:

#include <limits.h>  // INT_MAX
#include <stdio.h>   // printf
#include <stdlib.h>  // rand
int main(){
    int a=rand();
    int b=a;
    printf("%d\n",a);
    printf("%d\n",INT_MAX);
    printf("\n");
    printf("%d\n",INT_MAX+a-b);
    printf("%d\n",INT_MAX-b+a);
    printf("%d\n",a+INT_MAX-b);
    printf("%d\n",a-b+INT_MAX);
    printf("%d\n",-b+a+INT_MAX);
    printf("%d\n",-b+INT_MAX+a);
    printf("\n");
    printf("%d\n",INT_MAX*a/b);
    printf("%d\n",INT_MAX/b*a);
    printf("%d\n",a*INT_MAX/b);
    printf("%d\n",a/b*INT_MAX);
    printf("\n");
    printf("%ld\n",(long)INT_MAX*a/b);
    printf("%ld\n",INT_MAX*a/(long)b);
    return 0;
}

the output is:

16807
2147483647

2147483647
2147483647
2147483647
2147483647
2147483647
2147483647

127772
2147480811
127772
2147483647

2147483647
127772

I use rand() to prevent compile-time evaluation. I found that for + and - the result is the same regardless of the order of INT_MAX, +a and -b, but for *a and /b it is not.

I also found that, even with casting, (long)INT_MAX*a/b gives the expected result but INT_MAX*a/(long)b does not.

My guess is that for + and -, as long as the final result is smaller than INT_MAX, nothing goes wrong even if an intermediate result (e.g. INT_MAX+a in INT_MAX+a-b) overflows, whereas for * and / an overflowing intermediate does affect the result. Is that right?

Also, for * and /, I guess evaluation starts from the left-hand side, so the cast needs to go on the leftmost operand (e.g. (long)INT_MAX*a/b). Is that also right?

So, if my data does not overflow but the calculation may, does

int f=a+b*c/d-e;

only need to be rewritten as

int f=a+(long)b*c/d-e;

?

ggrr
  • 1. signed integer overflow is undefined behaviour. It might look like it works at one moment but start breaking in a week without any change. 2. long might be the same size as int and you still overflow (I know that long and int are the same size on 32 bit linux as well as both 32bit and 64 bit windows for one). – AliciaBytes Jun 11 '15 at 03:16
  • you only need 1 cast like `int f = a + b*(long)c/d - e;` (FYI, adding spaces and newlines appropriately also help increase readability), because the other operand in an operation will be automatically promoted accordingly. And you need a really wider type than `int`, which `long` doesn't guarantee – phuclv Jun 11 '15 at 03:53
  • `INT_MAX+a` results in a temporary value that fits in an `int` that would be interpreted as a negative number by itself. See http://ideone.com/mOiRS6 for a slightly modified version of your code. – R Sahu Jun 11 '15 at 04:00
  • @RSahu: No, the evaluation of `INT_MAX+a` yields a value that doesn't fit in the result type, `int`, that's why you get undefined behavior. – Ben Voigt Jun 11 '15 at 04:13
  • @BenVoigt, agree with you in theory. However, if the result can be replicated in a particular run time environment over multiple runs, I try to understand why that happens. – R Sahu Jun 11 '15 at 04:40
  • There is a discussion of C/C++ type promotion rules here: https://stackoverflow.com/questions/5563000/implicit-type-conversion-rules-in-c-operators . Yes, one `long` will do to move the entire computation to being done as `long`. Though you do need to be mindful of associativity as well. – Zalman Stern Nov 02 '17 at 00:17
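
As a rough illustration of the promotion and associativity points raised in the comments above, the sketch below uses sizeof (which does not evaluate its operand) to show which type a product is computed in, without triggering any overflow. The function names are made up for this example, and it assumes long long can hold INT_MAX*INT_MAX, which the static_assert checks.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>
#include <stddef.h>

/* Assumption: long long can hold INT_MAX * INT_MAX on this platform. */
static_assert(LLONG_MAX / INT_MAX >= INT_MAX,
              "long long is not wide enough for int * int");

/* Type of the product when the cast is on its left operand: long long. */
size_t width_cast_first(int b, int c) {
    return sizeof((long long)b * c);
}

/* Type of the product b * c inside b * c / (long long)d: still int,
   because b * c is formed before the division sees the wide operand. */
size_t width_cast_late(int b, int c) {
    return sizeof(b * c);
}

/* Hypothetical rewrite of a + b*c/d - e with a single cast. */
int scaled_sum(int a, int b, int c, int d, int e) {
    long long wide = a + (long long)b * c / d - e;
    return (int)wide;  /* the caller guarantees the final value fits in int */
}
```

Because * and / associate left to right, the cast has to sit on an operand of the first operation that can overflow. A cast that appears only later in the chain leaves the earlier product in int.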

1 Answer


data does not cause overflow but the calculation intermediate may cause overflow.

To avoid int overflow, which is undefined behavior, the simplest solution is to use a wide enough integer type.

int foo1(int a, int b, int c, int d, int e) {
  int f=(long)a+(long)b*(long)c/(long)d-(long)e; // OP's starting point, but see foo2
  return f;
}

but each time add (long) affects readability

To avoid unnecessary casts and their readability cost, multiply by a wide constant one (1L) only where needed. A good compiler will optimize out the explicit multiplication yet retain the type promotion.

int foo2(int a, int b, int c, int d, int e) {
  int f = a + 1L*b*c/d - e; // Cleaner yet see foo3
  return f;
}

To ensure that a potentially wider type really is wide enough (long may be the same width as int), perform compile-time tests:

#include <limits.h>  // INT_MAX, LONG_MAX, LLONG_MAX
#include <stdint.h>  // INTMAX_MAX, intmax_t

// Find a type where INT_MAX*INT_MAX <= some_type_MAX
#if LONG_MAX/INT_MAX >= INT_MAX
  #define WIDE1 1L
#elif LLONG_MAX/INT_MAX >= INT_MAX
  #define WIDE1 1LL
#elif INTMAX_MAX/INT_MAX >= INT_MAX
  #define WIDE1 ((intmax_t)1)
#else
  #error Out of luck
#endif

int foo3(int a, int b, int c, int d, int e) {
  int f = a + WIDE1*b*c/d - e;
  return f;
}

Restricting the math to type int alone is extra work that is best avoided.


.. but for calculation, a*a+b*b in sqrt(a*a+b*b) will cause overflow.

For this case

#include <math.h>  // sqrt, hypot

int hypoti1(int a, int b) {
  return sqrt(WIDE1*a*a + WIDE1*b*b);
}

// or simply

int hypoti2(int a, int b) {
  return hypot(a, b);
}
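
If a result that stays entirely in integer arithmetic is preferred, the square root can also be found with a binary search. This is a different technique from the sqrt/hypot calls above and is shown only as a sketch: isqrt_u64 and diagonal are hypothetical names, and the code assumes int is at most 32 bits wide.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>
#include <stdint.h>

/* Assumption: int is at most 32 bits, so a*a + b*b fits in 64 bits. */
static_assert(INT_MAX <= 2147483647, "int is wider than 32 bits");

/* Largest r such that r*r <= n, found by binary search. */
static uint64_t isqrt_u64(uint64_t n) {
    uint64_t lo = 0;
    uint64_t hi = 4294967295u;  /* floor(sqrt(UINT64_MAX)) */
    while (lo < hi) {
        uint64_t mid = lo + (hi - lo + 1) / 2;  /* round up so the loop ends */
        if (mid * mid <= n) {
            lo = mid;
        } else {
            hi = mid - 1;
        }
    }
    return lo;
}

/* Truncated length of the hypotenuse with legs a and b.
   The caller guarantees that the result fits in int. */
int diagonal(int a, int b) {
    int64_t wa = a, wb = b;  /* widen before squaring */
    uint64_t sum = (uint64_t)(wa * wa) + (uint64_t)(wb * wb);
    return (int)isqrt_u64(sum);
}
```

For the square in the question, diagonal(50000, 50000) gives 70710.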
chux - Reinstate Monica
  • This code is not very readable and I would not let it pass code review. – Zalman Stern Nov 02 '17 at 00:07
  • @ZalmanStern It is good you posted a comment along with the DV. Please provide more detail or alternatives so the answer may be improved. – chux - Reinstate Monica Nov 03 '17 at 12:43
  • My objection is the multiplication technique obscures the type widening. Having spent lots of time reverse engineering imaging pipelines, I prefer code to be explicit in its precision requirements. Using a cast makes the intention more clear and only one is required. In C++ one can use type traits techniques to infer the type to use, though in practice it is probably better to use explicitly sized integer types and to make the code concrete in the bit sizes it handles. Beyond this, there are techniques to ensure overflow does not occur due to intermediate operations, which is what hypot does. – Zalman Stern Nov 05 '17 at 17:55
  • @ZalmanStern "Using a cast makes the intention more clear" `(some_type)a*a` does ensure the multiplication occurs with at least the width of type `(some_type)` and `a`. Yet casting has the potential to narrow `a` - hence my desire to avoid it. `WIDE1*a*a` does not, in any way, narrow the multiplication. Multiplication of `type`*`type` to some `2x wide type` is best done with a helper function rather than casting or the `WIDE1*` idea. – chux - Reinstate Monica Nov 06 '17 at 00:57
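
A helper function of the kind mentioned in the last comment might look like the sketch below. The names mul_wide and mul_div are invented for illustration, and the code assumes int64_t exists and can hold INT_MAX*INT_MAX, which the static_assert checks. Because the parameters are declared as int, the cast inside the helper can only widen, never narrow.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>
#include <stdint.h>

/* Assumption: int64_t can hold the product of any two int values. */
static_assert(INT64_MAX / INT_MAX >= INT_MAX,
              "int64_t cannot hold int * int");

/* Multiply two ints in a type wide enough for the full product. */
static inline int64_t mul_wide(int a, int b) {
    return (int64_t)a * b;
}

/* b * c / d with a wide intermediate product.
   The caller guarantees d != 0 and that the quotient fits in int. */
int mul_div(int b, int c, int d) {
    return (int)(mul_wide(b, c) / d);
}
```

With this helper the widening happens in one named place, so an expression such as a + mul_div(b, c, d) - e needs no cast at the call site.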