6

Every now and then, especially when doing 64-bit builds of some code base, I notice that there are plenty of cases where integer overflows are possible. The most common case is that I do something like this:

// Creates a QPixmap out of some block of data; this function comes from library A
QPixmap createFromData( const char *data, unsigned int len );

const std::vector<char> buf = createScreenShot();
return createFromData( &buf[0], buf.size() ); // <-- warning here in 64-bit builds

The thing is that std::vector::size() nicely returns a size_t (which is 8 bytes in 64-bit builds), but the function happens to take an unsigned int (which is still only 4 bytes in 64-bit builds). So the compiler warns, correctly.

If possible, I try to fix up the signatures to use the correct types in the first place. However, I often hit this problem when combining functions from different libraries which I cannot modify. Unfortunately, I often resort to reasoning along the lines of "Okay, nobody will ever take a screenshot generating more than 4GB of data, so why bother" and just change the code to do

return createFromData( &buf[0], static_cast<unsigned int>( buf.size() ) );

so that the compiler shuts up. However, this feels really evil. So I've been considering adding some sort of runtime assertion which at least yields a nice error in debug builds, as in:

assert( buf.size() <= std::numeric_limits<unsigned int>::max() );

This is a bit nicer already, but I wonder: how do you deal with this sort of problem - integer overflows which are "almost" impossible in practice? I guess that means they don't occur for you, and they don't occur for QA - but they explode in the face of the customer.

Frerich Raabe

5 Answers

4

If you can't fix the types (because you can't break library compatibility), and you're "confident" that the size will never get that big, you can use boost::numeric_cast in place of the static_cast. This will throw an exception if the value is too big.

Of course the surrounding code then has to do something vaguely sensible with the exception - since it's a "not expected ever to happen" condition, that might just mean shutting down cleanly. Still better than continuing with the wrong size.
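boost::numeric_cast itself requires Boost, but its behaviour can be sketched with a hand-rolled stand-in using only the standard library (the name checked_cast and the choice of std::range_error are mine; Boost throws its own boost::numeric::bad_numeric_cast):

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>

// Minimal stand-in for boost::numeric_cast, sketched for the
// unsigned-to-narrower-unsigned case from the question: it throws
// instead of silently truncating the value.
template <typename To, typename From>
To checked_cast( From v ) {
    if ( v > static_cast<From>( std::numeric_limits<To>::max() ) )
        throw std::range_error( "checked_cast: value too large for target type" );
    return static_cast<To>( v );
}
```

The call site then becomes createFromData( &buf[0], checked_cast<unsigned int>( buf.size() ) ), with a try/catch somewhere up the stack that shuts down cleanly.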

Steve Jessop
2

The solution depends on context. In some cases, you know where the data comes from, and can exclude overflow: an int that is initialized with 0 and incremented once a second, for example, isn't going to overflow anytime in the lifetime of the machine. In such cases, you just convert (or allow the implicit conversion to do its stuff), and don't worry about it.

Another type of case is fairly similar: in your case, for example, it's probably not reasonable for a screenshot to have more data than can be represented by an int, so the conversion is safe - provided the data really did come from a screenshot. In such cases, the usual procedure is to validate the data on input, ensuring that it fulfills your downstream constraints, and then do no further validation.

Finally, if you have no real control over where the data is coming from, and can't validate on entry (at least not for your constraints downstream), you're stuck with using some sort of checking conversion, validating immediately at the point of conversion.
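As a sketch of the "validate on input" approach (the function name loadScreenShot and the exact limit are hypothetical, not from the question's library):

```cpp
#include <climits>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Validate once at the boundary: everything downstream may then assume
// the size fits in an unsigned int and convert without further checks.
std::vector<char> loadScreenShot( const char* raw, std::size_t len ) {
    if ( len > static_cast<std::size_t>( UINT_MAX ) )
        throw std::length_error( "screenshot data too large" );
    return std::vector<char>( raw, raw + len );
}
```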

James Kanze
1

If you push an overflowing 64-bit number into a 32-bit library, you open Pandora's box: undefined behaviour.

Throw an exception. Since exceptions can in general spring up almost anywhere, you should have suitable code to catch them anyway. Given that, you may as well exploit it.

Error messages are unpleasant but they're better than undefined behaviour.

spraff
0

Such scenarios can be handled in one of four ways, or with a combination of them:

  • use the right types
  • use static assertions
  • use runtime assertions
  • ignore until hurts

Usually the best approach is to use the right types, right up until your code gets ugly, and then roll in static assertions. Static assertions are much better than runtime assertions for this very purpose.

Finally, when static assertions won't work (like in your example), you use runtime assertions - yes, they get into customers' faces, but at least your program behaves predictably. Customers don't like assertions - they panic ("we have an error!", in all caps) - but without the assertion the program would likely misbehave, and there would be no way to easily diagnose the problem.
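For the cases where the size is a compile-time constant, a static assertion (C++11 static_assert, or BOOST_STATIC_ASSERT in C++03) catches the problem at build time; kBufferSize here is a hypothetical example, not from the question:

```cpp
#include <climits>
#include <cstddef>

// Hypothetical fixed-size buffer: the constraint is checked when the
// code is compiled, not when it runs at a customer's site.
const std::size_t kBufferSize = 4096;
static_assert( kBufferSize <= UINT_MAX,
               "buffer size must fit in unsigned int" );
```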

sharptooth
    I am not sure about the `static`. This would require the size to be known at compile time – Sebastian Mach Jul 26 '11 at 07:30
  • @phresnel: Yes, and usually you know the size at compile time. – sharptooth Jul 26 '11 at 07:33
  • Isn't using static assertions here basically the same as using the right types (the latter being the best, of course, but sometimes I cannot adjust the function signatures)? I.e., in a 64bit build, a static assertion comparing the sizes of `size_t` and `unsigned int` would fail - but how does that help? Or do you mean to assert something else? – Frerich Raabe Jul 26 '11 at 07:35
  • @Frerich Raabe: In this case yes, but there're many other cases where you can use a static assertion. My point was that if you have to choose between static and runtime assertion you should choose a static assertion. – sharptooth Jul 26 '11 at 07:37
  • @sharptooth: Are talking about `sizeof(size_t)`, and not `buf.size()`? – Sebastian Mach Jul 26 '11 at 07:39
  • @phresnel: I misread the question a bit. I edited my answer. Yes, in some cases you will need a runtime check. – sharptooth Jul 26 '11 at 07:41
  • @Frerich Raabe: I misread the question a bit. I edited my answer. Yes, in some cases you will need a runtime check. – sharptooth Jul 26 '11 at 07:43
  • @sharptooth: Ah, okay. static-asserts are of course great, I was just confused about using it on vector::size() :) – Sebastian Mach Jul 26 '11 at 07:45
0

One thing just came to my mind: since I need some sort of runtime check (whether or not the value of e.g. buf.size() exceeds the range of unsigned int can only be tested at runtime), but I don't want a million assert() invocations everywhere, I could do something like

#include <cassert>
#include <limits>

template <typename T, typename U>
T integer_cast( U v ) {
    assert( v <= std::numeric_limits<T>::max() );
    return static_cast<T>( v );
}

That way, I would at least have the assertion centralized, and

return createFromData( &buf[0], integer_cast<unsigned int>( buf.size() ) );

is a tiny bit better. Maybe I should rather throw an exception (it is quite exceptional indeed!) instead of asserting, to give the caller a chance to handle the situation gracefully by rolling back previous work and issuing diagnostic output or the like.
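A throwing variant of the same helper might look like this (the name integer_cast_or_throw and the choice of std::overflow_error are my own, nothing standard):

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>

// Like integer_cast above, but throws so the caller can roll back
// previous work and emit diagnostics instead of aborting.
template <typename T, typename U>
T integer_cast_or_throw( U v ) {
    if ( v > static_cast<U>( std::numeric_limits<T>::max() ) )
        throw std::overflow_error( "integer_cast: value out of range" );
    return static_cast<T>( v );
}
```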

Frerich Raabe
Actually this is not very good - assertions carry a file name and a line number with them, and with this approach all of them will report the same file and line. – sharptooth Jul 26 '11 at 07:56