12

Possible Duplicate:
Best way to detect integer overflow in C/C++

I was asked this question in an interview : "Convert a string representation of a number into an integer". But if the number exceeds the max value that can be stored in a 32 bit integer, it must throw an error. My question is how can we check if a number overflows a 32-bit unsigned int ?

Community
  • 1
  • 1
AlgoMan
  • 131
  • 1
  • 1
  • 4
  • 1
    http://stackoverflow.com/questions/199333/best-way-to-detect-integer-overflow-in-c-c – Debobroto Das Dec 01 '12 at 16:54
  • What is the representation? Is `"two thousand fifty one"` an allowed input? – chill Dec 01 '12 at 16:57
  • Convert MAX/MIN values of your target type to string and compare the string with the input, first looking at the length then at the characters... – effeffe Dec 01 '12 at 17:04

3 Answers3

12

In the loop, you've got the current value of the integer determined by the first digits. You're about to do:

curval = curval * 10 + digit;  // digit converted from '0'..'9' to 0..9 already

You can test for overflow with:

if (curval > UINT32_MAX / 10 || (curval == UINT32_MAX / 10 && digit > UINT32_MAX % 10))
    ...overflow would occur...
else
    curval = curval * 10 + digit; // overflow did not occur

The division and modulus operations are compile time operations, not run time operations.

One advantage of this approach over using a 'larger than 32-bit unsigned integer for the calculation' is that it can be extended to work with bigger integers — with 64-bit integers and uintmax_t integers — by changing the constant from UINT32_MAX to UINT64_MAX or UINTMAX_MAX. That's all that's necessary (other than changing the variable types, of course).


The same basic test also works for signed types, but there's an added complication. It's easier to use 16-bit integers for demo purposes (fewer digits to type and think about), so I'm switching to 16-bit two's-complement short with:

  • USHRT_MAX = 65535
  • SHRT_MAX = 32767
  • SHRT_MIN = -32768

The problem is that the short type can store -32768 but the largest positive value you can accumulate is 32767. For all values except SHRT_MIN, the test above would work fine. There are ways around this dilemma. One is to store the curval as a negative value during the calculation, and to negate it before returning if it should be positive. Then your testing might be:

if (curval < SHRT_MIN / 10 || (curval == SHRT_MIN / 10 && digit > -(SHRT_MIN / 10))
    ...overflow would occur...
else
    curval = curval * 10 - digit; // Note subtraction!

You still have to check that curval is not SHRT_MIN before negating the negative value. (In theory, you should check that -SHRT_MAX != SHRT_MIN to allow for other systems than two's complement binary; the values I quoted for the limits satisfy that condition.)

The same thinking and techniques apply to INT_MAX, INT32_MAX, INT64_MAX and INTMAX_MAX, of course.

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
7

(I looked at the possible duplicate to this question and I'm uncertain how useful that reading will be, since in an interview context, you'd be expected describe the general ideas / approach)


Part of answering interview questions is to ask clarifying questions (one of the important checklist items that interviewers expects from you).

So your line of inquiry should at least include things like:

  • Are you allowed to use unsigned long long / uint64_t (64 bit int)?
  • Where is the string coming from? stdin / read from file / hard-coded as a string literal?
  • Is it given to you as a Decimal representation? Binary / Octal / Hex?
  • What kind of error should be thrown? Do we try to recover and continue execution? What's the fail-safe default value?

Assuming that your interviewer tells you that it's in Decimal representation + you can use unsigned long long:

  • First check the number of digits: 32-bit unsigned int ranges between 0 to 4,294,967,295. If the number's string representation in decimal has more than 10 characters (has more than 10 digits), it already overflows; raise error.
  • Next, save the number into an unsigned long long and divide by UINT32_MAX+1=4294967296. If the result is greater than 0, it overflows; raise error.
Community
  • 1
  • 1
sampson-chen
  • 45,805
  • 12
  • 84
  • 81
  • 2
    Why would you count characters in the string, *and* do division instead of a simple comparison? That's pretty complicated. Also, better to be explicit and use types like `uint64_t` than `unsigned long long`. – Brian Campbell Dec 01 '12 at 17:16
  • 1
    @BrianCampbell Checking the characters handles the case where the number would also overflow the unsigned long. – PerpetualStudent Apr 24 '21 at 14:09
0

Do the conversion using a uint64_t. You are going to add one digit at a time; for each digit, you will multiply by 10 and add the value of the next character. After processing each character, check to see if the result is greater than UINT32_MAX; if it is, exit with an error (or whatever you're supposed to do on overflow). Once you're done, return the number as a uint32_t. uint64_t, uint32_t and UINT32_MAX are defined in stdint.h.

Brian Campbell
  • 322,767
  • 57
  • 360
  • 340
  • what if the number is `UINT32_MAX + 1`? ;) it'd overflow the 64-bit int without a check first. – sampson-chen Dec 01 '12 at 17:04
  • 2
    @sampson-chen: How would UINT32_MAX + 1 overflow a 64-bit int? – Jonathan Leffler Dec 01 '12 at 17:05
  • @JonathanLeffler derp, I meant `UINT64_MAX` instead of `UINT_32`, lack of sleep (and coffee) : – sampson-chen Dec 01 '12 at 17:06
  • @sampson-chen The point is, you check after processing every character. You are not going to overflow a `uint64_t` by taking a number less than `UINT32_MAX`, multiplying by 10, and adding a number between 0 and 9 to it. – Brian Campbell Dec 01 '12 at 17:12
  • @BrianCampbell ah, I see what you mean now after looking at Jonathan's answer. – sampson-chen Dec 01 '12 at 17:20
  • @effeffe I'm not sure I understand your objection. The question asked about a 32 bit integer; this approach will work fine for a 32 bit integer, as there is plenty of extra space in a `uint64_t`. This is the simple, easy answer; Jonathan Leffler's is the best if you can't use a something larger than a 32 bit number for intermediate storage. – Brian Campbell Dec 01 '12 at 17:22
  • @BrianCampbell The problem with `uint64_t` is that it need not be provided (it ususally is, but on hardware with 36-bit words, it wouldn't). `uint_least64_t` or `unsigned long long` are required by the standard. – Daniel Fischer Dec 01 '12 at 22:21