
When recently answering another question, I discovered a problem with code like:

int n;
scanf ("%d", &n);

With strtol, you can detect overflow because, in that case, the maximum value allowed is inserted into n and errno is set to indicate the overflow, as per C11 7.22.1.4 The strtol, strtoll, strtoul, and strtoull functions /8:

If the correct value is outside the range of representable values, LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is returned (according to the return type and sign of the value, if any), and the value of the macro ERANGE is stored in errno.
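For illustration, the strtol check described in that paragraph might look roughly like the sketch below. The helper name `parse_int` and its return codes are made up for this example; the extra comparison against INT_MIN and INT_MAX is there because long may be wider than int.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper, not a standard function.
   Returns 0 on success, -1 if no digits were found,
   -2 if the value does not fit into an int. */
int parse_int(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;                      /* strtol only sets errno on failure */
    value = strtol(text, &end, 10);

    if (end == text)
        return -1;                  /* nothing was converted */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return -2;                  /* outside long, or outside int */

    *out = (int)value;
    return 0;
}
```

Calling `parse_int("9999999999", &n)` on a system with a 32-bit int would report the overflow through the -2 return code rather than storing a wrapped or clamped value.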

However, in the sections of the standard dealing with scanf, specifically C11 7.21.6.2 The fscanf function /10, we see:

If this object does not have an appropriate type, or if the result of the conversion cannot be represented in the object, the behavior is undefined.

Now, to me, that means any value can be returned and there's no mention of errno being set to anything. This came to light because the asker of the linked question above was entering 9,999,999,999 into a 32-bit int and getting back 1,410,065,407, a value 2^33 too small, indicating it had simply wrapped around at the limit of the type.

When I tried it, I got back 2,147,483,647, the largest possible 32-bit signed value.
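The asker's figure is consistent with the input being reduced modulo 2^32: 9,999,999,999 minus 1,410,065,407 is 8,589,934,592, which is exactly 2^33. A small sketch of that arithmetic follows, using an unsigned conversion (which is well defined) rather than scanf itself; the function name `wrap_to_32_bits` is only illustrative.

```c
#include <stdint.h>

/* Illustrative only: reduce a value modulo 2^32.
   Conversion to an unsigned type is defined to be modular,
   unlike an out-of-range scanf conversion. */
uint32_t wrap_to_32_bits(uint64_t value)
{
    return (uint32_t)value;
}
```

With this, `wrap_to_32_bits(9999999999ULL)` gives 1410065407, matching the value the asker saw.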

So my question is as follows. How do you detect integral overflow in a portable way when using the scanf family of functions? Is it even possible?

Now I should mention that, on my system (Debian 7), errno is actually set to ERANGE in these circumstances but I can find nothing in the standard that mandates this. Additionally, the return value from scanf is 1, indicating success in scanning the item.

paxdiablo
  • I can see one reason why `errno` does not work well here: `scanf` can have various conversions. Which of them does `errno` apply to? (If setting `errno` were equivalent to conversion failure, that would be unambiguous, but apparently that's not the case, not even in `strtol`.) – M Oehm Jan 18 '15 at 06:15
  • what is your system? you say _on my system `errno` ..._ – Iharob Al Asimi Jan 18 '15 at 07:46
  • @iharob: Debian 7, I'll update the question but I'm not really after an implementation-specific thing, rather one that's "kosher" according to the standard. – paxdiablo Jan 18 '15 at 11:36
  • The FIRST thing you should do is check the return value from scanf: it returns the number of successful conversions. In your example, if (scanf (..) != 1), you've got an error. – Anonymouse Jan 18 '15 at 12:22
  • @Anonymouse: yes except, as stated in the last paragraph, it converts successfully even for the overflow case, it just doesn't give you the value entered. That's what I'm looking for, a portable solution where the return code indicates it succeeded and the errno is still zero, but the number overflowed. – paxdiablo Jan 18 '15 at 12:26
  • What is the point of digging into this matter? stdio.h, and thus scanf, have always been good enough only for very basic input. While scanf may be fine for the classroom, or for simple number crunching applications, you can not safely depend on it for anything beyond that. If you want to protect the user from typing wrong data, control or function sequences, you must consider writing your own library and fetch the data from the input buffer. – Costis Aivalis Jan 18 '15 at 18:03
  • Costis, yes, I've even written such things. But the _point_ is to get the question, which is not yet on SO, out there so that others may learn as well. It's certainly _possible_ that there may be a way, I don't use `scanf` enough to know for sure. But, even if there isn't (and that looks to be the case from the standard), that's still increasing the knowledge here. – paxdiablo Jan 19 '15 at 00:51
  • The very fact that someone _asked_ the question I linked to means that a question asking if it's possible to do it portably could be useful. – paxdiablo Jan 19 '15 at 00:56
  • @paxdiablo: Fair enough! It never hurts to dig and seek. I probably happen to be a little biased against stdio since back in the early 80's we had to spend days implementing alternate solutions for safe data input because of its deficiencies. – Costis Aivalis Jan 19 '15 at 20:02

1 Answer


The only portable way is to specify a field width, e.g. with "%4d" (guaranteed to fit even into a 16-bit int), or to build the format string at run time with a field width of (int)(log(INT_MAX) / log(10)). Of course, this also rejects values such as 32000, even though they would fit into a 16-bit int. So no, there is no satisfying portable way.
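A minimal sketch of the run-time format string idea is shown below. It is an assumption-laden illustration, not a recommended implementation: the names `safe_digits` and `scan_int_limited` are hypothetical, sscanf is used instead of scanf so the input is a string, and the digit count is computed with integer division instead of log() to avoid floating-point rounding.

```c
#include <limits.h>
#include <stdio.h>

/* Number of decimal digits that always fit into an int,
   i.e. floor(log10(INT_MAX)), computed with integer arithmetic. */
static int safe_digits(void)
{
    int digits = 0;
    int v;

    for (v = INT_MAX; v >= 10; v /= 10)
        digits++;
    return digits;
}

/* Hypothetical helper: scan one int using a width-limited "%Nd".
   Returns whatever sscanf returns (1 on a successful conversion). */
int scan_int_limited(const char *text, int *out)
{
    char format[32];

    /* Builds e.g. "%9d" for a 32-bit int, "%4d" for a 16-bit int. */
    snprintf(format, sizeof format, "%%%dd", safe_digits());
    return sscanf(text, format, out);
}
```

Note that the field width only limits how many characters are consumed. With a 32-bit int, the input 9999999999 is read as 999999999 and the last digit is left in the input, so the caller would still have to check for leftover digits to treat that as an error.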

POSIX doesn't specify anything more here, nor does it mention ERANGE.

This manpage mentions setting errno only in case EOF is returned; the glibc documentation doesn't mention ERANGE at all.

That leaves the question of what to suggest to beginners for reading integers, and there I have no idea. scanf has too many undefined and underspecified aspects to be really useful, fgets cannot be used in production code because you cannot handle null bytes properly, and portable error checking with strtol and friends takes more lines than implementing the functionality yourself (and is quite easy to get wrong). The behaviour of atoi is also undefined on integer overflow.
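As a rough idea of what "implementing the functionality yourself" could look like, here is a minimal sketch that parses a decimal string and checks for overflow before each step. The name `parse_decimal` and its return codes are invented for this example, and it assumes the whole string must be a number (no leading whitespace, no trailing characters).

```c
#include <limits.h>

/* Hypothetical helper: parse an optional sign followed by decimal digits.
   Returns 0 on success, -1 on malformed input, -2 on overflow. */
int parse_decimal(const char *s, int *out)
{
    int negative = 0;
    int value = 0;      /* accumulated as a non-positive number so that
                           INT_MIN can be represented along the way */

    if (*s == '+' || *s == '-')
        negative = (*s++ == '-');
    if (*s < '0' || *s > '9')
        return -1;                       /* no digits at all */

    for (; *s >= '0' && *s <= '9'; s++) {
        int digit = *s - '0';

        /* value * 10 - digit would drop below INT_MIN */
        if (value < (INT_MIN + digit) / 10)
            return -2;
        value = value * 10 - digit;
    }
    if (*s != '\0')
        return -1;                       /* trailing characters */

    if (!negative) {
        if (value < -INT_MAX)
            return -2;                   /* magnitude exceeds INT_MAX */
        value = -value;
    }
    *out = value;
    return 0;
}
```

Accumulating in the negative direction is a design choice that lets the most negative int be parsed without a special case, since its magnitude does not fit into a positive int on two's complement systems.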

mafso