-3

I have a really simple program:

#include <cstdio>
#include <cstdlib>
#include <iostream>

int main()
{
  char* num = new char[5];
  sprintf(num, "65536");
  std::cout << "atoi(num): " << atoi(num) << "\n";
}

The maximum value of an unsigned int is 65535. Why doesn't this program overflow when I run atoi("65536")?

too honest for this site
  • Did you run your program through a memory debugger like Valgrind or ASAN? – Kerrek SB Jan 20 '16 at 00:31
  • 6
    note that "65536" won't fit into a char[5]. you need to allocate space for the null character – bruceg Jan 20 '16 at 00:33
  • 5
    Maybe your assumption of the maximum size of an unsigned int is wrong? Check `std::numeric_limits<unsigned int>::max()`. Also note you have undefined behaviour because `num` is not null-terminated. – juanchopanza Jan 20 '16 at 00:33
  • @juanchopanza, so it looks like the reference I was using was wrong. The max unsigned int would be `2147483647`. However, `atoi()` still doesn't break under these conditions. What concept am I missing? –  Jan 20 '16 at 00:38
  • @n0pe It isn't wrong or right. `unsigned int` has to be at least 16 bits. That's all. You have to check what it is on your platform. But it is *extremely* likely to be 32 bits. – juanchopanza Jan 20 '16 at 00:39
  • Why do you guys keep talking about `unsigned int` when `atoi` is used? – Rostislav Jan 20 '16 at 00:40
  • Is this what you want to see? http://rextester.com/ERIGHP50494 – Rostislav Jan 20 '16 at 00:42
  • @Rostislav is the size of int and of unsigned int different on the same system? Also, the question was about unsigned int, not about atoi ;) – Serge Jan 20 '16 at 00:46
  • @Rostislav: That is undefined behaviour on most systems (including the smaller ones). So it proves nothing. – too honest for this site Jan 20 '16 at 00:47
  • @Olaf Yep, I know, just wanted to understand what the OP actually expected :) – Rostislav Jan 20 '16 at 00:52
  • @juanchopanza: Many platforms I use have 16 bit `int`. IIRC, one even had non-standard 8 bit `int`. – too honest for this site Jan 20 '16 at 00:52
  • @Serge While the size is the same, the range of values is different, which is kinda important in this case, but given that the question is kind of underspecified, it's hard to tell. – Rostislav Jan 20 '16 at 00:53
  • @Rostislav: I'm actually not sure OP knows himself. I'm still waiting for clarification about `MAX_INT` (or whatever C++ provides). – too honest for this site Jan 20 '16 at 00:53
  • @Olaf Yeah. Well, if I'd write an answer it would certainly include a note about UB (IN HUGE LETTERS :)), but these are comments so I felt it was somewhat out of place. – Rostislav Jan 20 '16 at 00:55
  • @n0pe: `UINT_MAX == 2147483647` is very unlikely, as that is 31 1-bits. It might be `INT_MAX`, though. – too honest for this site Jan 20 '16 at 01:11
  • @Rostislav, yes, I agree, of course. Anyway, the statement that `int` type in C and C++ is `at least 16 bits wide` is true, and all questions you and @Olaf put are legit – Serge Jan 20 '16 at 01:19
  • `atoi` is dumb as a post. It returns 0 on utterly failed conversions (converting "the" for example), halts at the first invalid character and doesn't tell you where ("10the" returns 10), and gleefully reads past the max value of `int` without warning ("5000000000" may return 705032704). Don't use it unless you've already strenuously tested the input. Use the likes of [`strtol`](http://en.cppreference.com/w/c/string/byte/strtol) (and test for proper size and proper end pointer location) or [`std::stoi`](http://en.cppreference.com/w/cpp/string/basic_string/stol) (and handle the thrown exceptions for bad input). – user4581301 Jan 20 '16 at 01:30

1 Answer

5

On current PC systems, an int is usually 32 bits or even 64 bits (except on some smaller platforms such as Arduino).

So on your system an int (or unsigned int) is probably larger than 16 bits, and 65536 does not overflow. You can easily check its size in bytes with:

std::cout << sizeof(int) << "\n";

Also, there is no space in num for the null-terminator:

char* num = new char[5];
sprintf(num, "65536");

So sprintf() will write a terminating \0 one past your buffer, causing undefined behavior:

There is no way to limit the number of characters written, which means that code using sprintf is susceptible to buffer overruns.

This should be changed to:

char* num = new char[6];
Danny_ds
  • 1
    An `int` is usually 16 bits, not 32. If you count CPU cores which use it, the number with smaller `int` outnumbers the ones with 32 bit `int` by orders of magnitude. And there are also `int` with 64 bits or - more exotic nowadays - 24 or 18 bits. The standard does not define a specific bit-size for int or the other integer types, but only a minimum range. – too honest for this site Jan 20 '16 at 00:39
  • Not sure what it is in C++, but in Java an int is 32-bit – Ibukun Muyide Jan 20 '16 at 00:42
  • 1
    And there is no need to terminate `num` before calling `sprintf`. The problem is not missing termination, but not enough entries being allocated (as you apparently noticed). `sprintf` will very well write a terminator - just out of bounds. – too honest for this site Jan 20 '16 at 00:43
  • 1
    @Ibukun: Maybe there is a reason they are supposed to be different languages. Same syntax/grammar does not imply same semantics. – too honest for this site Jan 20 '16 at 00:44
  • Btw, integer overflow is undefined behaviour. So it may very well show strange results. But - acknowledged - this is not likely. – too honest for this site Jan 20 '16 at 00:46
  • @Olaf - _" sprintf will very well write a terminator - just out of bounds."_ - Yes, I've seen that in the debugger, I meant no \0 in the array, I'll rephrase it. – Danny_ds Jan 20 '16 at 00:46
  • Actually, for 16 bit `int`, `32768` already will overflow, thus invoke UB. – too honest for this site Jan 20 '16 at 00:51
  • Happily removed the DV. Maybe you can still clarify that writing beyond array boundaries is UB. One can never emphasise that enough :-( – too honest for this site Jan 20 '16 at 00:57
  • @NathanOliver Is it not UB? [standardese quote](http://stackoverflow.com/a/16188846/3589890) – Rostislav Jan 20 '16 at 00:59
  • UV for adding the notes about UB and `sprintf`. I don't work much with these functions. On embedded they are useless and on PC I use Python. – too honest for this site Jan 20 '16 at 01:05
  • @olaf my mistake. I had it confused with bit shifting – NathanOliver Jan 20 '16 at 01:05
  • @NathanOliver: Not sure about C++, but in C, left-shift also has potential for UB. Only certain right-shifts are IB (strangely enough. I tend to treat both as UB, thus my common advice is just not to shift signed integers). – too honest for this site Jan 20 '16 at 01:08
  • @Olaf - Thanks - actually I also don't use sprintf much. (Oh, and I love those 16 bit systems, just don't have enough time to _work_ with them). – Danny_ds Jan 20 '16 at 01:13
  • @Danny_ds: While I like some of the smaller 8 or 16 bitters, I'm quite happy with the Cortex-M 32 bitters right now. Not much use in the smaller ones unless you need very low power. OTOH, once hobby becomes job, there are always ups and downs. – too honest for this site Jan 20 '16 at 01:15