-2

Why is it

char* itoa(int value, char* str, int base);

instead of

char (*itoa(int value, char (*str)[sizeof(int) * CHAR_BIT + 1], int base))[sizeof(int) * CHAR_BIT + 1];

which would protect programmers from buffer overruns?
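For instance, a hypothetical array-pointer variant (sketched here with an assumed base-10-only body purely for illustration) turns a wrong-sized buffer into a compile-time error instead of a run-time overrun:

```c
#include <limits.h>
#include <stdio.h>

#define INT_N (sizeof(int) * CHAR_BIT + 1)

/* Hypothetical sketch, not a real library function: the array-pointer
   parameter means a pointer to an array of any other size is rejected
   by the compiler. */
char (*itoa_arr(int value, char (*str)[INT_N], int base))[INT_N]
{
    (void)base;                          /* sketch: base 10 only */
    snprintf(*str, sizeof *str, "%d", value);
    return str;
}

/* char buf[INT_N];  itoa_arr(42, &buf, 10);    -- compiles           */
/* char small[8];    itoa_arr(42, &small, 10);  -- compile-time error */
```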

Vlad from Moscow
  • 301,070
  • 26
  • 186
  • 335
  • 6
    Because it's *your* job to protect from buffer overruns in C – Mad Physicist Aug 26 '19 at 12:26
  • 5
    And what if the buffer passed to the function *isn't* of size `sizeof(int) * CHAR_BIT + 1`? What if it's a dynamically allocated pointer as returned by `malloc`? What if you want to pass the returned pointer to another function that only wants pointers to null-terminated strings? – Some programmer dude Aug 26 '19 at 12:26
  • 1
    It's quite different from the obsolete `gets` function where there was *nothing* the programmer could do to protect against buffer overruns. – Weather Vane Aug 26 '19 at 12:27
  • @MadPhysicist no there are many things in C helping you avoid buffer overruns, such as all the _s functions – Artikash-Reinstate Monica Aug 26 '19 at 12:31
  • 4
    @Artikash those are mostly Microsoft extensions to C and it's a matter of opinion whether they make the code any safer or not, partly because users frequently mistake how they should be used. And they are non-portable. – Weather Vane Aug 26 '19 at 12:34
  • @Someprogrammerdude then it's a security risk and shouldn't be passed. The pointer from malloc should be cast to a `char (*)[sizeof(int) * CHAR_BIT + 1]` if that's its use. If you have `void foo(char* null_terminated)`, call `foo(*itoa(value, buffer, base))` and the array will decay – Artikash-Reinstate Monica Aug 26 '19 at 12:36
  • @Artikash it will come at the expense of performance. – Tony Tannous Aug 26 '19 at 12:45
  • 2
    @Artikash. At this point, your question is opinion based. There are low level languages that have a philosophy aligned with what you would like to see. C just isn't one of them. – Mad Physicist Aug 26 '19 at 12:46
  • 1
    According to your point of view, basically half of the C standard functions are "declared unsafely". – Marco Bonelli Aug 26 '19 at 13:06
  • A very simple way to find out the number of digits in a number: `strlen(itoa(some_number, some_buffer_of_sufficient_size, 10))`. If the buffer has room for 12 or more characters, then there's nothing unsafe here. As usual, C gives you a very powerful gun; it's your responsibility as a programmer not to shoot yourself in the foot with it. – Some programmer dude Aug 26 '19 at 13:13
  • @WeatherVane Since the C11 bounds-checking interface, `_s` is nowadays pretty much an indication that a function is _unsafe_... though arguably this was always the case. – Lundin Aug 26 '19 at 13:53
  • `itoa` is not a Standard function, so it's difficult to discuss it definitively. But an `itoa` that's defined to accept a buffer for the result, without allowing the caller to specify the size of that buffer, is faulty, pure and simple. The fault doesn't end up being *quite* as bad as for `gets`, but it's still a fault. Why does it exist? Because people are careless, and/or because early C had a propensity for this sort of "cowboy style" programming. – Steve Summit Aug 26 '19 at 15:53

2 Answers

1

Why is itoa declared unsafely?

Note that itoa() is not a standard C library function and implementations/signatures vary.

itoa(int value, char* str, int base) is a building-block function some libraries provide. It is meant to be efficient when used by a knowledgeable coder who ensures adequate buffer space.

Yet itoa(int value, char* str, int base) lacks a buffer-size safeguard. It is easy to miscalculate the worst case, as OP did.

A suggested alternative is an itoa(char *dest, size_t size, int a, int base) implementation that gives up a little efficiency in exchange for buffer-size checking. Note: the caller should check the return value.
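A minimal sketch of that idea, assuming the proposed signature (the body and the `itoa_sized` name are illustrative, not a reference implementation):

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Sketch of the size-checked interface.  Returns dest on success,
   NULL if the buffer cannot hold the result (the caller must check). */
char *itoa_sized(char *dest, size_t size, int a, int base) {
    if (dest == NULL || base < 2 || base > 36) return NULL;
    char tmp[sizeof(int) * CHAR_BIT + 2];    /* worst case: base 2, sign, NUL */
    char *p = tmp + sizeof tmp;
    *--p = '\0';
    int neg = a < 0;
    int v = neg ? a : -a;                    /* negative magnitude: INT_MIN safe */
    do {
        *--p = "0123456789abcdefghijklmnopqrstuvwxyz"[-(v % base)];
        v /= base;
    } while (v != 0);
    if (neg) *--p = '-';
    size_t need = (size_t)(tmp + sizeof tmp - p);
    if (need > size) return NULL;            /* refuse instead of overrunning */
    memcpy(dest, p, need);
    return dest;
}
```

Working with the negative magnitude rather than `abs()` avoids undefined behavior on `INT_MIN`, whose positive counterpart does not fit in an `int`.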


Care must be taken with trying to predict maximum string needs.

Consider below as suggested by OP:

#define INT_N (sizeof(int) * CHAR_BIT + 1)
char (*itoa(int value, char (*str)[INT_N], int base))[INT_N]

Conversion of a signed 32-bit INT_MIN to base 2 is

"-10000000000000000000000000000000"

INT_N is 33, yet the buffer size needed is 34.
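The arithmetic can be checked with a small helper (a sketch; `itoa_need` is a made-up name) that counts sign, digits, and the terminating NUL:

```c
#include <limits.h>
#include <stddef.h>

#define INT_N (sizeof(int) * CHAR_BIT + 1)

/* Characters needed to store value in the given base,
   counting the sign and the terminating NUL. */
size_t itoa_need(int value, int base) {
    size_t n = 1;                 /* terminating NUL */
    if (value < 0) n++;           /* '-' sign */
    do { n++; value /= base; } while (value != 0);
    return n;
}

/* With 32-bit int: itoa_need(INT_MIN, 2) is 34, yet INT_N is only 33. */
```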


chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • Deeper: IMO, a reason for the "unsafe" `itoa(int value, char* str, int base)` is that buffer management, while left to the caller, is problematic to push into the function. My alternate `itoa()` prevents buffer overrun and UB, yet the caller _still_ must check the return value. To require that a worst-case buffer always be provided (e.g. 34) makes `itoa()` usage cumbersome when the calling code knows a smaller buffer will suffice. C is fast and lean. Lean-ness comes with a price. – chux - Reinstate Monica Aug 26 '19 at 13:29
  • The safe variant would be `asprintf(&dest, "%d", value);`. Or a similar `fprintf()` with a `FILE*` that came out of an `open_memstream()` call. Unless you are forced to support non POSIX.1-2008 compliant systems, there's little excuse to rely on the unsafe string manipulation functions, and even then it's generally easier to provide your own `asprintf()` implementation relying on `vsnprintf()` than to try to get every buffer allocation right. – cmaster - reinstate monica Aug 26 '19 at 13:33
  • @cmaster Rather than comment with an alternative answer here, posting your `asprintf()` would allow others to vote on it. – chux - Reinstate Monica Aug 26 '19 at 14:20
  • My comment is not an answer as it does not contain any information about the *"why?"* part of the question. When I write an answer, I try to actually answer the question. Sometimes successfully... – cmaster - reinstate monica Aug 26 '19 at 14:45
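A sketch of the `asprintf()` approach from the comments above (POSIX.1-2008/GNU, not ISO C; the `int_to_str` wrapper name is invented here): the function allocates a buffer of exactly the right size, so no length can be miscalculated.

```c
#define _GNU_SOURCE      /* asprintf is a POSIX/GNU extension */
#include <stdio.h>
#include <stdlib.h>

/* Returns a freshly allocated decimal string, or NULL on
   allocation failure.  The caller owns (and frees) the result. */
char *int_to_str(int value) {
    char *dest = NULL;
    if (asprintf(&dest, "%d", value) < 0)
        return NULL;
    return dest;
}
```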
1

itoa isn't standard, so any library can declare it as it pleases. It is not very meaningful to ask for a rationale about functions that aren't standardized in the first place.

And those functions that are standardized mostly got into the standard by chance. They basically took every function available in Unix, wrote its name on a paper, tossed them in a hat and drew one hundred or so, whimsically. And so we got a bunch of diverse functions of varied quality and usefulness. I'd say the majority of them are either unsafe or bad style. No rationale exists.


As for the specific case:

The reason why fixed array pointers weren't used is obviously that most library functions in C, standard or not, work on null-terminated strings of variable length. If some function behaved differently, it would stand out. At the point when C was launched, Unix was apparently trying to move away from fixed-length strings to null-terminated strings.

Furthermore, the rules about pointer conversions were pretty much non-existent initially, so it probably wouldn't have added any safety to use array pointers at the point when all these functions were cooked up, back in the 1970s. There wasn't even a void pointer type.

Regarding buffer overruns and other error controls: when writing C, you can either place error controls in the function or the caller. The most common practice in C is to leave error handling to the caller, which is perfectly fine as long as it is documented. For this there exists a rationale, namely "the spirit of C", which is to always put performance first. Error handling costs performance. In many use-cases the caller knows the nature of the data in advance, making such error controls superfluous.
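As a concrete illustration of caller-side error control (a sketch; the `to_decimal` helper is invented here), the standard `snprintf` reports the length it needed, and the caller decides what to do on truncation:

```c
#include <stdio.h>

/* Writes value as decimal into dest of the given size.
   Returns 1 if it fit, 0 if the output was truncated. */
int to_decimal(char *dest, size_t size, int value) {
    int need = snprintf(dest, size, "%d", value);
    return need >= 0 && (size_t)need < size;
}
```

When the caller already knows the value range (say, a loop counter below 1000), it can skip the check entirely, which is exactly the trade-off described above.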

Lundin
  • 195,001
  • 40
  • 254
  • 396