14

Functions such as strncpy(), malloc(), strlen() and various others take arguments or return values of type size_t rather than int or unsigned int, for obvious reasons.

Some file functions such as fread() and fwrite() use size_t as well. By extension, one would expect char *fgets(char *str, int num, FILE *stream) to take a size_t rather than an int as the argument for its buffer size.

However, fgets() uses an int. Is there any objective explanation why?

Sinan Ünür
  • 12
    There is no consistency in C standard library. Functions have evolved over time. `size_t` might not have existed when `fgets` was introduced (just my guess). – user694733 Aug 02 '16 at 12:00
  • 1
    https://groups.google.com/forum/#!topic/comp.std.c/FnENEIlCidg – artm Aug 02 '16 at 12:01
  • 2
    Always dazed to see this kind of questions so upvoted. I agree with @P.P.: it is OT. – LPs Aug 02 '16 at 12:16
  • 1
    @P.P. So asking a question about why a function deviates from a certain pattern seen in others is "opinion-based"? Where am I looking for people's opinions? –  Aug 02 '16 at 12:16
  • 3
    @user2064000 If there is very unlikely to be any real reason and citation - as seems to be the case here - then people will just guess wildly based on their own imaginations, which gets us nowhere. That doesn't necessarily mean you're being implicated as _asking for_ that, but that's what's likely to result from the question in reality. – underscore_d Aug 02 '16 at 12:17
  • 6
    It's not opinion oriented: K&R defined `fgets()` on p.155 with an `int` argument. Their code would have worked with an `unsigned int` as well. `size_t` got introduced later, in C89 (ANSI C), as the type of `sizeof()`. So memory management functions were updated. But file I/O wasn't: the only file functions that use `size_t` are those that were introduced by C89 and did not exist in K&R (example: fread()/fwrite(): K&R used only unix read/write on file descriptors for block operations, no fread/fwrite at that time) – Christophe Aug 02 '16 at 13:09
  • @Christophe Fantastic! I'd suggest posting that as a canonical answer - along with reopening the thread to suit. I had tried to find chronological origins of the two but hadn't gotten very far. (i.e. I don't have K&R and wasn't sure whether `size_t` existed before C89.) – underscore_d Aug 02 '16 at 13:30
  • @underscore_d ok, if you think it can help. I've nominated the question for reopening but a couple of other people have to do so as well. – Christophe Aug 02 '16 at 13:36

1 Answer

11

The original K&R defined fgets() on p.155 with an int argument. The code presented in the book would have worked with an unsigned int as well (its loop condition tests --n > 0, but the loop is written so that n never goes below zero).
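
For illustration, the loop described above can be sketched roughly as follows. This is a paraphrase in the style of the book's code, not a verbatim copy: the name kr_fgets is hypothetical (chosen so it does not clash with the real fgets()), and c is initialized here as an assumption so that the final test is well defined even when n is 1 or less.

```c
#include <stdio.h>

/* Sketch of a K&R-style fgets(); kr_fgets is a hypothetical name. */
char *kr_fgets(char *s, int n, FILE *iop)
{
    int c = 0;      /* assumption: initialized so the final test is defined */
    char *cs = s;

    /* Read at most n-1 characters, stopping after a newline or at EOF. */
    while (--n > 0 && (c = getc(iop)) != EOF)
        if ((*cs++ = (char)c) == '\n')
            break;
    *cs = '\0';     /* always terminate, even if nothing was read */
    return (c == EOF && cs == s) ? NULL : s;
}
```

With a signed n, a zero or negative size fails the --n > 0 test immediately, so the function reads nothing and stores only the terminator.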

size_t was introduced later, in C89 (ANSI C), as the type of the result of sizeof. As it was introduced specifically to harmonize memory allocation, the memory management functions and string functions were updated accordingly. But file I/O wasn't: the only file functions that used size_t in C89 are the new ones that C89 introduced and that did not exist in K&R, such as fread()/fwrite(). Yes, K&R didn't have these functions and relied, for block operations, only on the (non-portable) Unix read/write functions using file descriptors.

It should be noted that the POSIX standard, which harmonized the Unix functions, was developed in parallel with the ANSI C standard and issued in late 1988. It changed many Unix functions to use size_t, so that read()/write() are nowadays defined with size_t. But for C standard library functions such as fgets(), POSIX gives precedence to the C standard (wording of the current version of the standard):

The functionality described on this reference page is aligned with the ISO C standard. Any conflict between the requirements described here and the ISO C standard is unintentional.

So in POSIX too, ironically, fgets() still inherits its historical K&R int.
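
In practice the mismatch only shows when a buffer size is held in a size_t that may exceed INT_MAX. A minimal sketch of how a caller might bridge the two types is shown below; fgets_sz is a hypothetical helper, not part of the C standard or POSIX, and clamping to INT_MAX is just one possible policy.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper: accepts a size_t buffer size and forwards it
   to fgets(), which still expects an int. */
char *fgets_sz(char *buf, size_t size, FILE *stream)
{
    if (size == 0)
        return NULL;            /* no room even for the terminator */

    /* Clamp instead of letting an oversized value convert to a
       negative or otherwise unexpected int. */
    int n = size > (size_t)INT_MAX ? INT_MAX : (int)size;
    return fgets(buf, n, stream);
}
```

A call such as fgets_sz(line, sizeof line, stdin) then works with the size_t that sizeof yields, without a cast at each call site.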


Edit: additional reading

stdio.h: This header defines and prototypes most of the functions listed in Chapter 7 of K&R. Few, if any, changes were made in the definitions found in K&R but several new functions have been added.

Christophe
  • Do you have any idea whether any notable pre-C89 implementations of fgets behaved predictably when passed a negative value [e.g. predictably treating it like zero, predictably doing nothing, reading -N bytes of data but omitting the newline, etc.]? If so, changing to size_t could have forbidden implementations from supporting that feature unless the length was restricted to SIZE_T_MAX-UINT_MAX [from what I can tell, passing a length larger than the buffer size is defined behavior provided a newline is successfully read before the end of the buffer]. – supercat Aug 02 '16 at 15:56
  • 1
    K&R's implementation dates from 1978. As the parameter is a signed integer and the first looping condition is `--n>0`, a negative value would make it stop immediately and put the null terminator in the buffer. So completely predictable. If the parameter were unsigned, the same code could produce a buffer overflow, but would stop after at most UINT_MAX iterations (unless the hardware trapped the flipping of the sign bit). There could have been other implementations as well: pre-89 there were plenty of compiler makers, most of which are no longer known today ;-). – Christophe Aug 02 '16 at 16:16
  • @supercat I could find back my former Lattice C compiler of 1986 ([acquired by Microsoft](https://en.wikipedia.org/wiki/Lattice_C)) with source of its standard library. `fgets()` is implemented with `for(i=0; i – Christophe Aug 02 '16 at 16:26
  • 4
    If the value were larger than the size of the buffer, but a newline was received before a buffer overrun, would behavior be defined as reading the data up to the newline? If so, that would mean that if the `size` were changed to a `size_t`, then `fgets(buff, -1, file)` would have a *defined* behavior different from pre-existing behaviors. The authors of the Standard had no objection to labeling things that were defined various ways in different implementations as Undefined Behavior, since they didn't think that would discourage anyone from continuing to support... – supercat Aug 02 '16 at 16:51
  • 3
    ...the kinds of useful behaviors that were already in use [such expectation held throughout the twentieth century, but has since come under attack], but were loath to define behaviors which any implementation might define in contrary fashion. – supercat Aug 02 '16 at 16:56