2

The standard library function fgets() has two downsides:

  • The second parameter of the function is of type int
  • It leaves a trailing new-line character in the supplied buffer

I have made a simple function similar to fgets() excluding the above-mentioned downsides, to try to improve the efficiency of one of my programs that fetches the lines of a text file and terminates the char array at the new-line character using the function strcspn().

But is it really more efficient? Is there any reason why the standard library function has advantage over the following naive implemention?

#include <stdio.h>


char *my_fgets(char *buf, size_t maxCount, FILE *stream);


int main(int argc, char **argv)
{
    if (argc < 2)
    {
        fprintf(stderr, "Usage: %s [filename]", argv[0]);
    }

    FILE *fp;
    fp = fopen(argv[1], "r");

    if (!fp)
    {
        perror(argv[1]);
        return 1;
    }

    char buf[256];
    /*while (fgets(buf, sizeof(buf), fp))
    {
        buf[strcspn(buf, "\n")] = '\0';
        // . . .
        puts(buf);
    }*/

    while (my_fgets(buf, sizeof(buf) - 1, fp))
    {
        puts(buf);
    }

    return 0;
}


char *my_fgets(char *buf,
    size_t maxCount, FILE *stream)
{
    int ch;
    size_t n = 0;

    while ((ch = fgetc(stream)) != EOF)
    {
        if (ch == '\n' || n == maxCount)
        {
            break;
        }
        else
        {
            buf[n++] = ch;
        }
    }
    if (n == 0 && ch == EOF)
    {
        return NULL;
    }
    buf[n] = '\0';
    return buf;
}
machine_1
  • 4,266
  • 2
  • 21
  • 42
  • 4
    This question seems like it might be better suited to [Code Review](https://codereview.stackexchange.com/) – ad absurdum Feb 24 '18 at 11:38
  • @DavidBowling This question is not about improving code. It is about the reasons why my function might be less efficient than the original function. – machine_1 Feb 24 '18 at 11:45
  • One way to answer this question is to dump the assembly of both fgets, and compare or profile (or time) both on sufficiently large input. – David C. Rankin Feb 24 '18 at 11:50
  • 1
    I think that your question [would be on topic there](https://codereview.stackexchange.com/help/on-topic): "open-ended feedback in the following areas: Best practices and design pattern usage, Security issues, Performance, Correctness in unanticipated cases" – ad absurdum Feb 24 '18 at 11:51
  • 3
    One reason it might be less efficient is that the standard library version might make use of lower level OS specific I/O functions, rather than using `fgetc()`, and may be implemented using hand-crafted assembler tuned for performance on the target system. The standard only specifies the net effect of `fgets()` - it doesn't require that `fgets()` be implemented using `fgetc()` or even that it be written in C. – Peter Feb 24 '18 at 11:53
  • You can also just check -- presuming you are using `gcc`, then [iofgets.c](https://sourceware.org/git/?p=glibc.git;a=blob;f=libio/iofgets.c;h=68177dbe22d3a3463291c3f25cf7188cc9ad4af1;hb=refs/heads/master). If you are using a closed-source compiler -- you are out of luck. – David C. Rankin Feb 24 '18 at 11:59
  • Coincidently yesterday a bounty was added to [fgets() Alternative](https://codereview.stackexchange.com/q/179201/29485) at [CR](https://codereview.stackexchange.com/) – chux - Reinstate Monica Feb 24 '18 at 12:05
  • 3
    Your function is error prone: It writes `maxCount + 1` bytes to the buffer, unlike `fgets`. If the `maxCount` limit is reached, it drops characters: If the stream contains `hello` and you call `my_fgets(buf, 1, fp)`, it stores `"h"` in `buf` and discards `'e'`, leaving `llo` in the stream. And there is no way to distinguish a complete line (an actual end-of-line (`\n`) was reached) from a partial line (`my_fgets` stopped because `maxCount` characters were read). – melpomene Feb 24 '18 at 12:08
  • @chux I pass `sizeof(buf) - 1` to the function. – machine_1 Feb 24 '18 at 12:11
  • `my_fgets()` differs from `fgets()` when an _input error_ occurs. This code does not always return `EOF`. – chux - Reinstate Monica Feb 24 '18 at 12:12
  • @melpomene Does `fgets()` distinguish between complete lines and partial lines? – machine_1 Feb 24 '18 at 12:22
  • @chux I think your name is about as descriptive as it could be. No doubt it will read the entire line, and only store up to the desired number of chars in that one string. No chance of multiple reads with a fixed buffer of less than the length of line. Handy little fellow. – David C. Rankin Feb 24 '18 at 12:22
  • 1
    @machine_1 1) passing `sizeof(buf) - 1` will create application errors given this function is "similar to fgets()". 2) A need for a "subtract 1" for a `size_t` argument incurs risks/work when the size of the allocated `buf` is 0. 3) Other C library functions do not follow this precedent. Not `sprintf(), strncpy(), strftime()`.... – chux - Reinstate Monica Feb 24 '18 at 12:23
  • 3
    @machine_1 - yes, you either have a trailing `'\n'` or you have read the MAX number of characters specified and there is no `'\n'` at the end of your buffer. – David C. Rankin Feb 24 '18 at 12:23
  • @DavidC.Rankin Well, I said it was naive implementation, didn't I? :-) – machine_1 Feb 24 '18 at 12:35

2 Answers2

1

Performance

Your use of fgetc() to fetch characters one by one guarantees that my_fgets() cannot compete with a sensibly optimized solution. To get good performance, you need to read some characters into a buffer via the read() syscall, and then consume the data directly from that buffer. Performing a full function call for every single byte is a lot of overhead.

Safety

fgets() is not safe for general usage: It will split long lines. This condition is generally unanticipated (your example code does not deal with this either, for instance), and will generally lead to wrong behavior when very long input lines are supplied to your program. The presence of the trailing \n could actually help you dealing with that problem. If you want safe software, you must explicitly handle the insufficient buffer case when using fgets(). Fail to do so, and you definitely have a bug waiting to strike. Your implementation does not even attempt to address this problem.

That is why I would strongly suggest using input functions that can deal with input lines of any lengths. On POSIX-compliant systems that would be getline(): This function allocates a sufficiently large buffer for you, so your only limit is available RAM. If getline() is not available on your platform, I would recommend to reimplement its functionality instead of a "somewhat better" version of fgets().

cmaster - reinstate monica
  • 38,891
  • 9
  • 62
  • 106
0

is it really more efficient?

No - not more efficient in performance. my_fgets(), at best, could have similar efficiency, yet even that is unlikely as fgets() has access to crafted assembly code and my_fgets() does not @Peter.

Yes, it may have a coding efficiency if the function goal is to get a line of user input and not save the '\n' as this is less to code than fgets() ... code_to_rid_\n()


Yet efficiency is secondary to functionality.

my_fgets() has minor trouble with input errors.

my_fgets() can drop characters, unlike fgets() @melpomene

The role of size_t maxCount differs by 1 from the similar argument of fgets().


Good try as the standard library fgets() has various weaknesses obliging coders to try for a better low level input line function.

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • Fixed minor errors, if you'd like to check it out. https://ideone.com/LXdyZh, by the way, `fgets()` doesn't return NULL if size 1 is passed, which may cause a while loop to loop indefinitely. – machine_1 Feb 24 '18 at 13:15
  • 1
    @machine_1 A good place for code review is [CR](https://codereview.stackexchange.com). See [How to get the best value out of Code Review](https://codereview.meta.stackexchange.com/questions/2436/how-to-get-the-best-value-out-of-code-review-asking-questions) `fgets()` should not typically return `NULL` when 1 is passed. With `fgets()` and `n <2`, results are [troublesome](https://stackoverflow.com/questions/23388620/is-fgets-returning-null-with-a-short-buffer-compliant). – chux - Reinstate Monica Feb 24 '18 at 13:20