First of all, my goal is optimization: I want the fastest possible execution time.

I would like to read data from standard input in C (on Linux). Here is my code:

int main(void) {
    char command_str[MAX_COMMAND_SIZE];

    while (!feof(stdin)) {
        fgets(command_str, MAX_COMMAND_SIZE, stdin);
        // Parse data
    }
    return EXIT_SUCCESS;
}

According to the post "Read a line of input faster than fgets?", the read() function seems to be the solution.

The data input is like:

100 C
1884231 B
8978456 Z
...

The input comes from a file, so I run my program like this: ./myapp < mytext.txt

It is not possible to know in advance how many entries there are; there could be 10, 10000 or even more.

From this post

Drop all the casts on malloc and realloc; they aren't necessary and clutter up the code

So I think my app will be slower if I use a dynamic array.

The idea is:

  • Read the whole input in one go into a buffer.

  • Process the lines from that buffer.

  • That's the fastest possible solution.

I would appreciate it if someone could help me. Thanks in advance.

  • Please, before you go on, read [Why is “while ( !feof (file) )” always wrong?](https://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong) –  Jun 24 '17 at 14:47
  • Yes, reading into memory is probably (test it?) faster if the file size is less than the available memory. However, why do you think you have to optimize `fgets` when there is a good chance that parsing introduces a bigger delay? Is performance not good enough? Are you already done with everything else, have you profiled your code, and is `fgets` what is slowing it down? – Adriano Repetti Jun 24 '17 at 14:49
  • And [`mmap()`](https://linux.die.net/man/2/mmap) might be *even faster*. Still, I second @AdrianoRepetti here -- *did* you identify a performance problem? If not, this is *premature optimization* you should avoid. –  Jun 24 '17 at 14:55
  • @AdrianoRepetti I would like to get all the stdin input data in one go, without a loop calling fgets(). Do you think that's possible? Thank you – mastermassi Jun 26 '17 at 09:01
  • Yes, of course. The lowest-level thing you can use (while remaining compatible with pipes and terminals) is simply `fread()` (using a big enough buffer and handling multiple reads). Unfortunately you do not know (and can't easily determine) the file size for piped input, so you need to `realloc()` if the input is bigger than your buffer. Honestly, there is a real chance of getting it wrong, and the **gain** compared to plain `fgets()` is **risible** (if any). Let me be pedantic: ARE YOU SURE THE "BOTTLENECK" of your app is `fgets()`? (Note that I/O probably is, but it won't vanish because you drop `fgets()`.) – Adriano Repetti Jun 26 '17 at 09:37
  • Because there is a good chance that your app will see a bigger performance gain if you invest 1 hour of your time in faster parsing. Side note: if you accept a file name instead of input from `stdin`, then it's easy to determine the file size. – Adriano Repetti Jun 26 '17 at 09:38
  • @mastermassi: you can accept one of the answers by clicking on the grey checkmark below its score – chqrlie Jul 04 '17 at 16:05

2 Answers

while (!feof(f)) is always wrong. Use this instead:

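#include <stdio.h>
#include <stdlib.h>

#define MAX_COMMAND_SIZE 256  /* assumed value; the question does not show its definition */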

int main(void) {
    char command_str[MAX_COMMAND_SIZE];

    while (fgets(command_str, MAX_COMMAND_SIZE, stdin)) {
        // Parse data
    }
    return EXIT_SUCCESS;
}

Reading file contents faster than fgets() is feasible, but seems beyond your skill level. Learn the simple stuff first. There is an awful lot that can be achieved with standard line by line readers... Very few use cases warrant the use of more advanced approaches.

If you want to read the whole input and parse it as a single string, here is a generic solution that should work for all (finite) input types:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    size_t pos = 0, size = 1025, nread;
    char *buf0 = malloc(size);
    char *buf = buf0;

    for (;;) {
        if (buf == NULL) {
            fprintf(stderr, "not enough memory for %zu bytes\n", size);
            free(buf0);
            exit(1);
        }
        nread = fread(buf + pos, 1, size - pos - 1, stdin);
        if (nread == 0)
            break;

        pos += nread;
        /* Grow the buffer size exponentially (by 1.625x, close to the golden ratio) */
        if (size - pos < size / 2)
            size += size / 2 + size / 8;
        buf = realloc(buf0 = buf, size);
    }
    buf[pos] = '\0';

    // parse pos bytes of data in buf as a string
    printf("read %zu bytes\n", strlen(buf));        

    free(buf);
    return EXIT_SUCCESS;
}
chqrlie
  • First, thank you for the answer. I would like to get the stdin input only once (all the data at one time, for optimization). Is it possible without looping over fgets()? – mastermassi Jun 26 '17 at 08:57
  • You should first write the program with this simple approach. Optimizing the I/O is probably not worth the effort. If you want to read the whole input at once, it is feasible only for regular files read in binary mode. Getting the whole input from the terminal or a pipe requires incremental reading. I shall post an alternate version that does that. – chqrlie Jun 26 '17 at 12:25

Maybe you could use fseek(stdin, 0, SEEK_END) to go to the end of the standard input stream, then use ftell(stdin) to get its size in bytes, then allocate memory to hold all of it in a buffer, and then process its contents.

NAD
  • `memcpy` copies memory, it does NOT read from a stream (like `stdin`). –  Jun 24 '17 at 15:03
  • Seeking to the end would not work for streams attached to the terminal or coming from a pipe. For the same reason, `mmap` is not a generic solution either. – chqrlie Jun 24 '17 at 15:09