1

My program takes in files with arbitrarily long lines. Since I don't know how many characters will be on a line, I would like to print the whole line to stdout without malloc-ing an array to store it. Is this possible?

I am aware that it's possible to print these lines one chunk at a time. However, the function doing the printing would be called very often, and I wish to avoid the overhead of malloc-ing an array to hold the output on every single call.

Alex
  • I would recommend looking at the source code of various `cat` implementations to see how they `concat` / `write` the contents of a file onto the screen. – Felix Guo Jul 19 '17 at 22:41
  • 1
    `char buff[BUFSIZ]; while(fgets(buff, sizeof buff, fp)) fputs(buff, stdout);` ? – BLUEPIXY Jul 19 '17 at 22:41
  • @BLUEPIXY That's the point: in my post I stated that I wish to avoid creating char buff[BUFSIZ] in every single call, because this function gets called a lot, and creating and removing an array on the stack is very expensive. – Alex Jul 19 '17 at 22:43
  • 2
    I doubt it's very expensive to create an array on the stack, but if that's your concern, make the array static or global. – Retired Ninja Jul 19 '17 at 22:44
  • 1
    You only need to create an array once for each file you print out. Is it really that much of an overhead? – r3mainer Jul 19 '17 at 22:45
  • 1
    @Alex allocate and deallocate are not done every call. – BLUEPIXY Jul 19 '17 at 22:46
  • @squeamishossifrage No-- once for every call to the function that does the printing-- or am I missing something? – Alex Jul 19 '17 at 22:46
  • 2
    Yes, if you create the array inside a function it's allocated on the stack each time you call the function. The allocation is basically free though, the stack pointer will be moved whether you have the array inside the function or not, so adding it just changes how far the pointer moves. – Retired Ninja Jul 19 '17 at 22:48
  • @RetiredNinja Thanks! Both of your comments would work as valid solutions to my problem. I was not aware exactly how static variables worked; and, I did not consider that stack allocation is just moving the stack pointer further. – Alex Jul 19 '17 at 22:51
  • You can't not store *anything* in memory, but you don't have to store a whole line in memory at a time. – user253751 Jul 19 '17 at 22:52
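
For reference, the static-buffer suggestion from the comments could be sketched roughly as follows. This is only an illustration under assumptions: `print_line` is a made-up name, the chunk size is simply `BUFSIZ`, and the function is not thread-safe because the buffer is shared between calls.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copies one line from `in` to `out` in fixed-size chunks.
 * The buffer is static, so it exists once for the whole program
 * instead of being set up on every call.
 * Returns 1 if a line (or a final unterminated fragment) was copied,
 * 0 on end of file or read error.
 * `print_line` is a hypothetical name used for illustration. */
int print_line(FILE *in, FILE *out)
{
    static char buf[BUFSIZ];
    int copied = 0;

    assert(in != NULL && out != NULL);

    while (fgets(buf, sizeof buf, in) != NULL) {
        size_t len = strlen(buf);
        copied = 1;
        fputs(buf, out);
        /* A trailing newline means the whole line has now been written. */
        if (len > 0 && buf[len - 1] == '\n')
            break;
    }
    return copied;
}
```

Calling `print_line(fp, stdout)` in a loop until it returns 0 prints the file line by line, and a line longer than the buffer is simply written in several pieces.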

2 Answers

1

First of all, you can't print something that doesn't exist, which means you have to store it somewhere, either on the stack or on the heap. If you use FILE*, libc does the buffering for you automatically.

Now if you use FILE*, you can use getc to read one character at a time, push it to stdout, and check whether it is a newline character.

If you're using a file descriptor, you can read one character at a time and do exactly the same thing.

Neither approach requires you to explicitly allocate memory on the heap.
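
A minimal sketch of the getc variant, assuming a stream that is already open; `copy_line` is a hypothetical name, not something from libc:

```c
#include <assert.h>
#include <stdio.h>

/* Copies characters from `in` to `out` up to and including the next
 * newline, one character at a time. No line buffer is declared here;
 * the only buffering is whatever the stdio streams do internally.
 * Returns the number of characters copied (0 at end of file).
 * `copy_line` is a hypothetical name used for illustration. */
size_t copy_line(FILE *in, FILE *out)
{
    size_t n = 0;
    int ch;   /* int, not char, so that EOF can be told apart from data */

    assert(in != NULL && out != NULL);

    while ((ch = getc(in)) != EOF) {
        putc(ch, out);
        n++;
        if (ch == '\n')
            break;
    }
    return n;
}
```

The file-descriptor version has the same shape, but a one-byte `read` per character costs a system call each time, so the stdio version is usually the cheaper of the two.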

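Now if you use mmap, you can run a strtok-family function over the mapping and then print each string to stdout. Keep in mind that strtok writes into the buffer it scans and expects a null-terminated string, so a plain read-only mapping will not work as-is.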
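
A rough POSIX-only sketch of the mmap idea. Note that it swaps strtok for memchr, so the mapping can stay read-only and does not need a terminating null byte. `print_lines_mmap` is a hypothetical name, and error handling is kept minimal.

```c
#define _POSIX_C_SOURCE 200809L  /* expose the POSIX declarations used below */

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Maps `path` read-only and writes it to `out` one line at a time.
 * Returns the number of lines written, or -1 on error.
 * `print_lines_mmap` is a hypothetical name used for illustration. */
long print_lines_mmap(const char *path, FILE *out)
{
    assert(path != NULL && out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {       /* mmap rejects zero-length mappings */
        close(fd);
        return 0;
    }

    size_t size = (size_t)st.st_size;
    char *data = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                   /* the mapping stays valid after close */
    if (data == MAP_FAILED)
        return -1;

    long lines = 0;
    const char *p = data;
    const char *end = data + size;
    while (p < end) {
        /* Find the end of the current line without modifying the mapping. */
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        size_t len = nl ? (size_t)(nl - p) + 1 : (size_t)(end - p);
        fwrite(p, 1, len, out);
        p += len;
        lines++;
    }

    munmap(data, size);
    return lines;
}
```

Each line is handed to `fwrite` straight from the mapping, so no line buffer is allocated in user code regardless of how long the line is.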

Yuji
  • The suggestion of mmap deserves the +1. – Alex Jul 19 '17 at 22:52
  • 2
    @Alex At least on Linux, mmapping the file for reading is a pessimization if the file is smaller than 16KiB. Check out my benchmark: https://stackoverflow.com/a/39196499/1084774 – Petr Skocik Jul 19 '17 at 23:02
  • 1
    @PSkocik Sounds about right. mmap is pretty expensive, in general. – Alex Jul 19 '17 at 23:03
0

takes in files with arbitrarily long lines ... print the whole line to stdout, without malloc-ing an array to store it. Is this possible?

In general, for arbitrarily long lines: no.

A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. C11dr §7.21.2 2

The length of a line is not limited to SIZE_MAX, the size of the largest array possible in C; it can even exceed the memory capacity of the computer. There is simply no way to hold an arbitrarily long line in memory. Simple code could use the following. I doubt it will be satisfactory, yet it does print the entire contents of a file using scant memory.

// Reads one character at a time.
int ch;
while((ch = fgetc(fp)) != EOF) {
  putchar(ch);
}

Instead, code should set a sane upper bound on line length and create or allocate an array of that size for the line. As useful as unbounded line lengths are, they are also open to abuse: malicious input can consume unrestrained resources.

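// Requires <stdio.h>, <stdlib.h> and <string.h>.
#define LINE_LENGTH_MAX 100000
char *line = malloc(LINE_LENGTH_MAX + 1);
if (line) {
  while (fgets(line, LINE_LENGTH_MAX + 1, fp)) {
    // A completely filled buffer means the line may exceed the limit.
    if (strlen(line) >= LINE_LENGTH_MAX) {
      Handle_Possible_Attack();  // placeholder for the caller's policy
    }
    foo(line);  // Use line
  }
  free(line);
}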
chux - Reinstate Monica