I want to mmap stdin. But I can't mmap stdin. So I have to call read and realloc in a loop, and I want to optimize it by choosing a good buffer size. Calling fstat on file descriptor 0 gives a struct stat with a member named st_size, which seems to represent the number of bytes that are already in the pipe buffer corresponding to stdin.
Depending on how I call my program, st_size varies between 0 and the full pipe buffer size of stdin, which is about 65520. For example, in this program:
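#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    struct stat buf;
    int i;

    /* Poll the size reported for stdin 20 times, 10 microseconds apart. */
    for (i = 0; i < 20; i++)
    {
        fstat(0, &buf);
        /* st_size is an off_t, so cast it to match the %lld format. */
        printf("%lld\n", (long long)buf.st_size);
        usleep(10);
    }
    return 0;
}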
We can observe the buffer is still being filled:
$ cc fstat.c
$ for (( i=0; i<10000; i++ )) do echo "0123456789" ; done | ./a.out
9196
9306
9350
9394
9427
9471
9515
9559
...
And the output of this program changes every time I re-run it.
I'd like to use st_size for the initial buffer size so that I have to make fewer calls to realloc.
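A minimal sketch of the loop I have in mind, treating st_size only as a starting hint. The helper name read_all, the 4096-byte fallback, and the doubling growth factor are arbitrary choices for illustration, not anything prescribed:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read everything from fd into a malloc'd buffer.
 * Returns the buffer (the caller frees it) and stores the byte count in
 * *out_len; returns NULL on error. */
char *read_all(int fd, size_t *out_len)
{
    struct stat st;
    size_t cap = 4096;  /* arbitrary fallback when st_size is 0 or unknown */
    size_t len = 0;
    char *buf;

    assert(out_len != NULL);

    /* st_size is only a hint: for a pipe it may be 0 or already stale. */
    if (fstat(fd, &st) == 0 && st.st_size > 0)
        cap = (size_t)st.st_size + 1;  /* +1 so EOF is seen without growing */

    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        ssize_t n;

        if (len == cap) {
            /* Double the capacity so the number of reallocs stays small. */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }

        n = read(fd, buf + len, cap - len);
        if (n < 0) {
            if (errno == EINTR)
                continue;  /* interrupted by a signal: retry the read */
            free(buf);
            return NULL;
        }
        if (n == 0)
            break;  /* end of input */
        len += (size_t)n;
    }

    *out_len = len;
    return buf;
}
```

Calling read_all(0, &len) from main would read all of stdin. Because the capacity doubles, a wrong initial guess costs only a few extra realloc calls, so the exact value of st_size matters less than it first seems.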
I have three questions, most important one first:
Can I 'flush' stdin, i.e. wait until the buffer is no longer being filled?
Could I do better? Maybe there is another way to optimize this, to make the realloc in a loop feel less dirty.
Can you confirm I can't use mmap with stdin?
A few details:
I wanted to use mmap because I already wrote a program, for educational purposes, that does what nm does, using mmap. I'm wondering how nm handles stdin.
I know that st_size can be 0, in which case I will have no choice but to define a buffer size.
I know stdin can give more bytes than the pipe buffer size.
I want to optimize for educational purposes. There is no pressing need for it.
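On the mmap question, my working assumption (which I would like confirmed) is that mmap can succeed on fd 0 only when stdin is redirected from a regular file, as in ./a.out < file, and not when it is a pipe or a terminal. A sketch of that check follows; the helper name try_map_fd is hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Hypothetical helper: try to map the whole of fd read-only.
 * Returns the mapping and stores its length in *out_len, or returns NULL
 * when fd is not a non-empty regular file or when mmap itself fails. */
const void *try_map_fd(int fd, size_t *out_len)
{
    struct stat st;
    void *p;

    assert(out_len != NULL);

    if (fstat(fd, &st) != 0)
        return NULL;

    /* Pipes and terminals are not regular files, so skip mmap for them. */
    if (!S_ISREG(st.st_mode) || st.st_size <= 0)
        return NULL;

    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        return NULL;

    *out_len = (size_t)st.st_size;
    return p;
}
```

If try_map_fd(0, &len) returns NULL, the program would fall back to the read and realloc loop; otherwise the mapping should eventually be released with munmap.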
Thank you