I want to read everything that is on stdin after 10 seconds and then break. The code I've been able to write so far is:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> /* declares sleep() */

int main() {
  sleep(10);
  char c;
  while (1) { // My goal is to modify this while statement to break after it has read everything.
    c = getchar();
    putchar(c);
  }
  printf("Everything has been read from stdin");
}

So when the letter "c" is entered before the 10 seconds have elapsed, it should print "c" (after sleep is done) and then "Everything has been read from stdin".

So far I have tried:

  • Checking if c is EOF -> getchar and similar functions never return EOF for stdin
  • Using a stat-type function on stdin -> stat-ing stdin always returns 0 for size (st_size).
gurkensaas
  • *Checking if c is EOF -> `getchar` and similar functions never return `EOF` for `stdin`* That's because `getchar()` returns `int`, not `char`. Cramming the returned value into a `char` removes the ability to detect `EOF`. You need to change `char c;` to `int c;`. – Andrew Henle Nov 20 '22 at 12:17
  • @AndrewHenle Changing `char c;` to `int c;` and `while (1) {` to `while ((c = getchar()) != EOF) {` has not solved the problem for me. – gurkensaas Nov 20 '22 at 12:22
  • @AndrewHenle To clarify, I can now do `echo "hello world" | ./myprogram` and then it prints "hello world" and then "Everything has been read from stdin" but reading from `stdin` this way rather than user-input during the `sleep` period is not my goal. – gurkensaas Nov 20 '22 at 12:28
  • @user3121023 I know that the terminal is usually buffered. My question is, if I unbuffer it or press enter, how do I know there is nothing more to read? – gurkensaas Nov 20 '22 at 13:22
  • @user3121023 I would prefer a `termios` approach. Would you mind providing an example in an answer? – gurkensaas Nov 20 '22 at 14:06
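
The `int` versus `char` point raised in the comments above can be illustrated with a minimal sketch. The helper name `copy_until_eof` is hypothetical and is not code from the question. It only shows why the variable must be an `int`: with a `char`, the byte 0xFF is either mistaken for `EOF` or `EOF` is never matched, depending on whether `char` is signed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies every byte from `in` to `out` and returns
 * the number of bytes copied. `c` is an int so that EOF (a negative
 * value) stays distinct from every valid byte, including 0xFF. */
static long copy_until_eof(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);
    long count = 0;
    int c;
    while ((c = getc(in)) != EOF) {
        putc(c, out);
        count++;
    }
    return count;
}
```

Called as `copy_until_eof(stdin, stdout)`, this ends when the input really reaches end-of-file (a closed pipe, or the terminal's EOF character). It does not by itself distinguish "nothing typed yet" from "nothing more to come", which is the part of the question the answers address.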

2 Answers

You can use the select function to wait to see if there is something to read on stdin with a timeout that starts at 10 seconds. When it detects something, you read a character and check for errors or EOF. If all is good, then you call select again, reducing the timeout by the elapsed time so far.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>

struct timeval tdiff(struct timeval t2, struct timeval t1)
{
    struct timeval result;

    result.tv_sec = t2.tv_sec - t1.tv_sec;
    result.tv_usec = t2.tv_usec - t1.tv_usec;
    while (result.tv_usec < 0) {
        result.tv_usec += 1000000;
        result.tv_sec--;
    }
    return result;
}

int cmptimestamp(struct timeval t1, struct timeval t2)
{
    if (t1.tv_sec > t2.tv_sec) {
        return 1;
    } else if (t1.tv_sec < t2.tv_sec) {
        return -1;
    } else if (t1.tv_usec > t2.tv_usec) {
        return 1;
    } else if (t1.tv_usec < t2.tv_usec) {
        return -1;
    } else {
        return 0;
    }
}

int main()
{
    struct timeval cur, end, delay;
    int rval, len = 0;
    fd_set fds;

    gettimeofday(&cur, NULL);
    end = cur;
    end.tv_sec += 10;

    if (fcntl(0, F_SETFL, O_NONBLOCK) == -1) {
        perror("fcntl failed");
        exit(1);
    }
    do {
        /* select() overwrites the set, so rebuild it on every pass */
        FD_ZERO(&fds);
        FD_SET(0, &fds);
        delay = tdiff(end, cur);
        rval = select(1, &fds, NULL, NULL, &delay);
        if (rval == -1) {
            perror("select failed");
        } else if (rval) {
            char c;
            len = read(0, &c, 1);
            if (len == -1) {
                perror("read failed");
            } else if (len > 0) {
                printf("c=%c (%d)\n", c, c);
            } else {
                printf("EOF\n");
            }   
        } else {
            printf("timeout\n");
        }
        gettimeofday(&cur, NULL);
    } while (rval > 0 && len > 0 && cmptimestamp(end,cur) > 0);

    return 0;
}

Note that this doesn't detect the keys as you press them, only after you've either pressed RETURN or stdin is closed.
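
The fixed `end.tv_sec += 10` deadline in the program above can be generalized to a fractional number of seconds. The following is a minimal sketch under that assumption; `deadline_after` is a hypothetical helper, not part of the program above, and it assumes a non-negative delay.

```c
#include <assert.h>
#include <sys/time.h>

/* Hypothetical helper: returns `start` advanced by `seconds` (which may
 * be fractional), keeping tv_usec in the range 0..999999. */
static struct timeval deadline_after(struct timeval start, double seconds)
{
    assert(seconds >= 0.0);
    struct timeval end = start;
    long whole = (long)seconds;
    /* Round the fractional part to the nearest microsecond. */
    long usec = (long)((seconds - (double)whole) * 1000000.0 + 0.5);

    end.tv_sec += whole;
    end.tv_usec += usec;
    if (end.tv_usec >= 1000000) {   /* carry into the seconds field */
        end.tv_usec -= 1000000;
        end.tv_sec++;
    }
    return end;
}
```

In `main()`, `end = deadline_after(cur, 2.5);` would then take the place of the two lines that compute `end`; the rest of the loop, including `tdiff()` and `cmptimestamp()`, would not need to change.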

dbush
  • This works great, I have some questions however: 1) How would this code look if the delay was separated into a value like `delayInSeconds` that maybe even accepted non-integer values? 2) What about non-blocking stdin? When I unblock my stdin, the characters get printed but the program never halts/"timeout" never gets printed. – gurkensaas Nov 24 '22 at 20:08
  • I think it's the right direction. however: 1 - You probably want to `break;` on error. 2 - nothing in the question says we can assume that after 10 seconds, is closed and we reached EOF - You also probably want to use reopen stdin with `O_NONBLOCK` `read(2)` instead of `getchar()` to avoid any stdio buffering effects. – root Nov 24 '22 at 20:12
  • @gurkensaas for a variable delay, put the seconds part in `delay.tv_sec` and the microseconds part in `delay.tv_usec`, and add the same amounts to `end` (you'll need to check if `end.tv_usec` is greater than 1000000 and if so subtract that amount and add 1 to `end.tv_sec`). If for unblocked stdin you mean having the input piped in, it will just read everything and stop when the end is reached. – dbush Nov 25 '22 at 03:14
  • @root The `rval > 0` check takes care of both a timeout and an error from `select`, and the `c != 1` check takes care of errors from `getchar`. – dbush Nov 25 '22 at 03:16
  • No, they do not. – root Nov 25 '22 at 04:41
  • If the input doesn't contain `\n`, then at some point `getchar()` will block past the 10 seconds. Or, if stdin is continuously written to (forever), the `while` loop might not terminate after 10 seconds or at all. If a context switch during `printf()` occurs during the 10 second mark, the next iteration of the `do` will pass a negative `delay` and print an error. If there's something in the stdio buffer after `\n` it will not be printed on the next iteration if `select()` hits an error/timeout. – root Nov 25 '22 at 22:42
  • @root ok, I hadn't considered the possible endless read on stdin, nor that streams other than a piped file or a terminal (i.e. a socket or some other process) could stop sending data without an EOF. Updated to set stdin to non-blocking and removed the inner loop, and also added a check for time expired before calling `select`. – dbush Nov 26 '22 at 00:41

Here's an offering that meets my interpretation of your requirements:

  • The program reads whatever data is typed (or otherwise entered) on standard input in a period of 10 seconds (stopping if you manage to enter 2047 characters — which would probably mean that the input is coming from a file or a pipe).
  • After 10 seconds, it prints whatever it has collected.
  • The alarm() call sets an alarm for an integral number of seconds hence, and the system generates a SIGALRM signal when the time is up. The alarm signal interrupts the read() system call, even if no data has been read.
  • If the program receives one of SIGINT, SIGQUIT, SIGHUP, SIGPIPE, or SIGTERM, it stops without printing anything.
  • It changes the terminal settings so that the input is unbuffered (non-canonical), so read() returns without waiting for a newline. It also ensures that system calls do not restart after a signal is received. That may not matter on Linux, but with signal() on macOS Big Sur 11.7.1 the input continued after the alarm signal, which was not helpful; sigaction() gives you better control.
  • It does its best to ensure that the terminal mode is restored on exit, but if you send an inappropriate signal (not one of those in the list above, or SIGALRM), you will have a terminal in non-canonical (raw) mode. That leads to confusion, in general.
  • It is easy to modify the program so that:
    • input is not echoed by the terminal driver;
    • characters are echoed by the program as they arrive (but beware of editing characters);
    • signals are not generated by the keyboard;
    • the standard input terminal attributes are left alone if standard input is not a terminal.
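
Several of the modifications listed above can be sketched together. This is an illustration under stated assumptions, not the code from this answer: `make_quiet_raw` and `enter_quiet_raw` are hypothetical names, and the choice of `VMIN = 1` and `VTIME = 0` is one reasonable setting rather than a requirement.

```c
#include <assert.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: derives non-canonical settings from `orig`
 * without touching the terminal itself. */
static struct termios make_quiet_raw(const struct termios *orig)
{
    assert(orig != NULL);
    struct termios copy = *orig;
    /* No line buffering, no echo, no keyboard-generated signals. */
    copy.c_lflag &= ~(tcflag_t)(ICANON | ECHO | ISIG);
    copy.c_cc[VMIN] = 1;    /* read() returns once one byte is available */
    copy.c_cc[VTIME] = 0;   /* no inter-byte timer */
    return copy;
}

/* Hypothetical helper: applies the settings only when fd is a terminal.
 * Returns 0 on success (original attributes stored in *saved),
 * 1 if fd is not a terminal, and -1 on error. */
static int enter_quiet_raw(int fd, struct termios *saved)
{
    if (!isatty(fd))
        return 1;
    if (tcgetattr(fd, saved) != 0)
        return -1;
    struct termios raw = make_quiet_raw(saved);
    return tcsetattr(fd, TCSANOW, &raw) == 0 ? 0 : -1;
}
```

With ISIG cleared, the interrupt and quit keys arrive as ordinary bytes instead of raising SIGINT or SIGQUIT, so restoring the saved attributes on every exit path becomes even more important. With ECHO cleared, the program has to echo characters itself if the user should see what is typed.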

Code

/* SO 7450-7966 */
#include <ctype.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <termios.h>
#include <unistd.h>

#undef sigemptyset      /* MacOS has a stupid macro that triggers -Wunused-value */

static struct termios sane;

static void stty_sane(void)
{
    tcsetattr(STDIN_FILENO, TCSANOW, &sane);
}

static void stty_raw(void)
{
    tcgetattr(STDIN_FILENO, &sane);
    struct termios copy = sane;
    copy.c_lflag &= ~ICANON;
    tcsetattr(STDIN_FILENO, TCSANOW, &copy);
}

static volatile sig_atomic_t alarm_recvd = 0;

static void alarm_handler(int signum)
{
    signal(signum, SIG_IGN);
    alarm_recvd = 1;
}

static void other_handler(int signum)
{
    signal(signum, SIG_IGN);
    stty_sane();
    exit(128 + signum);
}

static int getch(void)
{
    char c;
    if (read(STDIN_FILENO, &c, 1) == 1)
        return (unsigned char)c;
    return EOF;
}

static void set_handler(int signum, void (*handler)(int signum))
{
    struct sigaction sa = { 0 };
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;    /* No SA_RESTART! */
    if (sigaction(signum, &sa, NULL) != 0)
    {
        perror("sigaction");
        exit(EXIT_FAILURE);
    }
}

static void dump_string(const char *tag, const char *buffer)
{
    printf("\n%s [", tag);
    int c;
    while ((c = (unsigned char)*buffer++) != '\0')
    {
        if (isprint(c) || isspace(c))
            putchar(c);
        else
            printf("\\x%.2X", c);
    }
    printf("]\n");
}

int main(void)
{
    char buffer[2048];

    stty_raw();
    atexit(stty_sane);
    set_handler(SIGALRM, alarm_handler);
    set_handler(SIGHUP, other_handler);
    set_handler(SIGINT, other_handler);
    set_handler(SIGQUIT, other_handler);
    set_handler(SIGPIPE, other_handler);
    set_handler(SIGTERM, other_handler);
    alarm(10);

    size_t i = 0;
    int c;
    while (i < sizeof(buffer) - 1 && !alarm_recvd && (c = getch()) != EOF)
    {
        if (c == sane.c_cc[VEOF])
            break;
        if (c == sane.c_cc[VERASE])
        {
            if (i > 0)
                i--;
        }
        else
            buffer[i++] = c;
    }
    buffer[i] = '\0';

    dump_string("Data", buffer);
    return 0;
}

Compilation:

gcc -O3 -g -std=c11 -Wall -Wextra -Werror -Wmissing-prototypes -Wstrict-prototypes -fno-common tensec53.c -o tensec53 

No errors (or warnings, but warnings are converted to errors).

Analysis

  • The #undef line removes any macro definition of sigemptyset() leaving the compiler calling an actual function. The C standard requires this to work (§7.1.4 ¶1). On macOS, the macro is #define sigemptyset(set) (*(set) = 0, 0) and GCC complains, not unreasonably, about the "right-hand operand of comma expression has no effect". The alternative way of fixing that warning is to test the return value from sigemptyset(), but that's arguably sillier than the macro. (Yes, I'm disgruntled about this!)
  • The sane variable records the value of the terminal attributes when the program starts; it is set by calling tcgetattr() in stty_raw(). The code ensures that sane is set before activating any code that will call stty_sane().
  • The stty_sane() function resets the terminal attributes to the sane state that was in effect when the program started. It is used by atexit() and also by the signal handlers.
  • The stty_raw() function gets the original terminal attributes, makes a copy of them, modifies the copy to turn off canonical processing (see Canonical vs non-canonical terminal input for more details), and sets the revised terminal attributes.
  • Standard C says you can't do much more in a signal handler function than set a volatile sig_atomic_t variable, call signal() with the signal number, or call one of the exit functions. POSIX is a lot more gracious; see How to avoid using printf() in a signal handler? for more details.
  • There are two signal handlers, one for SIGALRM and one for the other signals that are trapped.
  • The alarm_handler() ignores further alarm signals and records that it was invoked.
  • The other_handler() ignores further signals of the same type, resets the terminal attributes to the sane state, and exits with a status used to report that a program was terminated by a signal (see POSIX shell Exit status for commands).
  • The getch() function reads a single character from standard input, mapping failures to EOF. The cast to unsigned char ensures that a successful return value is non-negative, as it is with getchar().
  • The set_handler() function uses sigaction() to set the signal handling. It ensures that the SA_RESTART bit is not set, so that when a signal interrupts a system call, the call returns with an error rather than restarting. Using signal() inside the signal handlers themselves is a little lazy, but adequate.
  • The dump_string() function writes out a string with any non-printable characters other than space characters reported as a hex escape.
  • The main() function sets up the terminal, ensures that the terminal state is reset on exit (atexit() and the calls to set_handler() with the other_handler argument), and sets an alarm for 10 seconds hence.
  • The reading loop avoids buffer overflows and stops when the alarm is received or EOF (error) is detected.
  • Because canonical processing is turned off, there is no line editing. The body of the loop provides primitive line editing — it recognizes the erase (usually backspace '\b', sometimes delete '\177') character and the EOF character and handles them appropriately, otherwise adding the input to the buffer.
  • When the loop exits, usually because the alarm went off, it null terminates the string and then calls dump_string() to print what was entered.
  • If you wanted sub-second intervals, you would need to use the POSIX timer_create(), timer_delete(), timer_settime() (and maybe timer_gettime() and timer_getoverrun()) functions, which take struct timespec values for the time values. If they're not available, you might use the obsolescent setitimer() and getitimer() functions instead. The timer_create() step allows you to specify which signal will be sent when the timer expires — unlike alarm() and setitimer() which both send pre-determined signals.

Jonathan Leffler
  • This code is available in my [SOQ](https://github.com/jleffler/soq) (Stack Overflow Questions) repository on GitHub as file `tensec53.c` in the [src/so-7450-7966](https://github.com/jleffler/soq/tree/master/src/so-7450-7966) sub-directory. – Jonathan Leffler Nov 25 '22 at 18:21
  • One thing I would add to the analysis: the presence of the alarm signal is what unblocks the `read()` call in `getch()`. I would also set `MIN` and/or `TIME`, rather than assume the default terminal special chars. – root Nov 25 '22 at 23:15
  • @root — thanks; I've added some emphasis on the role of `alarm()`, and also added information about sub-second timing for alarms with `timer_settime()` et al (or, using an obsolescent function, with `setitimer()`). – Jonathan Leffler Nov 26 '22 at 03:08