EDIT
The solution to the problem was understanding what Ctrl-D was actually doing.
On a new empty line, a single Ctrl-D will signal EOF.
But if there are characters already in the line, the first Ctrl-D only hands those characters to fgetc(); it does not signal EOF, so my loop keeps waiting and nothing is written to stdout yet. With characters already in the buffer, a second Ctrl-D must be issued to signal EOF, and only then is the buffer written to stdout.
This can be demonstrated by redirecting output to a file.
EDIT

I'm using fgetc() to read input from stdin, looping until I receive an EOF. In the loop I build a string from the characters typed before Ctrl-D was pressed. But I can't figure out a way of exiting the loop, since the buffer that ch = fgetc() reads from does not contain the EOF. (The Ctrl-D only triggers fgetc() to return its first value.)

ungetc() does not allow pushing an EOF into the buffer, and pushing any other char runs the risk of confusion with real data, so I'm stuck! I've read a LOT of answers, but they either don't address this issue or don't apply to the use case I'm trying to implement.

I would like to be able to count, peek, etc on the stdin buffer.

I don't really want to read a whole line (or X chars at a time) because I'm processing each character as it arrives (edit) from fgetc().

Any suggestions on how to overcome this dilemma? (Without using NCurses)

I'm using Ubuntu, where EOF is Ctrl-D. Here is some code I'm working with:

This works, and does the same as Jonathan's simple example, but not what I want:

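#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {

    int inputChr;

    do {
        inputChr = fgetc(stdin);
        if (inputChr != EOF) {
            fputc( inputChr, stdout);
        }
        // EOF from fgetc() can mean a read error as well as end of input,
        // and feof() is not set on error, so test ferror() on its own
        if (ferror(stdin)) {
            perror(NULL);
            return errno;
        }
    } while (inputChr != EOF);
    return EXIT_SUCCESS;
}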

HOWEVER, this version, which tries to do what I want, gets stuck; (edit) it requires Ctrl-D a second time:

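char *buildLine (FILE *inputSource, char *currLine, int showTabs, int showNonPrint, int *haveLF) {

    int inputChr;
    char *thisLine = malloc(1);

    if (thisLine == NULL) {
        return NULL;
    }
    *thisLine = '\0';   // start with an empty, terminated string
    *haveLF = FALSE;
    while ( (inputChr = fgetc(inputSource)) != EOF ) {

        if (inputChr == LF) {
            *haveLF = TRUE;
        } else {
            char chrStr[2] = { (char) inputChr, '\0' };   // one-character string for strconcat()
            char *longer = strconcat(thisLine, chrStr);
            if (longer == NULL) {
                free(thisLine);
                return NULL;
            }
            thisLine = longer;
        }
    }
    if (ferror(inputSource)) {  // the loop also ends on a read error, so check here
        perror(NULL);
    }

    return thisLine;
}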

Some more code that's been asked about:

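char * strconcat ( char *str1, char * str2) {

    // Both arguments must be NUL-terminated strings; str1 must come from malloc().
    char *newStr = malloc(strlen(str1)+strlen(str2)+1);
    if (newStr == NULL) {
        return NULL;
    }
    strcpy(newStr,str1);
    strcat(newStr,str2);
    free(str1);     // newStr replaces the old string, so release it

    return newStr;
}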

THIS VERSION BELOW processes the input character by character and works just like cat. But I decided I would process each character into a line first, before applying some extra transforms I need to implement. This simplified the state-machine design, but maybe trying to build lines wasn't a good option (without using NCurses). :(

int echoInput( FILE *inputSource, FILE *outputDestination, int numbers, int showEnds) {

    int haveNewLine = TRUE;
    int lineNo = 1;
    int inputChr;

    do {
        inputChr = fgetc(inputSource);
        if (inputChr != EOF) {
            if (numbers && haveNewLine) {
                long lineNoSize = (long) log10(lineNo)+1;   // effectively floor(log10(lineNo)+1) = number digits
                char *lineNoStr =  (lineNoSize<6)?malloc(8):malloc(lineNoSize+2);   // If less than 6 digits, allow for minimum 6 plus tab.  Also +1 for terminator.
                if (lineNoStr == NULL) {
                    fprintf (stderr, "Error::Out of Memory\n");
                    return ENOMEM;
                }
                sprintf(lineNoStr,"%6d\t",lineNo);  // format lineNo string
                fputs(lineNoStr, outputDestination);    // send string to output
                free(lineNoStr);    // release the buffer so it is not leaked on every line
                lineNo++;
                haveNewLine = FALSE;
            }
            if (inputChr == LF) {
                if (showEnds) {
                    fputc('$', outputDestination);  // send char to output
                }
                haveNewLine = TRUE;
            }
            fputc( inputChr, outputDestination);
        }
        if (ferror(inputSource)) {  // feof() is not set on a read error, so test ferror() on its own
            perror(NULL);
            return errno;
        }
        if (ferror(outputDestination)) {
            perror(NULL);
            return errno;
        }
    } while (inputChr != EOF);
    return EXIT_SUCCESS;
}
gone
  • Which OS are you using? – user3386109 Mar 24 '14 at 05:29
  • Make sure `ch` is an `int` if you intend to check it for `EOF`. – M.M Mar 24 '14 at 05:30
  • Typically in Linux, pressing Ctrl-D after typing some characters causes the input stream to be flushed (so your program can start reading it) but does not end the input. Pressing it again, or pressing it straight after a newline, causes the stream to end. – M.M Mar 24 '14 at 05:31
  • You can't "process each character as it arrives" using Standard C streams, as they are line-buffered streams. There's no way to check if more characters are waiting, but not block if they aren't. You'll have to make non-standard system calls or use a library such as ncurses instead. Some OS's allow you to use `setvbuf()` for this. – M.M Mar 24 '14 at 05:36
  • @MattMcNabb: `setvbuf` is part of Standard C, and it will work perfectly if `stdin` is, for example, a FIFO. With Unix systems, at least, it won't work if `stdin` is a tty but that's neither required nor prohibited by the C standard; not because the stdin is line-buffered but because by default the tty device itself doesn't return anything to userland until an ENTER or certain other special character is typed. – rici Mar 24 '14 at 05:43
  • Maybe you need to show some code that demonstrates the problem you're having. It is not clear yet what your problem is. People have been writing programs to read one character at a time, or to read lines at a time, or to read arbitrary size chunks of data at a time, all without any particular problem. What are you trying to do that no-one else has ever done successfully before? – Jonathan Leffler Mar 24 '14 at 05:46
  • @MattMcNabb (and rici), or vice versa: the behaviour of the terminal driver and the behaviour of standard I/O with unbuffered input are at best tangentially related. Even if you use `setvbuf()` to unbuffer standard input, the terminal driver won't send any data to the standard I/O functions until you hit return (or Control-D). See [Canonical vs Non-canonical Terminal Input](http://stackoverflow.com/questions/358342/canonical-vs-non-canonical-terminal-input/) for a lot more detail. Once the terminal driver makes the data available, unbuffered standard I/O will read the line one byte at a time. – Jonathan Leffler Mar 24 '14 at 05:50
  • @JonathanLeffler: yes, that was my point although I might not have said it with the grammatical clarity you provided. – rici Mar 24 '14 at 05:56
  • @Nap, why did you move your `feof` to the first `if`? – motoku Mar 24 '14 at 05:59
  • If `inputChr == EOF` then `feof(inputSource)` is going to return true. It is generally best to test loops at the top; C makes it easy with the `while ((inputChr = fgetc(inputSource)) != EOF)`. How does your `strconcat()` routine know how much space it has to work with? Your initial allocation of 1 byte is not set to null, so it can't look at the characters in the buffer, and you don't tell it how long the buffer is, and you don't pass `&thisLine` to `strconcat()` so it cannot reallocate the space. It looks like your trouble is not the I/O but the string handling and memory management! – Jonathan Leffler Mar 24 '14 at 06:03
  • @Sean, tried that, and it doesn't quite work but displays a new behaviour. It runs through my loop (and the caller, not shown) until it consumes the buffer. But the problem is I can't find a means of detecting when the buffer is empty. It just pauses back at the inputChr=fgetc() line, ready for the next bunch of input. – gone Mar 24 '14 at 06:09
  • @Jonathan, strconcat handles all that. Added to the code above. – gone Mar 24 '14 at 06:10
  • OK; you need a `*thisLine = '\0';` after the `malloc()` in the calling code; otherwise, you are reading out of bounds. You're also leaking like a sieve; you need to `free(str1);` in `strconcat()`. Also, if memory allocation fails, your code is going to crash and burn because of reading through a NULL pointer. You should consider using `realloc()` instead of `malloc()` each time; it will be more cost-effective over time. – Jonathan Leffler Mar 24 '14 at 06:17
  • @Jon, fixed. The problem in OP is not solved though. I still need to press Ctrl-D a second time to get the input through to the calling loop. With Sean's suggestion, it doesn't require the second Ctrl-D, but the characters go through the loop one at a time. Not what I need. The code doesn't have a problem with real files, because there's only an EOF at the end. Arrrgggghhhh – gone Mar 24 '14 at 06:27
  • `strlen(str2)` is a problem as `str2` was not passed a string but `&inputChr`. Better to re-write as `char * strconcat ( char *str1, char char2)` and call with `strconcat(thisLine, inputChr)` – chux - Reinstate Monica Mar 24 '14 at 06:30
  • Note that under select situations `fgetc(inputSource);` will return `'\0'`. May want to account for that. – chux - Reinstate Monica Mar 24 '14 at 06:35
  • I can use ungetc() to push a '\0', I'm worried that it might corrupt the incoming data since it is possible for an input file to contain that value. Another option is to handle STDIN and real files using separate routines. But I'm not sure if my Lecturer will like that. – gone Mar 24 '14 at 06:37
  • @Nap you can try `fputc('\n', stdin); (void)fgetc(stdin);`? – motoku Mar 24 '14 at 07:50
  • @Sean, I get the first statement, but don't understand the 2nd. The fputc inserts a newline into the buffer, but what does the 2nd statement mean? – gone Mar 24 '14 at 07:59
  • I feel really bad, and stupid for wasting everyone's time. I just ran some tests of `cat` while redirecting its output to a file and realised that the text that it echoes to the screen (when you press Ctrl-D the first time) is not put into `stdout`. (As Matt pointed out at the start and Jonathan alluded to, but I didn't realise the significance of that part of his comment.) Thanks for all the help, and sorry for the confusion I've been causing! – gone Mar 24 '14 at 08:35

2 Answers

There must be other variations of this question with good enough answers, but here's one more.

The value returned by fgetc() (and getc() and getchar()) is an int and not a char. It has to be an int because the set of values that can be returned includes every possible character value (returned as an unsigned char converted to int, so never negative) and one extra value, EOF, which is negative. Although EOF is most commonly -1, you should never code to that assumption.

Two things can go wrong with:

char c;

while ((c = fgetc(stdin)) != EOF)

If the type char is signed, then some characters (usually 0xFF, often ÿ, y-umlaut, Unicode U+00FF, LATIN SMALL LETTER Y WITH DIAERESIS) will be misinterpreted as indicating EOF before the EOF is reached.

If the type char is unsigned, then you will never detect EOF because the value assigned to c will be 0xFF (positive), and that will never compare equal to EOF (a negative value).
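Both failure modes can be seen without reading any input, by storing the relevant value in each kind of char and comparing it with EOF. This is only a minimal sketch: the helper names are made up for illustration, and the signed case assumes the common situation where EOF is -1 and converting 0xFF to signed char yields -1 (neither is guaranteed by the standard).

```c
#include <stdio.h>

/* Hypothetical helper: does the byte 0xFF, stored in a signed char,
 * look like EOF? On typical platforms (EOF == -1, two's complement)
 * this returns 1, which is the "false EOF" failure. */
int signed_char_ff_looks_like_eof(void)
{
    signed char c = (signed char)0xFF;  /* implementation-defined, usually -1 */
    return c == EOF;
}

/* Hypothetical helper: can a real EOF, stored in an unsigned char,
 * still be detected? On ordinary platforms the stored value promotes
 * to a non-negative int, so this returns 0: EOF is never seen. */
int unsigned_char_can_hold_eof(void)
{
    unsigned char c = (unsigned char)EOF;
    return c == EOF;
}

/* The fix: keep the result of fgetc() in an int. */
int int_can_hold_eof(void)
{
    int c = EOF;
    return c == EOF;
}
```

A compiler may warn that the comparison in the unsigned case is always false, which is exactly the problem being illustrated.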

You're correct that you can't push EOF back onto the input stream with ungetc().
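If the goal is to peek at the next character, one character of lookahead is still possible as long as EOF itself is never pushed back. A minimal sketch, where `peek_char` is a hypothetical name and not a standard function:

```c
#include <stdio.h>

/* Hypothetical helper: return the next character on fp without
 * consuming it, or EOF if no character is available. */
int peek_char(FILE *fp)
{
    int c = getc(fp);
    if (c != EOF) {
        ungetc(c, fp);  /* one byte of pushback is guaranteed */
    }
    return c;
}
```

On a terminal this still blocks until the terminal driver hands over some input, so it peeks at data that has arrived; it cannot test whether more is waiting.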

Note that Control-D (or Control-Z on Windows) does not add a character to the input queue. Rather, it signals that there are no more characters available (slightly simplifying things), and that means the read() system call returns 0 bytes read, which means EOF.
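To watch this happen, it can help to bypass standard I/O and report what each read() call returns. The following is a rough sketch for POSIX systems; `report_reads` is a hypothetical helper name, and the 256-byte buffer size is arbitrary.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: read from fd until read() returns 0 (EOF) or
 * fails. Reports each return value on stderr and returns the total
 * number of bytes read, or -1 on error. */
long report_reads(int fd)
{
    char buf[256];
    long total = 0;
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        fprintf(stderr, "read() returned %ld byte(s)\n", (long)n);
        total += n;
    }
    if (n < 0) {
        perror("read");
        return -1;
    }
    fprintf(stderr, "read() returned 0: end of input\n");
    return total;
}
```

Calling `report_reads(STDIN_FILENO)` from `main()` on a terminal in its default mode, typing `abc` and then Control-D should report a 3-byte read; a second Control-D should then report the 0-byte read that standard I/O treats as EOF.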

A trivial program to copy standard input to standard output using getchar() and putchar() is:

int c;
while ((c = getchar()) != EOF)
    putchar(c);

You can adapt that to use fgetc() or getc() and fputc() or putc() if you wish to open files and read those. The key point is the use of an int to hold the value read.
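As a sketch of that adaptation, the same loop can be wrapped in a function that takes any pair of streams and also separates a read error from a genuine end of file. The name `copy_stream` and its return convention are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: copy everything from in to out.
 * Returns 0 on success, -1 if a read or write error occurred. */
int copy_stream(FILE *in, FILE *out)
{
    int c;  /* int, not char, so EOF stays distinguishable */

    while ((c = fgetc(in)) != EOF) {
        if (fputc(c, out) == EOF) {
            return -1;
        }
    }
    /* fgetc() returns EOF for errors too, so ask the stream which it was */
    return ferror(in) ? -1 : 0;
}
```

Calling `copy_stream(stdin, stdout)` behaves like the getchar()/putchar() loop above.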

Jonathan Leffler
  • I've probably read most of them, been looking for two days. But they don't address the issue I'm asking about. They are usually misunderstandings of the fact that fgetc() returns an int. – gone Mar 24 '14 at 05:37
  • If standard input is a terminal, you can't seek on it. Control-D is only special on a terminal. You can use `getc()` or a relative to read characters; you are guaranteed one byte of pushback with `ungetc()` — some systems provide only one byte, others provide as much pushback as you need. If you want to peek one character ahead, do so; just don't use `ungetc()` if the lookahead detected EOF. Repeated calls to `getc()` after it reports EOF will continue to report EOF unless you clear the error or do a seek operation. I'm not clear what your problem is, therefore. – Jonathan Leffler Mar 24 '14 at 05:43
  • Concerning `ungetc(EOF)`: "If the value of c equals that of the macro EOF, the operation fails and the input stream is unchanged." It is not that code cannot perform `ungetc(EOF, stream)`, it is that it has no effect. – chux - Reinstate Monica Mar 24 '14 at 06:05
  • @chux: you're quoting the standard, or the manual page you're quoting is quoting the standard — which means you're correct. However, the operation fails, so if you're testing whether the `ungetc()` works, you should not be pushing back EOF because that is guaranteed not to work. – Jonathan Leffler Mar 24 '14 at 06:09
  • The idea I was stressing (from C11 spec) was that `ungetc(EOF, stream)` is not undefined behavior. It is OK to perform but will do nothing. As you pointed out, `ungetc(EOF, stream)` will not _work_ in that the stream will not return EOF on the next `fgetc()` - unless the stream is in, or about to enter, the EOF condition. – chux - Reinstate Monica Mar 24 '14 at 06:22
  • Anyways +1 for the best answer. – chux - Reinstate Monica Mar 24 '14 at 06:25
  • @JonathanLeffler, the problem is that I have to press Ctrl-D twice to get the data out. The first time, it loads the string 'thisLine' and returns to the fgetc() statement and waits for more input. Pressing Ctrl-D the second time causes it to actually return to the caller. fgetc() blocks, and pressing Ctrl-D the first time releases the block and then reads the input. – gone Mar 24 '14 at 06:51
  • That is normal; you won't be able to avoid it unless you use non-canonical input. Using Control-D, you cause the terminal driver to make all the input typed so far on the line available for reading, even if you've not typed return. If you type, say, `abc` and then Control-D, the terminal driver makes the three characters `abc` available (and Control-D is not seen by the program). If you type Control-D again, or type return and then Control-D, then there are no more characters available, so the underlying `read()` system call returns 0, which the standard I/O system interprets as EOF. – Jonathan Leffler Mar 24 '14 at 07:01
  • @Jonathan, Yes, I agree with your explanation, and am trying to mimic it (as per the `cat – gone Mar 24 '14 at 07:09

EOF is an int (not a char), and it does not have the same value as any valid character.

Normal C style would be to terminate the string you are building up with a \0. It's theoretically possible to read a NUL character, of course, and if you want to deal with this possibility you'll need to record the number of characters read as well as the buffer they were read into.
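One way to keep the count alongside the buffer is a small struct that grows with realloc(). This is only a sketch under assumed names (`struct byte_buf`, `byte_buf_push`, `byte_buf_read_all` are all hypothetical), with a doubling growth policy chosen for simplicity.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counted buffer: len is tracked separately, so embedded
 * NUL bytes are ordinary data. Initialise with {NULL, 0, 0}. */
struct byte_buf {
    char  *data;
    size_t len;
    size_t cap;
};

/* Append one byte, growing with realloc(). Returns 0 on success, -1 on
 * allocation failure (the existing contents are left intact). */
int byte_buf_push(struct byte_buf *b, int c)
{
    if (b->len == b->cap) {
        size_t new_cap = b->cap ? b->cap * 2 : 16;
        char *p = realloc(b->data, new_cap);
        if (p == NULL) {
            return -1;
        }
        b->data = p;
        b->cap = new_cap;
    }
    b->data[b->len++] = (char)c;
    return 0;
}

/* Read fp to EOF into b. Returns 0 on success, -1 on error. */
int byte_buf_read_all(struct byte_buf *b, FILE *fp)
{
    int c;  /* int, so EOF is distinguishable from every byte value */

    while ((c = fgetc(fp)) != EOF) {
        if (byte_buf_push(b, c) != 0) {
            return -1;
        }
    }
    return ferror(fp) ? -1 : 0;
}
```

The caller uses `b.len` rather than strlen() to know how much data was read, and releases the storage with `free(b.data)` when done.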

rici
  • Yes, been thinking of this when I mentioned using ungetc(), but it's possible that the data will have 00 in it. I'm creating a 'cat' like program for an assignment. – gone Mar 24 '14 at 05:33
  • @Nap: In that case, you'll just have to keep the count. (And note the warning that other people have given: `fgetc` returns an `int`, not a `char`.) – rici Mar 24 '14 at 05:36
  • Any ideas on that? Counting is the easy part. What do I compare against? – gone Mar 24 '14 at 06:27
  • @Nap: You compare against EOF. Your problem is with the way terminal i/o works; if you want to send eof other than at the beginning of a line, you need to type ctl-d twice. – rici Mar 24 '14 at 06:37
  • Fair enough, but when you use `cat` and read from `stdin`, if you say type `abc` then press Ctrl-D, it echoes `abc` immediately and on the same line. If you press Ctrl-D again, it exits. With the code above, I have to press Ctrl-D twice before it echoes. – gone Mar 24 '14 at 07:06