3

Playing around with code examples from K&R in Codeblocks on Windows 10 (Danish language). The following example works as expected:

#include <stdio.h>

int main() {
    char c = 'a';
    putchar(c);
}

However, the following prints a series of boxes with question marks, the same number as the number of characters I type:

#include <stdio.h>

int main() {
    char c;

    while (c = getchar() != '\n') {
        putchar(c);
    }
}

So it looks like an encoding issue. When run, a command prompt opens with "C:\Users\username\Desktop\filename.exe" in the header, and my username contains the Danish character "å" which is replaced by a "Õ". The command prompt uses the CP 850 character set.

(By the way, I'm not checking if the character equals EOF, since that produces odd results. Pressing enter prints the expected number of boxes, plus one for \n, but it doesn't end the program.)

Danny
  • 2
    The [`getchar`](https://en.cppreference.com/w/c/io/getchar) function returns an ***`int`***. This is important if you ever want to check for `EOF` (which you really should do even for input from `stdin`). – Some programmer dude Feb 05 '19 at 08:33
  • I don't know why your output is strange, but `EOF` is of type `int`, and `getchar` returns `int` not `char`. Stupid function name? Very much so. Which is why bothering with all the old stdio.h crap in K&R is mostly a waste of time. Nobody writes console programs since 20 years back. – Lundin Feb 05 '19 at 08:33
  • And what is the output of your program if the input is `ABC`? – Jabberwocky Feb 05 '19 at 08:47

4 Answers

8

You are seeing a problem of operator precedence here. As you can see on this chart, = has a lower precedence than !=.

This means that getchar() != '\n' is evaluated first.

To the compiler your code looks like this:

#include <stdio.h>

int main() {
    char c;

    while (c = (getchar() != '\n')) { 
        putchar(c);
    }
}

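Since 'c' receives the result of the comparison (1 for true, 0 for false) rather than the character that was read, putchar(c) prints the character with code 1 for every key you type, which your console apparently renders as a box. The loop ends once the comparison yields 0 at the newline. That is the behavior you are seeing. However,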

#include <stdio.h>

int main() {
    char c;

    while ((c = getchar()) != '\n') { //<----notice brackets around c=getchar 
        putchar(c);
    }
}

gives the output you are expecting. This illustrates why you should always put brackets around an assignment that is used inside a larger expression.

hat
3

This line is bad.

while (c = getchar() != '\n') 

It should be:

while ((c = getchar()) != '\n') 
  • Although correct, I feel this is a bit of a minimal answer. An explanation of why the one is bad and the other is good would suit this answer well. – alk Feb 05 '19 at 10:11
2

There are already some correct answers within the scope of the question but there are a couple of wider problems that you need to address.

Firstly getchar() returns an int and it is important that you define the variable that takes the return value as an int so you can differentiate errors and end of file from valid chars.

Secondly, if you receive end of file or there is an error on stdin before the program encounters a \n, your code will loop forever. This is what the man page on my laptop says about getchar():

If successful, these routines return the next requested object from the stream. Character values are returned as an unsigned char converted to an int. If the stream is at end-of-file or a read error occurs, the routines return EOF.

So once getchar() returns EOF it will return EOF all the time. You need to address this in your loop condition:

#include <stdio.h>

int main() 
{
    int c;    // c declared as int

    while ((c = getchar()) != EOF && c != '\n') 
    { 
        putchar(c);
    }
    if (c == EOF) 
    {
        // handle errors and end of file as you see fit
    }
}
JeremyP
  • Nope, in unix at least, if you `getchar()` from a tty, it returns `EOF` only on two consecutive (without any other character in between) `Ctrl-D` characters (or if `Ctrl-D` is the first character in the stream of data read by the program, or on a `\n` followed by a `Ctrl-D`). But if you call it again, you can continue reading input without any problem. `EOF` is returned only on such condition (or several others, like invalidating the tty device) – Luis Colorado Feb 06 '19 at 05:47
  • @LuisColorado Nope. The man page for getchar on my Mac says: "The end-of-file condition is remembered, even on a terminal, and all subsequent attempts to read will return EOF until the condition is cleared with clearerr(3)." – JeremyP Feb 06 '19 at 09:23
  • @LuisColorado and I was sad enough to write a program to test it. – JeremyP Feb 06 '19 at 09:24
  • It's possible different implementations do different things... but I have tested in some unices... Solaris, unix v7, Linux Debian (but i guess all linuxes do) behave as I told. All BSDs I've tested (OpenBSD, FreeBSD and NetBSD, and Mac OSX --- I've not tested this last one, but you say and it's a derived work from FreeBSD, so I trust you) behave as you say. I think is the implementation of the stdio package that makes the difference. I have no System III or System V unix to test, but from v7 & Solaris, I guess they will do as at&t unix. – Luis Colorado Feb 06 '19 at 15:36
0

Edit: You get the boxes because of the missing parentheses around the assignment. Look at this question for reference as to why you should have parentheses around an assignment used as a truth value...

There is also something else wrong with this program. Consider this example:

What you actually wanted:-

ABCD
< newline >

What you actually typed:-

ABCD

And since the program didn't find the '\n' anywhere in the code, it leads to undefined behavior since it goes out of bounds to find it...

There are two possible solutions when your input does not contain a '\n':-

  • Use EOF (Suggested by many since it is the best possible solution for accepting every input...)

    #include <stdio.h>

    int main() {
        int c; /* int, so that EOF can be told apart from valid characters */
        while ((c = getchar()) != EOF) /* Always remember to put parentheses around
                                          an assignment in a condition... */
            putchar(c);
    }
    
  • Add a newline to your input:-

    int main() {
        char c;
        // Use fputc to modify input...
        fputc('\n', stdin);
        while ((c = getchar()) != '\n') /* Always remember to put parenthesis around
                                           an assignment in a condition... */
            putchar(c);
    }
    

    But beware! This method stops at the first newline it encounters, so anything that comes after that '\n' will not be printed...

Ruks
  • The stuff about going out of bounds because of not seeing the `\n` is wrong. If the user never hits the carriage return, the program will block in `getchar()` If the user somehow does terminate the input stream without hitting return, `getchar()` will forevermore return `EOF`, which is not `\n` and the program will loop printing character `0xff` (assuming `EOF` is defined as `-1`). – JeremyP Feb 05 '19 at 10:08
  • The correct thing to do is define `c` as `int` and make the while loop `while ((c = getchar()) != EOF && c != '\n')` – JeremyP Feb 05 '19 at 10:09