15

I have the following Python script that reads numbers and outputs an error if the input is not a number.

import fileinput
import sys
for line in (txt.strip() for txt in fileinput.input()):
    if not line.isdigit():
        sys.stderr.write("ERROR: not a number: %s\n" % line)

If I get the input from stdin, I have to press Ctrl + D twice to end the program. Why?

I only have to press Ctrl + D once when I run the Python interpreter by itself.

bash $ python test.py
1
2
foo
4
5
<Ctrl+D>
ERROR: not a number: foo
<Ctrl+D>
bash $
Peter Mortensen
Michael Kristofik
  • I don't get that effect in OSX. However, if I hit <Ctrl+D> directly after hitting 5 (without an intervening carriage return), I do, and even `cat` does that. – Alex Brown Jan 29 '10 at 15:31
  • @Kristo: Your example should be formatted to show `<Ctrl+D>` on the same line as `5`. If you're seeing the behavior that your example shows as of now, something is wrong. – Alok Singhal Jan 29 '10 at 15:36
  • @Alok: My example is formatted exactly as I typed it. If I change the code to use `sys.stdin.readlines()`, then the first <Ctrl+D> ends the program. – Michael Kristofik Jan 29 '10 at 16:19
  • @Kristo Strange indeed. Note when the "not a number" is displayed... is that what you expect, too? Did you try several terminal emulators? What platform are you using? If Linux, could you try the console for instance (you probably can get one by typing Ctrl-Alt-F2)? – Pascal Cuoq Jan 29 '10 at 16:43
  • @Pascal Cuoq: Yes I'm on linux. I get the same results in xterm, GNOME Terminal, Konsole, and the console. I actually expected to see the error message print immediately after entering 'foo' but it doesn't appear until after the first Ctrl+D, regardless of which way I write the code. – Michael Kristofik Jan 29 '10 at 16:57
  • @Kristo: I see the same behavior on Ubuntu 9.10 with Python 2.5, 2.6, 3.0 under Bash, Z, Korn and csh. I don't see anything in the docs and nothing jumps out in a quick look at `/usr/lib/python2.6/fileinput.py` however closer inspection might lead to something. – Dennis Williamson Jan 29 '10 at 17:43

5 Answers

16

In Python 3, this was due to a bug in Python's standard I/O library. The bug was fixed in Python 3.3.


In a Unix terminal, typing Ctrl+D doesn't actually close the process's stdin. But typing either Enter or Ctrl+D does cause the OS read system call to return right away. So:

>>> sys.stdin.read(100)
xyzzy                       (I press Enter here)
                            (I press Ctrl+D once)
'xyzzy\n'
>>>

sys.stdin.read(100) is delegated to sys.stdin.buffer.read, which calls the system read() in a loop until either it accumulates the full requested amount of data; or the system read() returns 0 bytes; or an error occurs. (docs) (source)

Pressing Enter after the first line caused the system read() to return 6 bytes. sys.stdin.buffer.read called read() again to try to get more input. Then I pressed Ctrl+D, causing read() to return 0 bytes. At this point, sys.stdin.buffer.read gave up and returned just the 6 bytes it had collected earlier.
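The loop described above can be sketched in a few lines. This is only an illustration under stated assumptions, not CPython's actual implementation: `buffered_read` and `make_fake_terminal` are hypothetical names, and the "terminal" is a scripted list of results standing in for the system read().

```python
def buffered_read(raw_read, n):
    """Collect up to n bytes, calling raw_read until n bytes arrive or it returns 0 bytes."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = raw_read(remaining)
        if not chunk:  # 0 bytes: treat as end of file and stop
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def make_fake_terminal(events):
    """Hypothetical stand-in for the terminal: each call returns the next scripted result."""
    pending = list(events)

    def raw_read(size):
        return pending.pop(0)[:size] if pending else b""

    return raw_read


# Enter after "xyzzy" delivers 6 bytes; Ctrl+D on an empty line delivers 0 bytes.
terminal = make_fake_terminal([b"xyzzy\n", b""])
result = buffered_read(terminal, 100)
print(result)
```

The first scripted read returns 6 bytes, which is fewer than the 100 requested, so the loop asks again; the 0-byte result then ends it.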

Note that the process still has my terminal on stdin, and I can still type stuff.

>>> sys.stdin.read()        (note I can still type stuff to python)
xyzzy                       (I press Enter)
                            (Press Ctrl+D again)
'xyzzy\n'

OK. This is the part that was busted when this question was originally asked. It works now. But prior to Python 3.3, there was a bug.

The bug was a little complicated. Basically, the problem was that two separate layers were doing the same work. BufferedReader.read() was written to call self.raw.read() repeatedly until it returned 0 bytes. However, the raw method, FileIO.read(), performed a loop-until-zero-bytes of its own. So the first time you pressed Ctrl+D in a Python with this bug, FileIO.read() would return 6 bytes to BufferedReader.read(), which would then immediately call self.raw.read() again. The second Ctrl+D would cause that call to return 0 bytes, and then BufferedReader.read() would finally exit.
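A small simulation may make the double loop easier to see. It is a sketch, not the real CPython code: `FakeTerminal`, `read_until_zero`, `buggy_read_all`, and `fixed_read_all` are hypothetical names, and the terminal is modeled as a scripted list in which each empty bytes object stands for one Ctrl+D on an empty line.

```python
class FakeTerminal:
    """Hypothetical stand-in for a tty: replays a scripted list of read() results."""

    def __init__(self, events):
        self.events = list(events)
        self.eofs_consumed = 0  # how many 0-byte reads (Ctrl+D presses) were used up

    def read(self):
        chunk = self.events.pop(0) if self.events else b""
        if not chunk:
            self.eofs_consumed += 1
        return chunk


def read_until_zero(read):
    """Call read() repeatedly until it returns 0 bytes; return everything collected."""
    chunks = []
    while True:
        chunk = read()
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def buggy_read_all(terminal):
    # Two layers, each looping until it sees 0 bytes (the situation described above).
    def raw_layer():
        return read_until_zero(terminal.read)

    return read_until_zero(raw_layer)


def fixed_read_all(terminal):
    # Only one layer loops, so the first 0-byte read ends the call.
    return read_until_zero(terminal.read)


script = [b"xyzzy\n", b"", b""]  # Enter, then Ctrl+D twice

buggy = FakeTerminal(script)
buggy_data = buggy_read_all(buggy)

fixed = FakeTerminal(script)
fixed_data = fixed_read_all(fixed)

print(buggy_data, buggy.eofs_consumed)
print(fixed_data, fixed.eofs_consumed)
```

Both versions return the same data, but the nested version uses up two 0-byte reads before it returns, while the single loop needs only one.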

This explanation is unfortunately much longer than my previous one, but it has the virtue of being correct. Bugs are like that...

Jason Orendorff
  • it is probably a bug: it should be [enough to press `Ctrl+D` once *at the beginning of a line*](http://stackoverflow.com/a/21261742/4279). Though I can't reproduce it (a single `Ctrl+D` is enough to end `sys.stdin.read()` if Enter is pressed on both Python 2 and 3 -- you need to press `Ctrl+D` twice only in the middle of a line (ICANON flag is set)). – jfs Feb 02 '15 at 19:44
  • @J.F.Sebastian The weirdness in this case was on the Python side, not in the OS's terminal implementation. Ctrl+D was sending EOF both times it was pressed, just as you say. But the implementation of `sys.stdin.read()` was a simple loop that kept calling `read` until it returned zero bytes. – Jason Orendorff Feb 09 '15 at 22:27
  • If it is not clear: I've tested it on both Python 2 and 3: *single* Ctrl+D is enough (Ubuntu 14.04). `read` returns zero bytes at the beginning of a line. – jfs Feb 09 '15 at 22:35
  • Right. The behavior changed in Python 3.3 according to http://bugs.python.org/issue5505, and it *was* considered a bug in Python. – Jason Orendorff Feb 09 '15 at 22:42
  • @J.F.Sebastian I'm honestly not totally clear how the commit cited in that issue fixed the problem though. – Jason Orendorff Feb 09 '15 at 22:54
  • @J.F.Sebastian OK, after thinking this through a bunch of times I think I finally get it and have updated my answer. – Jason Orendorff Feb 11 '15 at 01:21
  • I don't think this kind of bug is rare. GNU `patch`, for example, has it. If you paste a patch into the terminal, you have to type Ctrl+D twice. The bug is fundamentally the same as the one described here: The `open_patch_file()` function in `patch` contains a loop that calls `fread()` repeatedly until it returns 0 bytes. But the implementation of `fread()` contains a similar loop. – Jason Orendorff Feb 11 '15 at 22:46
9

Most likely this has to do with the following Python issues:

  • 5505: sys.stdin.read() doesn't return after first EOF on Windows, and
  • 1633941: for line in sys.stdin: doesn't notice EOF the first time.
Alok Singhal
5

I wrote an explanation about this in my answer to this question.

How to capture Control+D signal?

In short, Control-D at the terminal simply causes the terminal to flush the input. This makes the read system call return. The first time it returns with a non-zero value (if you typed something). The second time, it returns with 0, which is code for "end of file".
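The two return values of the read system call can be observed with `os.read`. This sketch assumes a pipe in place of a terminal, which is not the same thing: here the 0-byte result comes from closing the write end rather than from Ctrl+D, and it only illustrates "non-zero bytes first, then 0 bytes meaning end of file".

```python
import os

# A pipe stands in for the terminal (an assumption made for illustration only).
read_fd, write_fd = os.pipe()

os.write(write_fd, b"foo\n")      # data becomes available to the reader
first = os.read(read_fd, 100)     # returns the pending bytes, not the full 100

os.close(write_fd)                # nothing more will ever arrive
second = os.read(read_fd, 100)    # returns 0 bytes, which signals end of file

os.close(read_fd)
print(first, second)
```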

Community
Pascal Cuoq
  • That's what I don't understand. The first <Ctrl+D> in my example is on the first character of a new line, so I would expect it to act as EOF. If I change the code to use `sys.stdin.readlines()` instead, then the first <Ctrl+D> ends the program. – Michael Kristofik Jan 29 '10 at 16:21
0

When you read lines with the "for line in file:" form, Python uses a hidden read-ahead buffer (see the file.next function at http://docs.python.org/2.7/library/stdtypes.html#file-objects). First of all, this explains why a program that writes output as each input line is read displays no output until you press CTRL-D. Secondly, in order to give the user some control over the buffering, pressing CTRL-D flushes the input buffer to the application code. Pressing CTRL-D when the input buffer is empty is treated as EOF.

Tying this together answers the original question. After entering some input, the first ctrl-D (on a line by itself) flushes the input to the application code. Now that the buffer is empty, the second ctrl-D acts as End-of-File (EOF).

file.readline() does not exhibit this behavior.
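One way to loop with readline() instead of iterating over the file object is the two-argument form of `iter`. The sketch below is a variation on the script from the question, not a confirmed fix for every Python version: `report_non_numbers` is a hypothetical helper, and an `io.StringIO` object stands in for stdin so the example runs without a terminal.

```python
import io
import sys


def report_non_numbers(stream, err=sys.stderr):
    """Read stream one line at a time via readline() and report non-numeric lines."""
    bad = []
    # iter(callable, sentinel) calls stream.readline() until it returns "" (end of file).
    for line in iter(stream.readline, ""):
        text = line.strip()
        if not text.isdigit():
            err.write("ERROR: not a number: %s\n" % text)
            bad.append(text)
    return bad


# Stand-in for the terminal input shown in the question.
sample = io.StringIO("1\n2\nfoo\n4\n5\n")
bad_lines = report_non_numbers(sample)
print(bad_lines)
```

To use it on real input, pass `sys.stdin` as the stream. For a stream opened in binary mode the sentinel would be `b""` rather than `""`.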

Tony
0

The first time it considers it to be input, the second time it's for keeps!

This only occurs when the input is from a tty. It is likely because of the terminal settings where characters are buffered until a newline (carriage return) is entered.

jathanism