I have a line like this in my Python script:

data = sys.stdin.read()

Then I run the script with input redirected from a file on Windows:

>> python test.py < binary_file

If binary_file contains \x1a (Ctrl-Z), which is treated as EOF on Windows, data only contains the bytes before the \x1a. I know this can be fixed with open("...", "rb") for a regular file.
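For reference, the regular-file case can be sketched as follows. This is a minimal illustration only: the temporary file and the sample `payload` bytes are made up for the example and are not part of the original script.

```python
import os
import tempfile

# Hypothetical sample data: a CRLF pair and a 0x1a byte in the middle.
payload = b"before\r\n\x1aafter"

# Write the bytes to a temporary file, then read them back in binary mode.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    # "rb" disables newline translation and any special handling of 0x1a.
    with open(path, "rb") as f:
        data = f.read()
finally:
    os.remove(path)
```

Opened this way, `data` holds every byte of the file, including whatever follows the \x1a.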

How would I handle this for sys.stdin?

shx2
yuwen
  • Pressing CTRL-Z in a console window closes the input which causes the program to get an end-of-file. However, it's not the actual byte `0x1a` that causes EOF. The value `0x1a` is just a normal byte of data, like any other byte of data. – Some programmer dude Apr 11 '14 at 08:52
  • On the other hand, you will probably have *other* problems reading binary data from standard input like that on a Windows system. The most important is that the byte sequence `0x0d 0x0a` will be converted to `0x0a` only. That is, carriage-return followed by newline will be converted to newline only (`'\r\n'` -> `'\n'`) – Some programmer dude Apr 11 '14 at 08:56
  • I have tested files containing 0x1a; the bytes following 0x1a cannot be read into `data`. – yuwen Apr 11 '14 at 09:13
  • Then it's the console program that reads and checks for this byte, because there is really nothing special about this specific value. A byte of data is a byte of data is a byte of data... But like I said in the other comment, reading binary data from a file not opened in binary mode is going to be trouble no matter what, don't do it. – Some programmer dude Apr 11 '14 at 09:26
  • I wouldn't be surprised in the least if the pipe/console handler on Windows stopped when there is a `0x1a` in the input. – Aaron Digulla Apr 11 '14 at 09:52

1 Answer

My next step would be to try the fileinput module, but my gut feeling is that cmd.exe (or the code which handles pipes) really processes the stream, looks for 0x1a bytes and sends you an EOF.

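If that's the case, there is nothing you can do from Python; the shell simply won't let you read past this byte. If the truncation instead comes from text-mode handling in the runtime, keep in mind that this handle is opened by the runtime or the OS and then passed to Python, so you cannot reopen stdin with "rb"; any switch to binary mode has to be applied to the existing handle.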

As a workaround, you can try to install Cygwin or MSys, which give you a real shell (instead of an emulation of bugs created in the 1980s).

Or try PowerShell. If you're lucky, they didn't reimplement this bug in there.
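Before switching shells, it may be worth checking whether the truncation comes from text-mode handling of stdin rather than from cmd.exe itself. The sketch below assumes that is the cause; it uses `msvcrt.setmode` with `os.O_BINARY` (Windows only) on the existing descriptor and then reads raw bytes. The helper name `read_binary` and its `stream` parameter are made up for this example.

```python
import os
import sys


def read_binary(stream=None):
    """Read all remaining input from stream (default: sys.stdin) as bytes."""
    if stream is None:
        stream = sys.stdin
        if sys.platform == "win32":
            # msvcrt exists only on Windows. setmode changes the mode of the
            # already-open descriptor, so nothing has to be reopened.
            import msvcrt
            msvcrt.setmode(stream.fileno(), os.O_BINARY)
    # Python 3 text streams expose the underlying byte stream as .buffer;
    # on Python 2, sys.stdin.read() already returns a byte string.
    raw = getattr(stream, "buffer", stream)
    return raw.read()
```

In test.py, `data = read_binary()` would take the place of `data = sys.stdin.read()`. If the bytes after 0x1a still do not arrive, the shell or pipe layer is the more likely culprit.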

Aaron Digulla
  • Thanks. I'll check the fileinput module. As for PowerShell, I found that the redirection operator '<' isn't even supported in version 1.0. I might as well change my code. – yuwen Apr 14 '14 at 01:53
  • PowerShell uses object pipes: http://stackoverflow.com/questions/11447598/redirecting-standard-input-output-in-windows-powershell – Aaron Digulla Apr 14 '14 at 08:48