
The code below takes a JPEG image and converts it to a string, which is stored in the image variable. The string is then written to a.jpg using file I/O, and also to b.jpg by redirecting stdout to that file.

import thumb
import sys

x = thumb.Thumbnail('test.jpg')
x.generate(56, 56)

image = str(x)

with open('a.jpg', 'wb') as f:
    # saving to a.jpg
    f.write(image)

# saving to b.jpg
sys.stdout.write(image)

Usage:

python blah.py > b.jpg

This results in two image files (a.jpg and b.jpg). These images should be identical... But they aren't.

(Images in the original post: a.jpg and b.jpg.)

Looking at each image in Notepad, I can see that line breaks are somehow being added to b.jpg, resulting in a corrupted image.

Why is a.jpg different to b.jpg?

Ignacio Vazquez-Abrams
dave
  • sys.stdout.mode is 'w', I think. See, e.g., http://stackoverflow.com/questions/2374427/python-2-x-write-binary-output-to-stdout – DSM Jan 25 '11 at 04:44
  • Your shell is probably interpreting your output when you redirect it through standard out. Are you on linux? Using bash? – Falmarri Jan 25 '11 at 04:45
  • @DSM: You should post that as an answer. – Adam Rosenfield Jan 25 '11 at 04:49
  • The Unix tendency to use stdout to communicate between programs is a BAD idea for binary data. Please do not do it. Running your software without redirecting to a file will mess up the terminal, etc. Do **not** do it. Please! – Lennart Regebro Jan 25 '11 at 06:30
  • @Lennart: It's actually writing to my web browser. It's a CGI script. I used stdout as an example as it had the same problem and was simpler to describe :} – dave Jan 25 '11 at 07:34
  • Ah, I see. Yes, CGI is indeed an example of this very bad pattern in Unix. – Lennart Regebro Jan 25 '11 at 08:16

1 Answer


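You write your data to a.jpg in binary mode, while b.jpg gets written in text mode. In binary mode, characters that are otherwise special (such as newlines or an EOF marker) are written unchanged, while in text mode they are translated. On Windows, for example, text mode expands each newline byte (\n) into \r\n, which would account for the extra line breaks you see in b.jpg.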
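To make the difference concrete, here is a minimal Python 3 sketch that simulates both modes in memory. The payload and the helper names (write_binary, write_text_crlf) are hypothetical, and newline="\r\n" is only an assumption standing in for the translation a Windows text-mode stream performs:

```python
import io

# Hypothetical payload standing in for JPEG data; it contains one 0x0A byte.
payload = b"\xff\xd8\xff\xe0\x00\x0a\x10JFIF"


def write_binary(data):
    """Write bytes through a binary stream: nothing is translated."""
    raw = io.BytesIO()
    raw.write(data)
    return raw.getvalue()


def write_text_crlf(data):
    """Write the same bytes through a text layer that maps '\\n' to '\\r\\n'."""
    raw = io.BytesIO()
    # latin-1 maps every byte to exactly one character and back,
    # so the only change on the way out is the newline translation.
    text = io.TextIOWrapper(raw, encoding="latin-1", newline="\r\n")
    text.write(data.decode("latin-1"))
    text.flush()
    text.detach()  # keep `raw` open after the wrapper is discarded
    return raw.getvalue()


binary_result = write_binary(payload)
text_result = write_text_crlf(payload)
print(len(payload), len(binary_result), len(text_result))
```

The binary result matches the payload byte for byte, while the text-mode result is one byte longer because a \r was inserted before the \n. That is enough to corrupt an image.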

In Python 3 you can bypass text mode by writing to the stream's underlying binary buffer. From the documentation:

The standard streams are in text mode by default. To write or read binary data to these, use the underlying binary buffer. For example, to write bytes to stdout, use sys.stdout.buffer.write(b'abc').
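As a rough sketch of that approach (Python 3 only), the helper below writes bytes to the binary layer underneath a text stream. The name write_bytes and the in-memory stand-in for stdout are assumptions made for illustration; in a real script you would call it with no stream argument so that it uses sys.stdout:

```python
import io
import sys


def write_bytes(data, stream=None):
    """Write raw bytes to the binary buffer under a text stream.

    `stream` defaults to sys.stdout; any text stream exposing a
    `.buffer` attribute works.
    """
    stream = sys.stdout if stream is None else stream
    out = stream.buffer
    out.write(data)
    out.flush()
    return len(data)


# Demo against an in-memory stand-in for stdout, so nothing binary
# is sent to the terminal. The text layer would translate '\n' to '\r\n',
# but writing to .buffer bypasses it.
fake_raw = io.BytesIO()
fake_stdout = io.TextIOWrapper(fake_raw, encoding="utf-8", newline="\r\n")
sample = b"\xff\xd8\n\xff\xd9"  # hypothetical bytes containing a newline
written = write_bytes(sample, fake_stdout)
```

In the script from the question, this would correspond to calling write_bytes(image) with image held as bytes, then running python blah.py > b.jpg as before.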


Untested (Python 2):

import os
import sys

# Reopen the stdout file descriptor as a binary-mode file object
binout = os.fdopen(sys.stdout.fileno(), 'wb')
binout.write(b'Binary#Data...')
miku