
I am trying to read text from stdin, do stuff with that text, then send the result to stdout.

import sys

char_stream = sys.stdin
input_text = char_stream.read()

output_text = do_stuff_with_text(input_text)
sys.stdout.write(output_text)

If I feed it, say, UTF-8, it fails to read the character stream; I assume it expects ASCII. I googled and it appears that I should read UTF-8 like this:

codecs.getreader("utf-8")(sys.stdin)

but then this will only be compatible with UTF-8, which is not enough for me, since I want to be able to pipe in any text file. I also need to encode the output as UTF-8.
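To make sure I understand the codecs suggestion, here is roughly what I think it amounts to (I'm using `io.BytesIO` as a stand-in for the stdin/stdout byte streams so the snippet is self-contained; the text is just sample data):

```python
import codecs
import io

# Stand-in for stdin: a raw byte stream containing UTF-8 encoded text.
raw_in = io.BytesIO("héllo wörld".encode("utf-8"))

# codecs.getreader("utf-8") returns a StreamReader class; instantiating it
# around the byte stream gives an object whose read() yields decoded text.
reader = codecs.getreader("utf-8")(raw_in)
text = reader.read()

# The symmetric direction: getwriter wraps a byte sink so that text
# written to it is encoded back to UTF-8 bytes.
raw_out = io.BytesIO()
writer = codecs.getwriter("utf-8")(raw_out)
writer.write(text)
```

So both directions are pinned to UTF-8, which is exactly the limitation I'm asking about.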

What is the common practice here, the one used in, say, grep? I know it is not written in Python, but it's still a valid example: I can pipe in text in any format and it never throws an error. How would I implement this in Python?
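For what it's worth, one thing I tried that at least never raises is decoding with a replacement character for invalid bytes. I'm not sure this is the right approach, and `robust_decode` is just a name I made up (in a real filter the bytes would come from stdin, e.g. `sys.stdin.buffer.read()` on Python 3):

```python
def robust_decode(data, encoding="utf-8"):
    # Decode bytes under an assumed encoding, substituting U+FFFD
    # for any byte sequence that is invalid, so arbitrary binary
    # input never raises UnicodeDecodeError.
    return data.decode(encoding, errors="replace")

# Hypothetical sample: valid UTF-8 followed by a stray invalid byte.
sample = "café".encode("utf-8") + b"\xff"
text = robust_decode(sample)
```

This loses information for the bad bytes, though, whereas grep seems to pass everything through untouched.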

    See [this question](http://stackoverflow.com/questions/436220/python-is-there-a-way-to-determine-the-encoding-of-text-file). Even though it talks about text files, the principle is the same for trying to detect encoding. At some level you need to make an assumption about what the encoding is. – merlin2011 Jan 21 '15 at 23:45
  • read should just return bytes ... it doesn't usually care about encoding ... write on the other hand ... – Joran Beasley Jan 21 '15 at 23:55

0 Answers