
Possible Duplicate:
Handling \r\n vs \n newlines in python on Mac vs Windows

I'm a little confused by something, and I'm wondering if this is a Python thing. I have a text file that uses Windows line endings ("\r\n"), but if I iterate through some of the lines in the file, store them in a list, and print the string representation of the list to the console, it shows "\n" line endings. Am I missing something?

Jon Martin
  • Might help you in solving the problem: http://stackoverflow.com/questions/4599936/handling-r-n-vs-n-newlines-in-python-on-mac-vs-windows – Kirill Dubovikov May 28 '12 at 13:05
  • Line endings are somewhat confusing. Depending on your platform, Python may automatically handle them for you unless you open your file in binary mode (`open(..., 'rb')`). – Katriel May 28 '12 at 13:07

3 Answers


Yes, it's a Python thing; by default open() opens files in text mode, where line endings are translated depending on what platform your code is running on. You'll have to set newline='' in the open() call to ask for line endings to be passed through unaltered.

Python 2's standard open() function doesn't support this option, and only opening in binary mode would prevent the translation, but you can use the Python 3 behaviour by using io.open() instead.

From the documentation on open:

newline controls how universal newlines mode works (it only applies to text mode).

[...]

  • When reading input from the stream, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated.
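To make the quoted behaviour concrete, here is a minimal sketch for Python 3. The helper `read_lines` and the temporary sample file are illustrative assumptions, not part of the question; the file is written in binary mode so that its "\r\n" endings are stored exactly as given.

```python
import os
import tempfile


def read_lines(path, newline=None):
    """Return the file's lines as a list, using the given newline setting."""
    with open(path, "r", newline=newline) as f:
        return list(f)


# Create a small sample file with Windows line endings. Binary mode is used
# so that nothing is translated on the way out.
fd, sample_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"first\r\nsecond\r\n")

translated = read_lines(sample_path)                # newline=None, the default
untranslated = read_lines(sample_path, newline="")  # endings passed through

print(translated)    # ['first\n', 'second\n']
print(untranslated)  # ['first\r\n', 'second\r\n']

os.remove(sample_path)
```

On Python 2.6+, replacing `open` with `io.open` in `read_lines` should give the same result, since that is where the `newline` argument lives there.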
Martijn Pieters

Opening the file in binary mode will avoid this in Py2 on Windows. However, in Py3 (and in Py2.6+ if you use io.open instead of the builtin), binary mode vs text mode means something well defined and platform independent, and doesn't affect universal newlines. Instead, you can do:

file = open(filename, 'r', newline='')

And newlines won't be normalised.
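
As a rough illustration of the difference in Python 3, the sketch below reads the same file once in binary mode and once in text mode with newline=''. The names `demo_path`, `raw` and `text` are made up for the example, and ASCII content is assumed.

```python
import os
import tempfile

# Write a throwaway file with Windows line endings (binary, so no translation).
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"a\r\nb\r\n")

# Binary mode: you get bytes, and no newline handling is applied at all.
with open(demo_path, "rb") as f:
    raw = f.read()

# Text mode with newline='': you get str, with the line endings left as-is.
with open(demo_path, "r", newline="", encoding="ascii") as f:
    text = f.read()

os.remove(demo_path)

print(type(raw).__name__, repr(raw))
print(type(text).__name__, repr(text))
```

Both reads keep the "\r\n" sequences; the choice between them is about whether you want bytes or decoded text.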

lvc

What you should do (for Python 2.x) is open the file with universal newline support, using a mode of "U" or "rU". Any type of newline is then supported. The Python manual documents this at http://docs.python.org/library/functions.html#open:

In addition to the standard fopen() values mode may be 'U' or 'rU'. Python is usually built with universal newline support; supplying 'U' opens the file as a text file, but lines may be terminated by any of the following: the Unix end-of-line convention '\n', the Macintosh convention '\r', or the Windows convention '\r\n'. All of these external representations are seen as '\n' by the Python program. If Python is built without universal newline support a mode with 'U' is the same as normal text mode. Note that file objects so opened also have an attribute called newlines which has a value of None (if no newlines have yet been seen), '\n', '\r', '\r\n', or a tuple containing all the newline types seen.

For Python 3, there is a newline option to open which controls the behaviour of newlines. Looking at the documentation, it appears that universal newline support is the default.
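
A small sketch of what that default looks like in Python 3, assuming a throwaway file created just for the demonstration: the text is returned with "\n" endings, while the file object's `newlines` attribute records which endings were actually seen.

```python
import os
import tempfile

# Create a sample file with Windows line endings, written as raw bytes.
fd, crlf_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"one\r\ntwo\r\n")

# Default text mode (newline=None): universal newlines with translation.
with open(crlf_path, "r") as f:
    content = f.read()
    seen = f.newlines  # endings encountered so far while reading

os.remove(crlf_path)

print(repr(content))  # 'one\ntwo\n'
print(repr(seen))     # '\r\n'
```

So the original endings are not lost entirely: checking `newlines` after reading is one way to find out what the file used.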

xioxox