Getting rid of \n when using .readlines()

Question

I have a .txt file with values in it.

The values are listed like so:

Value1
Value2
Value3
Value4

My goal is to put the values in a list. When I do so, the list looks like this:

['Value1\n', 'Value2\n', ...]

The \n is not needed.

Here is my code:

t = open('filename.txt')
contents = t.readlines()

Generally you do **not** want to read in all the lines first, store in a buffer, then strip newlines/ `splitlines()` - that needlessly wastes 2x memory if the file is large. You want to `rstrip()` each line's newline as you read it and iterate. — smci, Nov 29 '18 at 10:22
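
A minimal sketch of the streaming approach the comment above describes. `io.StringIO` stands in for the real file so the snippet runs on its own, and `iter_stripped` is a hypothetical helper name, not something from the thread:

```python
import io

def iter_stripped(lines):
    """Yield each line without its trailing newline, one line at a time."""
    for line in lines:
        yield line.rstrip('\n')

# io.StringIO stands in for open('filename.txt') so the example is self-contained.
fake_file = io.StringIO("Value1\nValue2\nValue3\nValue4\n")
values = list(iter_stripped(fake_file))
print(values)  # ['Value1', 'Value2', 'Value3', 'Value4']
```

With a real file, the same generator can be fed the file object from a `with open(...)` block; only one line is held in memory at a time until the results are collected into a list.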

score 537 · Answer 1 · edited Jul 11 '15 at 23:49

This should do what you want (file contents in a list, by line, without \n)

with open(filename) as f:
    mylist = f.read().splitlines()

answered Dec 24 '13 at 06:44 by user3131651 · edited Jul 11 '15 at 23:49 by Community

2

mylist = [i for i in mylist if i != ''] – TheRutubeify Mar 22 '18 at 21:18
7

The url proposed from @bfrederix is broken. Here an archive.org copy https://web.archive.org/web/20160215030807/http://axialcorps.com/2013/09/27/dont-slurp-how-to-read-files-in-python/ – Paolo Melchiorre Oct 24 '18 at 09:45
2

Best solution for *small files*. – Anselmo Blanco Dominguez May 04 '21 at 16:14

score 155 · Answer 2 · answered Mar 05 '13 at 20:44

I'd do this:

alist = [line.rstrip() for line in open('filename.txt')]

or:

with open('filename.txt') as f:
    alist = [line.rstrip() for line in f]

answered Mar 05 '13 at 20:44 by hughdbrown

33

This can strip more than just `\n`. – gronostaj Nov 09 '16 at 16:27
24

Trailing whitespace (space, tab, CR, LF, etc.) is never desirable, in my experience. There is no data or computer language I have dealt with in over twenty years that wanted trailing whitespace. So, yes, it strips more than \n. Chances are, you won't miss it. – hughdbrown Jan 20 '18 at 05:06
5

One situation where this could hurt would be right-stripping a tab-separated value file in which some rows had multiple empty values in their right-most cells. Those rows would have length shorter than the others if one were to split on \t ... – duhaime Jun 02 '18 at 18:46
4

@duhaime You are kind of switching context. If someone were asking, "How can I read in a file of CR-separated rows with tab-separated fields?" I would definitely recommend the use of python's CSV module. I would not be giving tips that are applicable to a purely text file with CR-separated lines of data. So tab-separated values is a circumstance where that would be bad and if stated that way, this answer would never be my recommendation. – hughdbrown Jun 03 '18 at 19:18
4

@hughdbrown amen, just wanted to flag this as a potential example of gronostaj's comment as this is the first Google result for stripping \n with readlines. Your point is understood though! – duhaime Jun 04 '18 at 13:27
Since Python 3.9 you can use `.removesuffix('\n')` to remove a single trailing newline instead – maciek Mar 01 '23 at 10:35
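
To make the trade-offs discussed in these comments concrete, here is a small sketch. The sample strings are made up for illustration, and `str.removesuffix` requires Python 3.9 or later:

```python
# A tab-separated row whose last two cells are empty (made-up sample data).
row = "1\t2\t\t\n"

# rstrip() with no argument strips all trailing whitespace, including the tabs.
stripped_all = row.rstrip().split('\t')
print(stripped_all)       # ['1', '2']

# rstrip('\n') strips only trailing newlines, so the empty cells survive.
stripped_newline = row.rstrip('\n').split('\t')
print(stripped_newline)   # ['1', '2', '', '']

# rstrip('\n') removes every trailing newline; removesuffix('\n') removes at most one.
double = "x\n\n"
print(repr(double.rstrip('\n')))        # 'x'
print(repr(double.removesuffix('\n')))  # 'x\n'
```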

score 141 · Answer 3 · edited Sep 28 '16 at 10:50

You can use .rstrip('\n') to only remove newlines from the end of the string:

alist = []
for i in contents:  # contents is the list returned by t.readlines()
    alist.append(i.rstrip('\n'))

This leaves all other whitespace intact. If you don't care about whitespace at the start and end of your lines, then the big heavy hammer is called .strip().

However, since you are reading from a file and pulling everything into memory anyway, it is better to use the str.splitlines() method. It splits a string on line separators and returns a list of lines without those separators. Call it on the result of file.read() and skip file.readlines() entirely:

alist = t.read().splitlines()

answered Mar 05 '13 at 20:25 by Martijn Pieters · edited Sep 28 '16 at 10:50

11

`file.read().splitlines()` does the job perfectly, yet I need to visit this page EVERY time just to remind myself how to do this. God, I wish they included this in an intuitive way like `file.readlines(newlines=False)` – pcko1 Dec 16 '20 at 00:12
6

@pcko1: I don't feel that that's more intuitive though. I always use the file object as an iterable, anyway (so, would use `list(file)` instead of `file.readlines()`), and so know to expect newlines. Mostly, try to handle lines from a file *as a stream*, by iterating. `for line in file: dosomething(line)` or `[dosomething(line) for line in file]`, rather than read all lines into memory. – Martijn Pieters Dec 17 '20 at 13:47

score 27 · Answer 4 · answered Jan 18 '17 at 17:16

After opening the file, list comprehension can do this in one line:

fh=open('filename')
newlist = [line.rstrip() for line in fh.readlines()]
fh.close()

Just remember to close your file afterwards.

answered Jan 18 '17 at 17:16 by Lisle

9

_Just remember to close your file afterwards._ Or don't risk it, and use a context manager. – AMC Feb 15 '20 at 01:03
You don't even need readlines. The file itself is an iterator, so you can loop over it directly – Manny Fleurmond Oct 18 '22 at 03:47

score 16 · Answer 5 · answered Jul 06 '17 at 15:26

I used the strip function to get rid of the newline character, because splitlines() was throwing memory errors on a 4 GB file.

Sample Code:

with open('C:\\aapl.csv','r') as apple:
    for apps in apple.readlines():
        print(apps.strip())

answered Jul 06 '17 at 15:26 by Yogamurthy

3

By using `.readlines()` like this, you're effectively iterating over the entire file twice, while also keeping the whole thing in memory at once. – AMC Feb 15 '20 at 01:05

score 15 · Answer 6 · answered Mar 05 '13 at 20:23

For each string in your list, use .strip(), which removes whitespace from the beginning and end of the string:

alist = []
for i in contents:
    alist.append(i.strip())

But depending on your use case, you might be better off using something like numpy.loadtxt or even numpy.genfromtxt if you need a nice array of the data you're reading from the file.

score 11 · Answer 7 · answered Mar 05 '13 at 22:53

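with open('bvc.txt') as f:
    # str.rstrip replaces the Python 2-only string.rstrip function;
    # list() is needed because map() is lazy in Python 3
    alist = list(map(str.rstrip, f))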

Nota bene: rstrip() removes trailing whitespace, that is to say \f, \n, \r, \t, \v and the space character, but I suppose you only want to keep the significant characters in each line. In that case, map(str.strip, f) fits better, since it removes leading whitespace too.

If you really want to eliminate only the LF \n and CR \r characters, do:

with open('bvc.txt') as f:
    alist = f.read().splitlines()

splitlines() called without an argument drops the LF and CR characters (Windows writes files with CRLF at the end of lines, at least on my machine) but keeps other whitespace, notably spaces and tabs.

On the other hand,

with open('bvc.txt') as f:
    alist = f.read().splitlines(True)

has the same effect as

with open('bvc.txt') as f:
    alist = f.readlines()

that is to say, the LF and CR characters are kept.
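
A quick check of that equivalence, offered as a sketch: `io.StringIO` stands in for a real file here, and the sample strings are made up. For ordinary newline-terminated text the two calls agree, but `str.splitlines()` also treats a few other characters (such as the form feed `\x0c`) as line boundaries, so the results can differ on unusual input:

```python
import io

text = "Value1\nValue2\n"
# For plain '\n'-terminated text the two approaches give the same list.
same = text.splitlines(True) == io.StringIO(text).readlines()
print(same)           # True

# A form feed (\x0c) is a line boundary for str.splitlines(),
# while readlines() only breaks on newlines.
odd = "a\x0cb\n"
split_result = odd.splitlines(True)
read_result = io.StringIO(odd).readlines()
print(split_result)   # ['a\x0c', 'b\n']
print(read_result)    # ['a\x0cb\n']
```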

score 6 · Answer 8 · answered Sep 27 '16 at 19:46

I had the same problem and I found the following solution to be very efficient. I hope it helps you or anyone else who wants to do the same thing.

First of all, I would start with a "with" statement, as it ensures the file is properly opened and closed.

It should look something like this:

with open("filename.txt", "r+") as f:
    contents = [x.strip() for x in f.readlines()]

If you want to convert those strings (every item in the contents list is a string) to integers or floats, you can do the following:

contents = [float(x) for x in contents]

Use int instead of float if you want to convert to integer.
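
As a hedged sketch of the conversion step: `float()` (and `int()`) ignore surrounding whitespace, so the strip and the conversion can be combined in one pass. `io.StringIO` stands in for the real file and the numbers are made up:

```python
import io

# io.StringIO stands in for open('filename.txt'); the numbers are made up.
fake_file = io.StringIO("1.5\n2\n\n3.25\n")

# float() ignores surrounding whitespace, so the newline needs no explicit strip.
# The `if line.strip()` filter skips blank lines, which float() would reject.
numbers = [float(line) for line in fake_file if line.strip()]
print(numbers)  # [1.5, 2.0, 3.25]
```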

It's my first answer in SO, so sorry if it's not in the proper formatting.

`f.read().splitlines()` will be more efficient, I guess. And for int or float conversion, `map(int, f.read().splitlines())` might be better. — thiruvenkadam, Sep 28 '16 at 10:58
By using .readlines() like this, you're effectively iterating over the entire file twice, while also keeping the whole thing in memory at once. — AMC, Feb 15 '20 at 01:05

score 2 · Answer 9 · answered Aug 25 '15 at 20:49

I recently used this to read all the lines from a file:

alist = open('maze.txt').read().split()

or you can use this for that little bit of extra added safety:

with open('maze.txt') as f:
    alist = f.read().split()

It doesn't work with whitespace in-between text in a single line, but it looks like your example file might not have whitespace splitting the values. It is a simple solution and it returns an accurate list of values, and does not add an empty string: '' for every empty line, such as a newline at the end of the file.
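
The difference described above can be seen in a short sketch; the sample text is made up and includes a value with an internal space and an empty line:

```python
# Made-up sample: one value contains a space, and there is one empty line.
text = "Value 1\nValue2\n\nValue3\n"

# split() breaks on any run of whitespace, so "Value 1" becomes two items
# and the empty line disappears.
by_whitespace = text.split()
print(by_whitespace)  # ['Value', '1', 'Value2', 'Value3']

# splitlines() breaks only at line boundaries, keeping the inner space
# and the empty line, without adding an extra item for the final newline.
by_line = text.splitlines()
print(by_line)        # ['Value 1', 'Value2', '', 'Value3']
```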

Are there even any benefits to using this solution? You avoid typing a whole 5 characters? — AMC, Feb 15 '20 at 01:07

score -1 · Answer 10 · answered Apr 28 '16 at 15:20

with open('D:\\file.txt', 'r') as f1:
    lines = f1.readlines()
lines = [s[:-1] for s in lines]

answered Apr 28 '16 at 15:20 by jperezmartin

3

By using .readlines() like this, you're effectively iterating over the entire file twice, while also keeping the whole thing in memory at once. Not only that, but using `s[:-1]` can remove the last non-newline character of the file. I see no benefit to using this over any other solution. – AMC Feb 15 '20 at 01:07
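
A small sketch of the pitfall this comment points out, assuming a file whose last line has no trailing newline; `io.StringIO` stands in for the real file:

```python
import io

# The last line has no trailing newline, as some editors and tools produce.
fake_file = io.StringIO("Value1\nValue2")
lines = fake_file.readlines()

# Slicing off the last character blindly eats the '2' on the final line.
sliced = [s[:-1] for s in lines]
print(sliced)    # ['Value1', 'Value']

# rstrip('\n') removes a newline only when one is actually present.
stripped = [s.rstrip('\n') for s in lines]
print(stripped)  # ['Value1', 'Value2']
```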

score -3 · Answer 11 · edited Jun 16 '16 at 18:21

The easiest way to do this is to write file.readline()[0:-1]. This reads a line and keeps everything except its last character, which is the newline.

answered Jun 16 '16 at 17:17 by anon · edited Jun 16 '16 at 18:21 by Muhammad Abdul Arif Sarker

25

The last character isn't always a newline. It is possible to create a text file that doesn't end in a newline (although most editors do include one). – Flimm Jan 17 '18 at 13:42
_which is the newline._ **The** newline? This question is quite clearly about a file with multiple lines, where we want to remove the newline from each one. – AMC Feb 15 '20 at 01:09
