Removing a space from a list element?

Question

I wanted to remove the spaces from my list element and separate them into different list elements. For example, if I have the list:

['Hello world', 'testing', 'testing two']

I'd want the list to look like:

['Hello', 'world', 'testing', 'testing', 'two']

The issue I'm having is that I am reading from a file and have already stripped the newline characters, but when I try to strip the spaces, it doesn't seem to work. Below is my code:

with open(fname, 'r') as f:
  words = [line.strip().strip(' ') for line in f]
print words

This just prints the list as I described above, with the elements still containing spaces.

If anyone could help me out, that'd be great! Thanks!

possible duplicate of [returning a list of words after reading a file in python](http://stackoverflow.com/questions/13259288/returning-a-list-of-words-after-reading-a-file-in-python) — kojiro, Oct 20 '13 at 01:53

score 3 · Answer 1 · answered Oct 20 '13 at 01:47


I would do something like this:

" ".join(list).split(" ")

That will join the list together and then split it apart. There are probably somewhat more efficient ways, but this way is simple.
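One caveat worth sketching: `split(" ")` cuts at every single space, so a doubled space leaves an empty string in the result, while `split()` with no argument treats any run of whitespace as one separator. The `items` list below is made-up sample data, not the asker's file.

```python
# Hypothetical sample data with a doubled space in the first element.
items = ['Hello  world', 'testing', 'testing two']

joined = " ".join(items)

# split(" ") cuts at every single space, so the doubled space yields ''.
with_explicit_sep = joined.split(" ")

# split() with no argument collapses any run of whitespace.
with_default_sep = joined.split()

print(with_explicit_sep)  # ['Hello', '', 'world', 'testing', 'testing', 'two']
print(with_default_sep)   # ['Hello', 'world', 'testing', 'testing', 'two']
```

If the input may contain tabs or repeated spaces, the no-argument form is probably the safer choice.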

answered Oct 20 '13 at 01:47

Eric Pauley


Steven Rumbalski · Accepted Answer · 2013-10-20T03:33:33.390


split() splits on any white space by default, so you can do the whole file in one easy step.

words = f.read().split()

If you want to avoid reading the whole file into memory with f.read():

words = [word for line in f for word in line.split()]
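As a quick check that the two forms agree, here is a small sketch that uses `io.StringIO` as a stand-in for an open file. The sample text is an assumption for illustration.

```python
import io

# io.StringIO stands in for an open file; the text is made-up sample data.
text = "Hello world\ntesting\ntesting two\n"

# Whole-file approach: read everything, then split on any whitespace.
words_all_at_once = io.StringIO(text).read().split()

# Line-by-line approach: only one line is held in memory at a time.
f = io.StringIO(text)
words_by_line = [word for line in f for word in line.split()]

print(words_all_at_once)                   # ['Hello', 'world', 'testing', 'testing', 'two']
print(words_all_at_once == words_by_line)  # True
```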

edited Oct 20 '13 at 03:33

answered Oct 20 '13 at 01:54

Steven Rumbalski



I thought about posting this as an answer ... It can have some problems for really big files, but generally these days that's probably not a concern. – mgilson Oct 20 '13 at 02:00
@mgilson: I thought about the large file issue as well, but figured that if he has enough memory to hold all the words individually, he probably has enough memory for the whole chunk. – Steven Rumbalski Oct 20 '13 at 03:24

score 1 · Answer 3 · edited May 23 '17 at 11:57

.strip only removes characters from the beginning and end of a string. What you want is to split the string on whitespace:

lines_split = [line.split() for line in f]

This will give you a nested list which you can easily flatten. See for example this answer or this one.
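One possible way to flatten the nested list is a second comprehension. This is a sketch with made-up sample lines and is not necessarily the approach the linked answers take.

```python
# Made-up sample lines, as they might come from iterating over a file.
lines = ['Hello world\n', 'testing\n', 'testing two\n']

# One list of words per line.
lines_split = [line.split() for line in lines]

# Flatten: walk each inner list and collect its words in order.
flat = [word for parts in lines_split for word in parts]

print(lines_split)  # [['Hello', 'world'], ['testing'], ['testing', 'two']]
print(flat)         # ['Hello', 'world', 'testing', 'testing', 'two']
```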

My preferred approach here would be to write a simple generator to yield a word at a time. Then you can turn it into a list later if you need to:

def get_words(filename):
    with open(filename) as fin:
        for line in fin:
            for word in line.split():
                yield word

There's some magic you can do to condense this down with itertools, but this should suffice for now.
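For the curious, one way the itertools version might look is sketched below. The name `get_words_chained` is hypothetical, and `yield from` assumes Python 3.3 or later. The temp file exists only to make the example runnable.

```python
import itertools
import os
import tempfile

def get_words_chained(filename):
    """Yield one word at a time; a hypothetical itertools variant of get_words."""
    with open(filename) as fin:
        # chain.from_iterable lazily flattens the per-line word lists.
        # yield from (Python 3.3+) keeps the file open while words are consumed.
        yield from itertools.chain.from_iterable(line.split() for line in fin)

# Usage with a throwaway temp file holding made-up sample text.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("Hello world\ntesting\ntesting two\n")
    path = tmp.name

words = list(get_words_chained(path))
os.remove(path)
print(words)  # ['Hello', 'world', 'testing', 'testing', 'two']
```

Whether this reads better than the explicit nested loops is a matter of taste; both yield the same words in the same order.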

zwol · Answer 4 · 2013-10-20T01:52:56.883

You are looking for the split method. The simplest way to do what you want looks like this:

words = []
with open(fname) as f:
  for line in f:
    words.extend(line.split())

and the slightly cleverer method looks like this:

import itertools
with open(fname) as f:
  words = list(itertools.chain.from_iterable(l.split() for l in f))

I don't know which is faster. Note that when called without a separator argument, split effectively does what strip does as well as splitting on interior whitespace, so you needn't bother calling strip first.
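The note about strip can be checked with a short sketch. The sample line is made up, and the first expression mirrors the strip calls from the question.

```python
# Made-up sample line with leading/trailing whitespace and a newline.
line = "  Hello world \n"

# strip() only trims the ends, so the interior space survives.
stripped = line.strip().strip(' ')

# split() with no argument ignores leading/trailing whitespace on its own.
split_only = line.split()
strip_then_split = line.strip().split()

print(repr(stripped))                  # 'Hello world'
print(split_only)                      # ['Hello', 'world']
print(split_only == strip_then_split)  # True
```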

mshsayem · Answer 5 · 2013-10-20T02:23:39.147


I like Zonedabone's answer. But here is another way:

>>> from itertools import chain
>>> l = ['Hello world', 'testing', 'testing two']
>>> result = list(chain.from_iterable(w.split() for w in l))
# ['Hello', 'world', 'testing', 'testing', 'two']

edited Oct 20 '13 at 02:23

answered Oct 20 '13 at 01:52

mshsayem



for what it's worth, `chain.from_iterable(w.split() for w in l)` is generally preferable to `chain(*[...])`. The latter pretty much gets rid of all of the advantage of using iterable objects in the first place. – mgilson Oct 20 '13 at 02:12
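A rough illustration of the point in this comment: the sketch below records which input lines get processed when only the first two words are requested. The helper `split_lines` and the `consumed` list are hypothetical names added for the demonstration.

```python
from itertools import chain, islice

consumed = []  # records which lines have been split so far

def split_lines(lines):
    """Hypothetical helper: split each line, noting when it is processed."""
    for line in lines:
        consumed.append(line)
        yield line.split()

lines = ['Hello world', 'testing', 'testing two']

# chain.from_iterable pulls from the generator only as words are needed.
first_two_lazy = list(islice(chain.from_iterable(split_lines(lines)), 2))
lazy_consumed = list(consumed)

del consumed[:]  # reset the record

# chain(*...) has to unpack the whole generator before chaining starts.
first_two_eager = list(islice(chain(*split_lines(lines)), 2))
eager_consumed = list(consumed)

print(first_two_lazy)   # ['Hello', 'world']
print(lazy_consumed)    # ['Hello world']
print(eager_consumed)   # ['Hello world', 'testing', 'testing two']
```

Both calls return the same two words, but the unpacking form has already processed every line by the time the first word comes out.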
