How do I write a Python one-liner to create a list of all words in a file?

Question

Given an ASCII file, I'd like a Python one-liner to create a list of the words in the file.

Let tfile contain the following 2 lines

abc xyz abc mno
tuv xyz qrs abc

There are 8 words in the file and 5 unique words.

If I assign

file='tfile'

the following one-liner will create a set with the 5 unique words in tfile

s=set(open(file).read().split())

where the output is {'abc', 'mno', 'qrs', 'tuv', 'xyz'}

However if I try something similar to get a list of all words in the file, namely

l=list(open(file).read().split(" "))

I get the following

['abc', 'xyz', 'abc', 'mno\ntuv', 'xyz', 'qrs', 'abc\n']

which doesn't quite work because the newlines are kept: the last word of the first line is fused with the first word of the second line, and the final word has a newline appended to it.

If I add strip() to the statement as in

l=list(open(file).read().strip().split(" "))

I get the following, which is better, but the newline between the two lines is still there, joining the last word of the first line to the first word of the next line.

['abc', 'xyz', 'abc', 'mno\ntuv', 'xyz', 'qrs', 'abc']

So 2 questions: (1) is there a one-liner which does what I want? and (2) why does the set of unique words work so nicely, without getting any newline characters?

Before `split(' ')`, call `.replace('\n', ' ')` to replace line breaks with a space — Ulises Bussi, Nov 21 '21 at 02:45
Did you add an argument to `split` and not even notice, or is this someone else's code that you're trying to understand and modify? — TigerhawkT3, Nov 21 '21 at 03:05

score 0 · Accepted Answer · answered Nov 21 '21 at 02:46

You added " " as an argument to split in the second example. First, you have

s=set(open(file).read().split())

But then, you do

l=list(open(file).read().split(" "))

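The key is the split(" "). With no argument, Python splits on any run of whitespace (spaces, tabs, newlines) and ignores leading and trailing whitespace, which is why the set came out clean. With " " as the argument, it splits only on single space characters, so the newlines stay inside the words.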

So all you need is

l=list(open(file).read().split())
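
To see the difference without touching a file, here is a small sketch that holds the contents of tfile from the question in a string (the variable names are only for illustration) and applies both forms of split to it:

```python
# Contents of the example file from the question, held in a string
text = "abc xyz abc mno\ntuv xyz qrs abc\n"

# No argument: split on any run of whitespace (spaces, tabs, newlines);
# leading and trailing whitespace produce no empty items
words_any_ws = text.split()

# Explicit " ": split on single space characters only,
# so the newlines stay inside the resulting items
words_space_only = text.split(" ")

print(words_any_ws)
print(words_space_only)
```

Note that split() already returns a list, so the outer list(...) in the one-liner is harmless but not required.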

score 0 · Answer 2 · answered Nov 21 '21 at 02:51

If you want a list of unique words, you can first create a set and then convert to a list.

l=list(set(open(file).read().split()))
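
One caveat with going through a set is that the order of the words is not preserved. If the order of first appearance matters, a possible alternative (an assumption about what you might want, not something the question asked for) is dict.fromkeys, which keeps insertion order on Python 3.7 and later. A minimal sketch, again using a string in place of the file:

```python
# Contents of the example file from the question, held in a string
text = "abc xyz abc mno\ntuv xyz qrs abc\n"

# dict keys are unique and keep insertion order (Python 3.7+),
# so this yields each word once, in order of first appearance
unique_in_order = list(dict.fromkeys(text.split()))

print(unique_in_order)
```

With a real file, the same idea would be list(dict.fromkeys(open(file).read().split())).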

Mitchell Olislagers
