Select some words from an existing file in Python

Question

I want to pick out the words of a text file (text.txt) that end in a certain string, using 'endswith'. The problem is that my variable

h=[w for w in 'text.txt' if w.endswith('os')]

does not give what I want when I call it. On the other hand, if I open the file first and iterate over it,

f=open('text.txt')
h=[w for w in f if w.endswith('os')]

this does not work either. Should I convert the text into a list first?

Comment: I hope this is not a duplicate. It is close to this former question, but not the same.

Just save the contents of one row to a variable first. You might be surprised by what's at the end of a line where you THINK it ends with 'os'. – Paritosh Singh, Nov 27 '18 at 15:48
Every one of your answers works. Thank you. My question is why I should split the document before isolating the words ending in 'os'. – gibarian, Nov 27 '18 at 17:02
Each line of the file (each `w` in `for w in f`) typically ends with a hidden newline character such as `'\n'`. So none of the strings end with `'os'`; they end with `'os\n'` or something of that sort. While I've told you what's happened here, I highly recommend just looking at the variables: say `h = [w for w in f]`, then inspect `h[0]` to see the actual line in memory. – Paritosh Singh, Nov 27 '18 at 17:16

score 3 · Accepted Answer · answered Nov 27 '18 at 15:51

Open the file first and then process it like this:

with open('text.txt', 'r') as file:
    content = file.read()
h = [w for w in content.split() if w.endswith('os')]

Pedro Borges

Martin Frodl · Answer 2 · 2018-11-28T08:14:00.077

1

with open('text.txt') as f:
    words = [word for line in f for word in line.split() if word.endswith('os')]

Your first attempt does not read the file, or even open it. Instead, it loops over the characters of the string 'text.txt' and checks whether each of them ends with 'os'.

Your second attempt iterates over lines of the file, not words -- that's how a for loop works with a file handle.
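To make the difference concrete, here is a minimal sketch of what each loop actually yields. It assumes made-up file contents and uses `io.StringIO` as a stand-in for the opened text.txt, so it runs without any file on disk; iterating over it behaves like iterating over a real text file handle.

```python
import io

# Hypothetical file contents, standing in for the real text.txt.
sample = "dos gatos\nlibros y casas\n"

# First attempt: looping over the string 'text.txt' yields single characters.
chars = [w for w in 'text.txt']

# Second attempt: looping over a file object yields whole lines,
# each with its trailing newline still attached.
lines = [w for w in io.StringIO(sample)]

# Splitting every line on whitespace yields the individual words.
words = [word for line in io.StringIO(sample) for word in line.split()]
matches = [word for word in words if word.endswith('os')]

print(chars)
print(lines)
print(words)
print(matches)
```

Only the last list contains items that `endswith('os')` can sensibly be applied to: the characters are one letter long, and the lines end in a newline.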

edited Nov 28 '18 at 08:14

answered Nov 27 '18 at 15:47

Martin Frodl

score 1 · Answer 3 · answered Nov 27 '18 at 15:51

Splitting the separate words into a list (assuming they are separated by spaces):

f = open('text.txt').read().split(' ')

Then to get a list of the words ending in "os", like you had:

h=[w for w in f if w.endswith('os')]

degenTy

By splitting the words on spaces and nothing else, you fail to find words that end in `'os'` but are followed by another whitespace character, such as a line break. – Martin Frodl Nov 28 '18 at 08:24
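The point of this comment can be checked with a short sketch. The text below is invented for illustration and mixes spaces, a tab and line breaks; `split(' ')` cuts only at the space character, while `split()` with no argument cuts at any run of whitespace.

```python
# Hypothetical text containing a space, a tab and line breaks.
text = "dos gatos\nlibros\tviejos\n"

# split(' ') leaves the newline and tab inside the second item.
by_space = text.split(' ')

# split() treats spaces, tabs and newlines alike.
by_whitespace = text.split()

found_by_space = [w for w in by_space if w.endswith('os')]
found_by_whitespace = [w for w in by_whitespace if w.endswith('os')]

print(by_space)
print(found_by_space)
print(found_by_whitespace)
```

With this input, splitting on a single space finds only the first word, because everything after it is glued together into one item that ends in a newline.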

score 0 · Answer 4 · answered Nov 27 '18 at 15:51

f=open('text.txt')
h=[w for w in f if w.endswith('os')]

This should work in principle. Reasons it may not be working for you:

You should strip the line first. There may be hidden characters at the end, like "\n". You can use the rstrip() method for that, something like this:

h=[w.rstrip() for w in f if w.rstrip().endswith('os')]

After reading the file once, the file pointer reaches the end of the file (EOF), so any further read operations return nothing. To move the pointer back to the start of the file, either use the seek method or re-open the file.
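Both points can be seen in a small sketch. It assumes a throwaway file with one word per line (the contents and the temporary path are made up for the demonstration) and runs the same comprehension three times on one open handle.

```python
import os
import tempfile

# Create a temporary file with hypothetical contents, one word per line.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("gatos\ncasa\nlibros\n")
    path = tmp.name

with open(path) as f:
    # First pass: strip the trailing newline before testing the ending.
    first = [w.rstrip() for w in f if w.rstrip().endswith('os')]

    # Second pass on the same handle: the file is already at EOF.
    second = [w.rstrip() for w in f if w.rstrip().endswith('os')]

    # Rewind to the start of the file, then read again.
    f.seek(0)
    third = [w.rstrip() for w in f if w.rstrip().endswith('os')]

os.remove(path)  # clean up the temporary file

print(first)
print(second)
print(third)
```

Note that this approach tests whole lines, so it only matches individual words when each line holds exactly one word.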
