I have a text file which consists of names of different countries as follows:

[image: contents of the text file, one country name per line]

I need to extract all of these names and store them in a list using Python.

Python Code:

with open('a.txt') as x:
    b = [word for line in x for word in line.split()]
    print(b)

Problem: The code above runs without errors, but it splits every line on whitespace, so a name that contains a space is stored as several separate list items. I want to read the names line by line and store each full name as a single item instead.

For example, I want to store Antigua & Deps as a single item in the list, but it is currently stored as three separate items.

Can anyone please help me out with this problem?

Vivek

3 Answers

You can call readlines() directly on the file object:

with open('a.txt') as x:
    b = x.readlines()

Each item will then end with a trailing newline character, which you can avoid with:

with open('a.txt') as x:
    b = [line.strip() for line in x]

Why do you get your output?

You are doing for word in line.split(), which in fact splits each line on whitespace.
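
To see the difference on a single line, here is a small sketch. The sample string is a made-up stand-in for one line of the file, not taken from the actual a.txt.

```python
# Hypothetical sample line, standing in for one line read from the file.
line = "Antigua & Deps\n"

# split() with no arguments breaks the line at every run of whitespace.
words = line.split()

# strip() only removes leading and trailing whitespace, including the newline.
name = line.strip()

print(words)  # ['Antigua', '&', 'Deps']
print(name)   # Antigua & Deps
```

So split() is what turns one name into three items, while strip() keeps the name intact.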

Austin
  • Thanks @Vivek for the suggestion. I would say descriptive names were more meaningful though. – Austin Mar 30 '20 at 08:54

You can just keep each line whole:

b = [line.strip() for line in x]
kederrac

If you want to read the file line by line and keep control over what happens to each line, you can use this code:

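countries = []
# The with block closes the file automatically, even if an error occurs.
with open("countries.txt", "r") as f:
    for line in f:
        # strip() removes the trailing newline and any surrounding whitespace.
        countries.append(line.strip())

print(countries)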
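
As one illustration of the extra control an explicit loop gives, the sketch below skips blank lines while reading. It is self-contained: it writes a small sample file first. The helper name read_names and the sample contents are assumptions made for the example, not part of the original answer.

```python
import os
import tempfile


def read_names(path):
    """Return the non-empty lines of the file at path, stripped of surrounding whitespace."""
    names = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            name = line.strip()
            if name:  # skip lines that are empty or whitespace only
                names.append(name)
    return names


# Write a small sample file so the example runs on its own (made-up contents).
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "countries.txt")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("Albania\nAntigua & Deps\n\nArgentina\n")
    result = read_names(sample_path)

print(result)  # ['Albania', 'Antigua & Deps', 'Argentina']
```

If blank lines should be kept, dropping the if check gives the same result as the list comprehension shown in the other answers.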
Dickens A S
  • I have tried this method. But the problem is that \n is being added at the end of each element which makes things tough from the next part of the code. – Vivek Mar 29 '20 at 12:14
  • Add `.strip()` to the string, which removes the CRLF. I have modified the code. – Dickens A S Mar 29 '20 at 12:17