1

I'm new to Python and am currently learning about lists. I am trying to add every word from the "words.txt" file to a list. But when I tried

words += word
every character became an element of the list. I tried
 words += [word] 
and it worked. But I want to know why the first way makes every character an element rather than every word?
fhand = open("words.txt")
words = list()
for line in fhand:
    for word in line.split():
        words += [word]
print(words)
  • A list is a mutable object, while a string is an immutable, iterable object. So you are adding an iterable to a list with +=, which means the string is iterated over and each of its characters is added to the list as a separate element. – Abhishek Kulkarni Feb 01 '19 at 06:24

6 Answers

1

When you want to add a word to a list as a single element, you would usually use .append():

fhand = open("words.txt")
words = list()
for line in fhand:
    for word in line.split():
        words.append(word)
print(words)
Roy Lee
1

word is a string, which is itself a sequence of characters: word[0] gives you the first character of the word. The += operator on a list iterates over its right-hand operand and adds each item it finds, so words += word adds the characters one by one and you end up with a list of characters. In the second case you explicitly wrap the string in a list, [word], so += iterates over that one-element list and adds the whole string, and you get a list of strings. If that is still not clear, feel free to comment.

anand_v.singh
1

With +=, a list is extended with the items of whatever iterable is on the right-hand side. When you add a string to a list this way, the string is treated as a sequence of characters, so its characters are added as separate elements. In the second way you wrap the word in a list, so the word itself is the element and the whole word is added.

sara sara
0

In Python, a string is a sequence of Unicode characters, although it is a different data type from a list. So when you do words += word, each character of the string is appended to the list separately. But when you do words += [word], [word] is a list containing one single string, so only one item is appended to the list.

BobLoblaw
0

The += operator on a list is equivalent to calling its extend method, which takes an iterable as an argument and appends each item to the list. With words += word, the right-hand operand of += is a string, which is an iterable of its characters, so the statement is equivalent to writing words.extend(word).
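
To make that concrete, here is a small sketch comparing += with extend. The variable names (words_ext, words_list, nums, plus_failed) are only for illustration and are not from the question:

```python
# += on a list behaves like extend(): it iterates over the right-hand side.
words = []
words += "hi"            # iterates over the string, character by character
print(words)             # ['h', 'i']

words_ext = []
words_ext.extend("hi")   # same effect, spelled as a method call
print(words_ext)         # ['h', 'i']

# Wrapping the string in a list makes the whole string the only item.
words_list = []
words_list += ["hi"]
print(words_list)        # ['hi']

# Any iterable works on the right-hand side, not only strings and lists.
nums = []
nums += (1, 2)
nums += range(3, 5)
print(nums)              # [1, 2, 3, 4]

# The plain + operator is stricter: list + str raises a TypeError.
plus_failed = False
try:
    combined = [] + "hi"
except TypeError:
    plus_failed = True
print(plus_failed)       # True
```

So the behaviour in the question is not special to strings: += simply adds whatever items the right-hand iterable yields.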

John B
0

Let's go through your code:

Suppose words.txt consists of the following text:

hello, I am Solomon
Nice to meet you Solomon

So, you first open this file with fhand = open("words.txt"), then you initialize a list called words:

fhand = open("words.txt")
words = list()

Suggestion: Here it's advisable to use the with context manager to open the file. That way, you don't have to close the file explicitly later. If you just use open() as above, you have to close the file at the end with fhand.close().

with open("words.txt", 'r') as fhand:
    #<--code--->

In the next line, you iterate over each line in fhand. Let's print line, which shows each line of the text (the blank lines in the output appear because each line still ends with its own newline character):

for line in fhand:
    print(line)
#Output:
hello, I am Solomon

Nice to meet you Solomon

Then you are iterating over line.split() which splits the above lines of text into individual lists of words. If we print line.split():

for line in fhand:
    print(line.split())
#Output:
['hello,', 'I', 'am', 'Solomon']
['Nice', 'to', 'meet', 'you', 'Solomon']

Suggestion: You could also make use of splitlines() to break the text at line boundaries into a list of lines. This is different from split(), as it does not break each line into words. splitlines() also preserves leading and trailing spaces, so you will have to get rid of them with strip(' ') if your text has any spaces at the beginning or end. This method has no side effects and you can still use it:

for line_str in fhand:
    print(line_str.strip(' ').splitlines())
    #Output:
    ['hello, I am Solomon']
    ['Nice to meet you Solomon']
    for line in line_str.strip(' ').splitlines(): #watch the indentation
        print(line.split())
        #Output:
        ['hello,', 'I', 'am', 'Solomon']
        ['Nice', 'to', 'meet', 'you', 'Solomon']
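
The difference between splitlines() and split() may be easier to see on a single multi-line string. This is a minimal sketch that uses an in-memory string (text, a name made up for this example) in place of the file contents:

```python
# Stand-in for the contents of words.txt (assumed text from the example above).
text = "hello, I am Solomon\nNice to meet you Solomon\n"

# splitlines() breaks only at line boundaries and keeps each line whole.
lines = text.splitlines()
print(lines)
# ['hello, I am Solomon', 'Nice to meet you Solomon']

# split() with no arguments breaks at any run of whitespace, newlines included.
all_words = text.split()
print(all_words)
# ['hello,', 'I', 'am', 'Solomon', 'Nice', 'to', 'meet', 'you', 'Solomon']

# Combining the two gives one list of words per line.
words_per_line = [line.split() for line in lines]
print(words_per_line)
# [['hello,', 'I', 'am', 'Solomon'], ['Nice', 'to', 'meet', 'you', 'Solomon']]
```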

In the next piece of code you iterate over each word in line.split() (as shown above, this method returns a list of words) and then extend words with that word using +=. Because += on a list iterates over its right-hand operand, and iterating over a string yields its characters, every letter of every word is added as a separate element. So you end up with a list of letters:

for word in line.split():
    words+=word
#Output:
['h', 'e', 'l', 'l', 'o', ',', 'I', 'a', 'm', 'S', 'o', 'l', 'o', 'm', 'o', 'n', 'N', 'i', 'c', 'e', 't', 'o', 'm', 'e', 'e', 't', 'y', 'o', 'u', 'S', 'o', 'l', 'o', 'm', 'o', 'n']

But most likely you are expecting all the words in a single list, words. We can achieve this with the append() method, as it takes each word in line.split() and simply appends it to words (that is, adds it to the end of the list):

for word in line.split():
    words.append(word)
#Output:
['hello,', 'I', 'am', 'Solomon', 'Nice', 'to', 'meet', 'you', 'Solomon']

And then when we look at the other variation words += [word]:

for word in line.split():
    words += [word]
print(words)
#Output:
['hello,', 'I', 'am', 'Solomon', 'Nice', 'to', 'meet', 'you', 'Solomon']

This has the same effect as append(). Why is that so? Let's print [word], which is nothing but a one-element list containing the current word. The result above is expected because you are taking each word from line.split(), wrapping it in a list, and concatenating that list to words:

print([word])
#Output:
['hello,']
['I']
['am']
['Solomon']
['Nice']
['to']
['meet']
['you']
['Solomon']

words += [word] gives the same result here as words = words + [word], although += extends the existing list in place while + builds a new list. To see how this concatenation works, consider the following example:

words = list()
word = ["Hello"]
concat_words = words + word
print(concat_words)
#['Hello']
another_word = ["World"]
concat_some_more_words = words + another_word
print(concat_some_more_words)
#['World']
final_concatenation = concat_words + concat_some_more_words
print(final_concatenation)
#Output:
['Hello', 'World']

Let's try append() on this example:

words1 = list()
words_splitted = ["Hello", "World"]
for word in words_splitted:
  words1.append(word)
print(words1)
#['Hello', 'World']

This shows that concatenation gives the same result as appending here, but it is recommended practice to use append() when adding a single item to a list:

print(words1==final_concatenation)
#True
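
One caveat when treating words += [word] and words = words + [word] as interchangeable: they produce lists with the same contents, but += modifies the existing list object, whereas + creates a new one. The sketch below uses a second name, alias (hypothetical, only for this illustration), to make the difference visible:

```python
words = ["Hello"]
alias = words                 # a second name for the same list object

words += ["World"]            # extends the existing list in place
print(alias)                  # ['Hello', 'World']
print(alias is words)         # True

words = words + ["again"]     # builds a new list and rebinds the name words
print(alias)                  # ['Hello', 'World']
print(words)                  # ['Hello', 'World', 'again']
print(alias is words)         # False
```

In the loop from the question this makes no visible difference, because nothing else refers to words.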

Returning to the original question, let's make the whole code more compact using a list comprehension:

with open("words.txt", 'r') as fhand:
    words = [word for line in fhand for word in line.split()]
print(words)
#Output:
['hello,', 'I', 'am', 'Solomon', 'Nice', 'to', 'meet', 'you', 'Solomon']

You will notice I've used the with context manager, which leaves closing the file to Python once the job is done (when execution exits the context). Next, I've created the list words with the same two loops written inside the brackets. This is called a list comprehension and is one of the most powerful features in Python. It makes the code more compact, easy to read, and usually faster than appending in an explicit loop.

Finally, initializing with words = [] is cleaner than words = list(). It is also slightly faster, since it avoids a name lookup and a function call.
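
If you want to check the speed claim yourself, here is a rough sketch using the standard timeit module. The exact numbers depend on your machine and Python version, so no expected output is shown; the variable names literal_time and call_time are made up for this example:

```python
import timeit

# Time one million creations of an empty list with each spelling.
literal_time = timeit.timeit("[]", number=1_000_000)
call_time = timeit.timeit("list()", number=1_000_000)

print(f"[]     : {literal_time:.3f} s")
print(f"list() : {call_time:.3f} s")
```

Either way the difference is tiny in absolute terms, so readability is the stronger reason to prefer [].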

amanb
  • Thank you for a very clear and constructive answer!!! I really appreciate your time and effort. –  Feb 01 '19 at 09:51