3

I have a list `word_list = ['cat', 'dog', 'rabbit']`. I want to use a list comprehension to collect each individual character from the words in the list while removing any duplicate characters. This is my code:

word_list = ['cat', 'dog', 'rabbit']
letter_list = [""]
letter_list = [letter for word in word_list for letter in word if letter not in letter_list ]
print(letter_list)

This prints `['c', 'a', 't', 'd', 'o', 'g', 'r', 'a', 'b', 'b', 'i', 't']`, which is not the desired result `['c', 'a', 't', 'd', 'o', 'g', 'r', 'b', 'i']`, and I can't figure out why.

mkrieger1
  • BTW, what's the reason for putting an empty string in `letter_list`? – Barmar Aug 29 '23 at 01:06
  • [Similar question](https://stackoverflow.com/q/73753000/12671057) (not closing this as duplicate due to your implicit "why" question about your attempt's failure). – Kelly Bundy Aug 29 '23 at 03:21

6 Answers

4

You can't do this with a list comprehension that checks against its own result, because `letter not in letter_list` refers to the original value of `letter_list`. It's not getting updated as you go, because the `letter_list =` assignment doesn't happen until the list comprehension completes.
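To see this concretely, here is a minimal sketch (the name `result` is introduced only for illustration) showing that `letter_list` still holds its original value while the comprehension runs, so the `not in` check never filters out any letter:

```python
word_list = ['cat', 'dog', 'rabbit']
letter_list = [""]

# The comprehension builds a brand-new list; letter_list is not touched
# until (and unless) we assign the finished result back to it.
result = [letter for word in word_list for letter in word
          if letter not in letter_list]

print(letter_list)  # [''] : unchanged during the whole comprehension
print(result)       # every letter, duplicates included
```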

Use an ordinary for loop so you can update as you go.

letter_list = []
for word in word_list:
    for letter in word:
        if letter not in letter_list:
            letter_list.append(letter)
Barmar
4

It is technically possible to implement deduplication with a list comprehension if initializing some variables beforehand is allowed.

You can use a set `seen` to keep track of letters already encountered, and a second set `include` that is non-empty only when the current letter had not been seen before it was added to `seen`:

seen = set()
include = set()
print([
    letter for word in word_list for letter in word
    if (
        include.clear() if letter in seen else include.add(1),
        seen.add(letter)
    ) and include
])
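A more compact variant of the same idea is sketched below. It assumes you are comfortable relying on `set.add` returning `None`; the name `unique_letters` is introduced only for illustration.

```python
word_list = ['cat', 'dog', 'rabbit']
seen = set()

# set.add() returns None, so `not seen.add(letter)` is always True.
# Because `and` short-circuits, add() only runs for letters not yet seen.
unique_letters = [
    letter for word in word_list for letter in word
    if letter not in seen and not seen.add(letter)
]

print(unique_letters)  # ['c', 'a', 't', 'd', 'o', 'g', 'r', 'b', 'i']
```

Like the version above, this still depends on a side effect inside the comprehension.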

Since Python 3.8 you can also use an assignment expression to avoid having to rely on side effects of functions, which are generally discouraged in a list comprehension:

seen = set()
print([
    letter for word in word_list for letter in word
    if (
        include := letter not in seen,
        seen := seen | {letter}
    ) and include
])

But if you are not dead set on implementing the deduplication with a list comprehension, it would be cleaner to use the `dict.fromkeys` method instead, because dict keys are always unique and have been guaranteed to preserve insertion order since Python 3.7:

from itertools import chain
print([*{}.fromkeys(chain(*word_list))])
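For reuse, the same approach could be wrapped in a small helper. This is only a sketch: the function name `unique_letters` is hypothetical, and it uses `chain.from_iterable` rather than argument unpacking so the word list is consumed lazily.

```python
from itertools import chain

def unique_letters(words):
    """Return the distinct letters of all words, in first-seen order."""
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so building a dict from the letters drops later duplicates.
    return list(dict.fromkeys(chain.from_iterable(words)))

print(unique_letters(['cat', 'dog', 'rabbit']))
# ['c', 'a', 't', 'd', 'o', 'g', 'r', 'b', 'i']
```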

blhsing
2

When you refer to letter_list from the list comprehension, it's referring to the contents of letter_list before the list comprehension, i.e. [""]. Instead, you can use a set.

letter_list = {letter for word in word_list for letter in word}

This gives the unique letters in no particular order.

In particular, the list comprehension is not the same as this loop, which updates letter_list as it goes:

word_list = ['cat', 'dog', 'rabbit']
letter_list = [""]
for word in word_list:
    for letter in word:
        if letter not in letter_list:
            letter_list.append(letter)

An equivalent would be

word_list = ['cat', 'dog', 'rabbit']
letter_list = [""]

temp_letter_list = []  # the comprehension builds a new, initially empty list
for word in word_list:
    for letter in word:
        if letter not in letter_list:
            temp_letter_list.append(letter)
letter_list = temp_letter_list

Note the use of a temporary list to represent the list comprehension: the check is always made against the original letter_list, which never changes inside the loop.

If you want a list rather than a set, use list(letter_list), and if you want the letters in alphabetical order, use sorted(letter_list). Neither restores the order in which the letters first appeared.
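As a short sketch of those two conversions (the names `unique`, `as_list`, and `alphabetical` are introduced only for illustration):

```python
word_list = ['cat', 'dog', 'rabbit']

# A set comprehension removes duplicates but has no defined order.
unique = {letter for word in word_list for letter in word}

as_list = list(unique)         # a list, in arbitrary order
alphabetical = sorted(unique)  # a list, sorted alphabetically

print(alphabetical)  # ['a', 'b', 'c', 'd', 'g', 'i', 'o', 'r', 't']
```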

David Waterworth
1

Lists are perfectly happy to keep duplicates. So, make a set out of it.

word_list = ['cat', 'dog', 'rabbit']
letter_list = list({letter for word in word_list for letter in word})
print(letter_list)
Tim Roberts
  • Whilst the answer works, it's important to note doing it this way causes you to lose the order of the letters. – PacketLoss Aug 29 '23 at 01:03
1

You can build a list containing all the letters and then convert it to a set, which removes every duplicate letter (although the original order of the letters is lost):

word_list = ['cat', 'dog', 'rabbit']
letter_list = [letter for word in word_list for letter in word]
print(list(set(letter_list))) 
0

In an assignment like `myList = [ ... ]`, the whole list on the right side is built before the variable is assigned, so the comprehension can't see the new content.

But if you use the extend method, which expects an iterable as parameter, each item in the iterable is added as it is fetched so your comprehension will have visibility on items of the target list as they are added.

This is very close to what you already had with only the method being different:

word_list = ['cat', 'dog', 'rabbit']

letter_list = []
letter_list.extend(letter  for word   in word_list 
                           for letter in word if letter not in letter_list)

print(letter_list)
['c', 'a', 't', 'd', 'o', 'g', 'r', 'b', 'i'] 

A more efficient approach is to use a dictionary to ensure that the letters are unique (while keeping the original order):

word_list = ['cat', 'dog', 'rabbit']

*letter_list, = dict.fromkeys(letter for word in word_list for letter in word)

print(letter_list)
['c', 'a', 't', 'd', 'o', 'g', 'r', 'b', 'i'] 
Alain T.
  • I like the extend-trick, too, but you're really obsessed with it, aren't you? :-) How many times do you think you've used it? And have you considered `+=`? – Kelly Bundy Aug 29 '23 at 03:28
  • No, I never considered `+=`, I always assumed it would pre-build the list before extending. But I tested it just now and it does work like `.extend()` with an iterator. That's going to be my new golden hammer from now on. Thanks. – Alain T. Aug 29 '23 at 11:51
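The `+=` variant discussed in the comments above can be sketched like this. It relies on the list being extended item by item as the generator is consumed, which is how CPython behaves; treat that as an assumption about the implementation rather than a documented guarantee.

```python
word_list = ['cat', 'dog', 'rabbit']

letter_list = []
# In-place extension pulls items from the generator one at a time,
# so each `not in letter_list` check sees the letters added so far.
letter_list += (letter for word in word_list
                for letter in word if letter not in letter_list)

print(letter_list)
```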