I have a list whose last element is a string of comma-separated items:

temp = ['AAA', 'BBB', 'CCC-DDD', 'EE,FFF,FFF,EE']

Now I want to remove the duplicates in that column.

I tried to make a list out of every column:

    e = [s.split(',') for s in temp]
    print(e)

Which gave me:

[['AAA'], ['BBB'], ['CCC-DDD'], ['EE', 'FFF', 'FFF', 'EE']]

Now I tried to remove the duplicates with:

    y = list(set(e))
    print(y)

Which resulted in an error:

TypeError: unhashable type: 'list'

I'd appreciate any help.

Edit:

I didn't say exactly what the end result should be. The list should look like this:

temp = ['AAA', 'BBB', 'CCC-DDD', 'EE', 'FFF']

Only the duplicates in the last element should be removed.

Jean-François Fabre
Goodybar

4 Answers

3

Apply set to each inner list, not to the list of lists. You want each set to contain the strings of one list, not the lists themselves (lists are unhashable, which is what the TypeError is telling you).

e = [list(set(x)) for x in e]
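To see why the original attempt failed, here is a small sketch (variable names `failed` and `deduped` are only for illustration). It shows that building a set from the outer list raises the TypeError, while building one set per inner list works because strings are hashable. `sorted` is used here only to make the printed order predictable.

```python
temp = ['AAA', 'BBB', 'CCC-DDD', 'EE,FFF,FFF,EE']
e = [s.split(',') for s in temp]

# set(e) has to hash every inner list; lists are mutable and therefore unhashable.
try:
    set(e)
    failed = False
except TypeError as exc:
    failed = True
    print(exc)

# The strings inside each inner list are hashable, so one set per list works.
deduped = [sorted(set(parts)) for parts in e]
print(deduped)  # [['AAA'], ['BBB'], ['CCC-DDD'], ['EE', 'FFF']]
```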

You can do it directly as well:

e = [list(set(s.split(','))) for s in temp]

>>> e
[['AAA'], ['BBB'], ['CCC-DDD'], ['EE', 'FFF']]

You may want sorted(set(s.split(','))) instead to get lexicographic order (sets aren't ordered, even in Python 3.7).

For a flat, ordered list, build a flat set comprehension and sort it:

e = sorted({x for s in temp for x in s.split(',')})

result:

['AAA', 'BBB', 'CCC-DDD', 'EE', 'FFF']
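If you would rather keep the items in their original order of appearance than sort them, one alternative (not what the code above does) is `dict.fromkeys`. This is a minimal sketch that assumes Python 3.7+, where dicts preserve insertion order; the name `flat` is just for illustration.

```python
temp = ['AAA', 'BBB', 'CCC-DDD', 'EE,FFF,FFF,EE']

# dict keys are unique and keep insertion order (Python 3.7+),
# so dict.fromkeys drops later duplicates while preserving first-seen order.
flat = list(dict.fromkeys(x for s in temp for x in s.split(',')))
print(flat)  # ['AAA', 'BBB', 'CCC-DDD', 'EE', 'FFF']
```

For this input the result happens to match the sorted version. The two would differ for data that is not already in lexicographic order.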
cs95
Jean-François Fabre
  • Actually don't need lists in a list at all. Is there a way to apply the set and remove the duplicates without list in list? – Goodybar Jan 11 '19 at 23:06
0

Here is a solution that uses itertools.chain:

import itertools

temp = ['AAA', 'BBB', 'CCC-DDD', 'EE,FFF,FFF,EE']
y = list(set(itertools.chain(*[s.split(',') for s in temp])))
# e.g. ['EE', 'FFF', 'AAA', 'BBB', 'CCC-DDD'] (set order is arbitrary)
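A possible variation on the same idea: `itertools.chain.from_iterable` accepts the generator directly, so the intermediate list and the `*` unpacking are not needed. The sketch below also sorts the result, which is my addition to make the output deterministic.

```python
from itertools import chain

temp = ['AAA', 'BBB', 'CCC-DDD', 'EE,FFF,FFF,EE']

# chain.from_iterable flattens the split pieces lazily, one element at a time.
# sorted() is added here only so the order is predictable.
y = sorted(set(chain.from_iterable(s.split(',') for s in temp)))
print(y)  # ['AAA', 'BBB', 'CCC-DDD', 'EE', 'FFF']
```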
Grigoriy Mikhalkin
0
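a = ['AAA', 'BBB', 'CCC-DDD', 'EE,FFF,FFF,EE']
b = [s.split(',') for s in a]  # split every element on commas
c = []
for i in b:
    c.extend(i)  # flatten the pieces into one list
c = list(set(c))  # drop duplicates; the resulting order is arbitrary
print(c)
# e.g. ['EE', 'FFF', 'AAA', 'BBB', 'CCC-DDD']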
Chris
0

Here is a pure functional way to do it in Python:

from functools import partial

split = partial(str.split, sep=',')

list(map(list, map(set, (map(split, temp)))))
[['AAA'], ['BBB'], ['CCC-DDD'], ['EE', 'FFF']]

Or, since I see the asker doesn't need lists inside a list:

from itertools import chain

list(chain(*map(set, (map(split, temp)))))
['AAA', 'BBB', 'CCC-DDD', 'EE', 'FFF']
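One thing worth keeping in mind: applying set to each element separately only removes duplicates within that element. That is what the question asks for, but if the same item could also appear in two different elements it would survive. A small sketch with a hypothetical input (`sample` is not from the question) shows the difference; `sorted` is used only to make the output order predictable.

```python
from itertools import chain

# Hypothetical input: 'EE' appears in two different elements.
sample = ['AAA', 'EE', 'EE,FFF,FFF,EE']

# De-duplicate inside each element, then flatten.
per_element = list(chain.from_iterable(sorted(set(s.split(','))) for s in sample))

# Flatten first, then de-duplicate across the whole list.
overall = sorted(set(chain.from_iterable(s.split(',') for s in sample)))

print(per_element)  # ['AAA', 'EE', 'EE', 'FFF']
print(overall)      # ['AAA', 'EE', 'FFF']
```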
gold_cy