
I'm trying to write simpler code for adding unique elements to a Python list. I have a dataset that contains a list of dictionaries, and I'm trying to iterate through a list inside each dictionary.

Why doesn't this work? It's adding all the items, including the duplicates, instead of adding unique items.

unique_items = []
unique_items = [item for d in data for item in d['items'] if item not in unique_items]

vs. the longer form which works:

unique_items = []
for d in data:
    for item in d['items']:
        if (item not in unique_items):
            unique_items.append(item)

Is there a way of making this work using list comprehension, or am I stuck with using double for loops? I want to keep the ordering for this.

Here's the list of dictionaries:

[{"items":["apple", "banana"]}, {"items":["banana", "strawberry"]}, {"items":["blueberry", "kiwi", "apple"]}]

output should be ["apple", "banana", "strawberry", "blueberry", "kiwi"]

I noticed someone asking a similar question in another post (Python list comprehension, with unique items), but I was wondering whether there's another way to do it without OrderedDict, or whether that's the best way.

user3226932

3 Answers


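unique_items isn't updated while the list comprehension runs. The name is only rebound once the whole comprehension has been evaluated, so every `item not in unique_items` check looks in the original empty list and nothing is ever filtered out.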
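To see this in action, here is a small sketch using the data from the question. The name `result` is introduced here only so that both lists can be inspected afterwards; it is not in the original code.

```python
data = [
    {"items": ["apple", "banana"]},
    {"items": ["banana", "strawberry"]},
    {"items": ["blueberry", "kiwi", "apple"]},
]

unique_items = []
# The comprehension is evaluated in full before anything is assigned.
# While it runs, unique_items is still the empty list, so the filter
# never rejects an item.
result = [item for d in data for item in d["items"] if item not in unique_items]

print(unique_items)  # still empty: nothing was ever appended to it
print(result)        # every item, duplicates included
```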

I would do this instead:

data = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 1, 2, 3, 4,]

items = []
_ = [items.append(d) for d in data if d not in items]
print(items)

and I get:

[1, 2, 3, 4, 5, 6]

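But there are more efficient ways to do this anyway: each `d not in items` check scans the whole list, so this takes quadratic time on large inputs, and the comprehension is used only for its side effects while building a throwaway list of None values.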
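One possible alternative, sketched here for the data shape in the question, is a plain loop that tracks already-seen items in a set, so each membership test is constant time on average. The function name `unique_in_order` is made up for this example, and the approach assumes the items are hashable.

```python
def unique_in_order(data):
    """Return the items from every d['items'], first occurrence only, in order."""
    seen = set()   # fast membership checks (items must be hashable)
    result = []    # keeps the first-seen ordering
    for d in data:
        for item in d["items"]:
            if item not in seen:
                seen.add(item)
                result.append(item)
    return result


data = [
    {"items": ["apple", "banana"]},
    {"items": ["banana", "strawberry"]},
    {"items": ["blueberry", "kiwi", "apple"]},
]
print(unique_in_order(data))
```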

Paul H

Why not just use set?

e.g. -

>>> data = {1: {'items': [1, 2, 3, 4, 5]}, 2: {'items': [1, 2, 3, 4, 5]}}
>>> {val for item in data for val in data[item]['items']}
{1, 2, 3, 4, 5}

If you want a list:

>>> list({val for item in data for val in data[item]['items']})
[1, 2, 3, 4, 5]

Instead of the curly braces {} of a set comprehension, you could also pass a generator expression to the built-in set(), since the braces may be too obscure for some readers.

Here's a link to the syntax

Pythonista
  • It might be worth noting that this doesn't preserve the ordering, unlike the example code in the question or the answer to http://stackoverflow.com/questions/12681753/python-list-comprehension-with-unique-items – niemmi May 14 '16 at 00:49

The easiest way is to use OrderedDict:

from collections import OrderedDict
from itertools import chain

l = [{"items":["apple", "banana"]}, {"items":["banana", "strawberry"]}, {"items":["blueberry", "kiwi", "apple"]}]
list(OrderedDict.fromkeys(chain.from_iterable(d['items'] for d in l))) # ['apple', 'banana', 'strawberry', 'blueberry', 'kiwi']

If you want alternatives, check the OrderedSet recipe and the package based on it.
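As for doing it without OrderedDict: on Python 3.7 and later, a plain dict is guaranteed to keep insertion order, so the same idea should work with `dict.fromkeys`. This is a sketch under that version assumption; the variable names `data` and `unique_items` are taken from the question rather than from the code above.

```python
from itertools import chain

data = [
    {"items": ["apple", "banana"]},
    {"items": ["banana", "strawberry"]},
    {"items": ["blueberry", "kiwi", "apple"]},
]

# dict keys are unique and (on Python 3.7+) keep the order in which
# they were first inserted, so the keys form an ordered, de-duplicated list.
unique_items = list(dict.fromkeys(chain.from_iterable(d["items"] for d in data)))
print(unique_items)
```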

niemmi