Suppose that I have a list of lists, e.g.

example_list = [[0, 0], [0, 1], [0, 1], [5, 4]]

I want a reasonably fast method of obtaining a list formed exclusively of elements that appear at least twice in the original list. In this example, the new list would be

new_list = [[0, 1]]

since [0, 1] is the only duplicate entry. I have spent a lot of time on Stack Overflow looking for a solution, but none of the ones I found works for me (details below). How should I proceed in this instance?


Unsuccessful attempts. One solution which does work is to write something like

new_list = [x for x in example_list if example_list.count(x) > 1]

However, this is too slow for my purposes.

Another solution (suggested here) is to write

totals = {}
for k,v in example_list:
  totals[k] = totals.get(k,0) + v
totals.items()
[list(t) for t in totals.items()]
print(totals)

I may have misunderstood what the author is suggesting, but this doesn't work for me at all: it prints {0: 2, 5: 4} in the terminal.

A final solution (also suggested on this page) is to import Counter from collections and write

new_list = Counter(x for x, new_list in example_list for _ in xrange(new_list))
map(list, new_list.iteritems())

This raises an error on xrange and iteritems (I think it's a Python 3 thing?), so I tried

new_list = Counter(x for x, new_list in example_list for _ in range(new_list))
map(list, new_list.items())

which yielded Counter({5: 4, 0: 2}) (again!!), which is of course not what I am after...

afreelunch

2 Answers

You can use Counter to create a dictionary that counts how often each element of example_list occurs. Each inner list must first be converted to a tuple, because lists are not hashable and cannot be used as dictionary keys. Then you can keep only the elements whose count meets your criterion.

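from collections import Counter

# Lists are unhashable, so count tuple copies of the inner lists.
d = Counter(tuple(x) for x in example_list)
# Keep the entries seen at least twice, converted back to lists.
[list(k) for k, v in d.items() if v >= 2]
# [[0, 1]]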
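
As a quick check against the second input mentioned in the comment below, here is the same idea wrapped in a function. This is only a sketch: the name duplicated_sublists is made up for illustration, and it assumes every inner list contains hashable values.

```python
from collections import Counter


def duplicated_sublists(list_of_lists):
    """Return each inner list that occurs at least twice, listed once."""
    # Tuples are hashable, so they can serve as Counter keys.
    counts = Counter(tuple(inner) for inner in list_of_lists)
    # Counter keeps keys in first-seen order (Python 3.7+).
    return [list(key) for key, n in counts.items() if n >= 2]


print(duplicated_sublists([[0, 0], [0, 1], [0, 1], [5, 4]]))
# [[0, 1]]
print(duplicated_sublists([[0, 0], [0, 0], [1, 1], [1, 1]]))
# [[0, 0], [1, 1]]
```

Each inner list is hashed once, so this should scale roughly linearly with the length of the list, unlike calling count for every element.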
d.b
  • @d.b. Thanks, but that appears to produce [0, 1], as opposed to [[0, 1]]. Is it easy to fix this? It's a bit of an issue. For example, if the example list is [[0, 0], [0, 0], [1, 1], [1, 1]], then I want to get [[0, 0], [1, 1]], but your code gives me [0, 1]! – afreelunch Jul 08 '22 at 17:30

You could count the values of the inner lists. Rather than iterating over the outer list, you want to iterate over the values inside each inner list; itertools.chain.from_iterable does that for you. Feed its output into collections.Counter and you get a count of every value. A list comprehension can then select the values that occur at least twice, and you place that selection in an outer list.

>>> example_list = [[0, 0], [0, 1], [0, 1], [5, 4]]
>>> import collections
>>> import itertools
>>> counts = collections.Counter(itertools.chain.from_iterable(example_list))
>>> counts
Counter({0: 4, 1: 2, 5: 1, 4: 1})
>>> selected = [k for k,v in counts.items() if v >= 2]
>>> result = [selected]
>>> result
[[0, 1]]
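
The same steps can be wrapped in a function for reuse. This is a sketch only, and values_seen_twice is a made-up name. Note that it counts individual values rather than whole inner lists, so an input such as [[0, 0], [0, 0], [1, 1], [1, 1]] also yields [[0, 1]].

```python
import collections
import itertools


def values_seen_twice(list_of_lists):
    """Return one list holding every value that occurs at least twice."""
    # Flatten the inner lists into a single stream of values.
    flat = itertools.chain.from_iterable(list_of_lists)
    counts = collections.Counter(flat)
    # Select the repeated values and wrap them in an outer list.
    return [[k for k, v in counts.items() if v >= 2]]


print(values_seen_twice([[0, 0], [0, 1], [0, 1], [5, 4]]))
# [[0, 1]]
print(values_seen_twice([[0, 0], [0, 0], [1, 1], [1, 1]]))
# [[0, 1]]
```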
tdelaney