Unique lists within list of lists if those lists have list as one of the elements

Question

If I have:

l = [['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3], ['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
     ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3], ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3]]

What would be the best way to get:

k = [['98765', 'Einstein, A', 'SFEN', 'SSW 540', 3], ['98764', 'Feynman, R', 'SFEN', 'SSW 564', 3]]

If I try:

uniqinstruct = set(map(tuple, l))

I get TypeError: unhashable type: 'list'. I don't want to remove all layers of nesting, because that would just combine everything into one list:

output = []

def reemovNestings(l):
    for i in l:
        if type(i) == list:
            reemovNestings(i)
        else:
            output.append(i)

reemovNestings(l)
print(sorted(set(output), key=output.index))

Output:

['98765', 'Einstein, A', 'SFEN', 'SSW 540', 3, '98764', 'Feynman, R', 'SSW 564']

If two instructors have the same count (3 in this case), only one 3 survives because the values pass through a set, so I can no longer regroup the flat list into fixed-size chunks. What would be a good way to preserve that last value?

Could we merge those list with the same number like `"98765"` or `"98764"`? — jizhihaoSAMA, Nov 20 '20 at 02:08
Does https://stackoverflow.com/questions/952914/how-to-make-a-flat-list-out-of-list-of-lists help? — Karl Knechtel, Nov 20 '20 at 02:10
@jizhihaoSAMA That number is a unique identifier, but could be repeated if an instructor teaches multiple courses (with different course names and counts), so would not be good to merge — Jim T, Nov 20 '20 at 03:05
@KarlKnechtel Looks like `flatten` from `django` and then collecting by every 5 elements could work as well — Jim T, Nov 20 '20 at 03:10

Muslimbek Abduganiev · Accepted Answer · 2020-11-20T02:25:23.140

2

Given that you know which layer you want to unwrap, you could just iterate through that layer. In your particular example, it's the second layer:

res = []
for inner_list in l:
    inner = []
    for el in inner_list:
        if type(el) == list:
            inner.extend(el)
        else:
            inner.append(el)
    if not (inner in res):
        res.append(inner)

Note that list.extend adds multiple values to the list.

if not (inner in res): res.append(inner) gives you unique items in the top layer. Thanks to @dmitryro for the tip.
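
On larger inputs, the `inner in res` test rescans the whole result list for every row. A minimal sketch of the same idea, assuming every flattened element is hashable (strings and numbers, as in the question), tracks the rows already seen in a set of tuples instead. The name `unique_flattened_rows` is made up for illustration.

```python
def unique_flattened_rows(rows):
    """Flatten one level of nesting in each row and drop duplicate rows,
    keeping first-seen order. Assumes flattened elements are hashable."""
    seen = set()
    result = []
    for row in rows:
        flat = []
        for el in row:
            if isinstance(el, list):
                flat.extend(el)   # unwrap the nested list in place
            else:
                flat.append(el)
        key = tuple(flat)         # tuples are hashable, lists are not
        if key not in seen:
            seen.add(key)
            result.append(flat)
    return result


l = [['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
     ['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
     ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3],
     ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3]]

print(unique_flattened_rows(l))
```

The result should match the list produced by the loop above; only the duplicate check changes.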

edited Nov 20 '20 at 02:25

answered Nov 20 '20 at 02:09

you probably want to check `if not (inner in res): res.append(inner)` as he's only interested in non-repeating items. – dmitryro Nov 20 '20 at 02:20
Yep, you are right. I got blind there and realized the need for non-repeating elements after I finished the code snippet. Updated the answer. – Muslimbek Abduganiev Nov 20 '20 at 02:28

score 1 · Answer 2 · answered Nov 20 '20 at 03:04

Use itertools.groupby to group identical rows, and flatten each group with a list comprehension. To preserve the order of the elements within each list, you can use dict.fromkeys().

If you don't mind a rather long list comprehension:

from itertools import groupby

l = [['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3], ['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
     ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3], ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3]]

s = [list(dict.fromkeys(e for i in item for j in i for e in (j if type(j) is list else [j])).keys()) for _, item in groupby(l)]
print(s)

Result:

[['98765', 'Einstein, A', 'SFEN', 'SSW 540', 3], ['98764', 'Feynman, R', 'SFEN', 'SSW 564', 3]]
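
Two caveats may be worth keeping in mind with this approach: `groupby` only merges duplicates that sit next to each other, and calling `dict.fromkeys` on the individual elements would also drop a value that legitimately repeats inside one row. A possible variant, sketched here under the assumption that all flattened elements are hashable, deduplicates whole rows by using tuples as dictionary keys. The helper name `flatten_row` and the reordered sample data are illustrative, not part of the answer.

```python
def flatten_row(row):
    """Unwrap one level of nesting and return the row as a hashable tuple."""
    return tuple(e for el in row for e in (el if isinstance(el, list) else [el]))


rows = [
    ['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
    ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3],
    ['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],  # duplicate, not adjacent
]

# dict.fromkeys keeps the first occurrence of each key in insertion order
# (guaranteed by the language from Python 3.7 on).
unique = [list(t) for t in dict.fromkeys(flatten_row(r) for r in rows)]
print(unique)
```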

In Python 3.6+, dictionaries keep insertion order. – jizhihaoSAMA Nov 20 '20 at 03:14

score 0 · Answer 3 · answered Nov 20 '20 at 02:17

Use numpy.unique on the flattened list.

np.unique(flattened, axis=0)

Canasta

Change `print(sorted(set(output), key=output.index))` to `print(np.unique(output, axis=0))` – Canasta Nov 20 '20 at 02:19

score 0 · Answer 4 · answered Nov 20 '20 at 02:28

Here's another possible solution.

First, flatten the original list:

def flatten(s):
    if s == []:
        return s
    if isinstance(s[0], list):
        return flatten(s[0]) + flatten(s[1:])
    return s[:1] + flatten(s[1:])

Your original input:

l = [['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3], 
     ['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
     ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3], 
     ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3]]

Now let's flatten it and iterate over the result:

final = [] # the resulting list

items = flatten(l) # first flatten the input

# take every 5 elements and add them to the final list, if not there yet.
for i in range(0, len(items), 5):
    if not (items[i:i+5] in final):
        final.append(items[i:i+5])

# This prints [['98765', 'Einstein, A', 'SFEN', 'SSW 540', 3],
#              ['98764', 'Feynman, R', 'SFEN', 'SSW 564', 3]]
print(final)
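
The recursive `flatten` above makes one nested call per element, so a long input can run into Python's default recursion limit, and the fixed chunk size of 5 silently misgroups the data if any record flattens to a different length. A rough sketch that addresses both points follows; it assumes every record has the same number of fields after flattening, and the names `iter_flat` and `unique_chunks` are hypothetical.

```python
def iter_flat(items):
    """Yield leaf values from arbitrarily nested lists, depth-first.
    Recursion depth follows the nesting depth, not the list length."""
    for item in items:
        if isinstance(item, list):
            yield from iter_flat(item)
        else:
            yield item


def unique_chunks(nested, size):
    """Flatten `nested`, cut it into chunks of `size`, and keep each
    distinct chunk once, in first-seen order."""
    flat = list(iter_flat(nested))
    if len(flat) % size:
        raise ValueError("flattened length is not a multiple of the chunk size")
    result = []
    for i in range(0, len(flat), size):
        chunk = flat[i:i + size]
        if chunk not in result:
            result.append(chunk)
    return result


l = [['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
     ['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
     ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3],
     ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3]]

print(unique_chunks(l, 5))
```

The length check only catches inputs whose total length is off; records of uneven length that happen to add up to a multiple of the chunk size would still be misgrouped.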
