2

I have two nested lists and I want to write code that runs through each sub-list in both lists and groups together any elements that appear together in both. The project I am working on has huge nested lists, so I created the following two lists to simplify the problem (I only have a year of experience in Python). If a function can be written that groups the elements in these two lists, I can then apply it to the actual project. This question may be similar to "Find items that appear together on multiple lists", but I could not understand the code in that question and, as I said, I'm relatively new to Python.

my_list = [['a', 'd', 'l'], ['c', 'e', 't'], ['q', 'x'], ['p', 'f', 'd', 'k']

sec_list = [['f', 'd', 'w', 'a'], ['c', 'e', 'u', 'h'], ['q', 'x', 'd', 'z'], ['p', 'k']]

# The output should be something like:

[['a', 'd'], ['c', 'e'], ['q', 'x'], ['p', 'k'], ['f', 'd']]

Thanks
ALI_000678
  • Probably a typo - should the last `['f', 'd']` be in output? – Andrej Kesely Apr 23 '21 at 17:16
  • Yes, because 'f' and 'd' appear together in the 4th sublist in my_list and in the 1st sublist of sec_list. Sorry if I didn't make it clear, I want the code to check ALL the sublists to see if any items appear together – ALI_000678 Apr 23 '21 at 18:10
  • Shouldn't be the output then: `[['d', 'a'], ['d'], ['e', 'c'], ['x', 'q'], ['d', 'f'], ['d'], ['k', 'p']]` ? Note the `['d']` – Andrej Kesely Apr 23 '21 at 18:15
  • Ah yes, although my intention is to remove that later anyway, because I want to find groups of values that appear together in 2 different lists, not single values. But yes, for the time being, `['d']` is a valid output. – ALI_000678 Apr 23 '21 at 18:20
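
One way to wrap the requirement clarified in these comments into a reusable function is sketched below. The name `common_groups` and its `min_size` parameter are made up for illustration and do not come from the thread. The function compares every sub-list of the first list with every sub-list of the second and keeps only intersections with at least `min_size` elements, so lone values such as `['d']` are dropped by default. Each group is sorted only to make the output reproducible, since sets have no guaranteed order.

```python
def common_groups(list_a, list_b, min_size=2):
    """Return the elements shared by any sub-list of list_a and any sub-list of list_b.

    Hypothetical helper, not taken from the thread. Groups with fewer
    than min_size elements are skipped.
    """
    # Convert the second list's sub-lists once and reuse them for every comparison.
    sets_b = [set(sub) for sub in list_b]
    groups = []
    for sub_a in list_a:
        set_a = set(sub_a)
        for set_b in sets_b:
            shared = set_a & set_b  # set intersection
            if len(shared) >= min_size:
                groups.append(sorted(shared))  # sorted for a stable order
    return groups


my_list = [['a', 'd', 'l'], ['c', 'e', 't'], ['q', 'x'], ['p', 'f', 'd', 'k']]
sec_list = [['f', 'd', 'w', 'a'], ['c', 'e', 'u', 'h'], ['q', 'x', 'd', 'z'], ['p', 'k']]

print(common_groups(my_list, sec_list))
# [['a', 'd'], ['c', 'e'], ['q', 'x'], ['d', 'f'], ['k', 'p']]
```

Calling `common_groups(my_list, sec_list, min_size=1)` would keep the single-element groups as well.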

3 Answers

2

You can use `zip` to iterate over two sequences in parallel and find the common elements of each pair with set intersection. Note that your code is missing a closing `]` in `my_list`.

my_list = [['a', 'd', 'l'], ['c', 'e', 't'], ['q', 'x'], ['p', 'f', 'd', 'k']]
sec_list = [['f', 'd', 'w', 'a'], ['c', 'e', 'u', 'h'], ['q', 'x', 'd', 'z'], ['p', 'k']]

# each item of my_list and sec_list is itself a list
# zip allows parallel iteration, so l1 and l2 are the inner lists at the same position
# sets are designed for tasks like finding common elements
# the & operator is Python's set intersection
matches = []
for l1, l2 in zip(my_list, sec_list):
    matches.append(list(set(l1) & set(l2)))

This can be consolidated into a list comprehension:

my_list = [['a', 'd', 'l'], ['c', 'e', 't'], ['q', 'x'], ['p', 'f', 'd', 'k']]
sec_list = [['f', 'd', 'w', 'a'], ['c', 'e', 'u', 'h'], ['q', 'x', 'd', 'z'], ['p', 'k']]
matches = [list(set(l1) & set(l2)) for l1, l2 in zip(my_list, sec_list)]
Eric Truett
  • Thanks, I was not aware of the zip command, which proved very useful here. Can you use zip to iterate over 3 sequences at the same time? – ALI_000678 Apr 23 '21 at 17:56
  • Slight issue, my fault for not making this clear in the question, but the code should also output `['f', 'd']`, because 'f' and 'd' appear together in the 4th sublist of my_list and the 1st sublist of sec_list. – ALI_000678 Apr 23 '21 at 18:13
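
On the question above about three sequences: `zip` accepts any number of iterables and stops as soon as the shortest one is exhausted. A minimal sketch follows, where `third` is a made-up list added purely for illustration and the results are sorted only to give a stable order:

```python
first = [['a', 'd', 'l'], ['c', 'e', 't']]
second = [['f', 'd', 'w', 'a'], ['c', 'e', 'u', 'h']]
third = [['d', 'a', 'z'], ['e', 'm']]  # hypothetical third list, not from the question

# zip yields one inner list from each sequence per step, matched by position
matches = [
    sorted(set(l1) & set(l2) & set(l3))
    for l1, l2, l3 in zip(first, second, third)
]
print(matches)  # [['a', 'd'], ['e']]
```

As with two sequences, this only compares sub-lists that sit at the same index.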
0

If you want to keep duplicates, you can use `collections.Counter`:

from collections import Counter

my_list = [['a', 'd', 'l'], ['c', 'e', 't'], ['q', 'x'], ['p', 'f', 'd', 'k', 'k']]

sec_list = [['f', 'd', 'w', 'a'], ['c', 'e', 'u', 'h'], ['q', 'x', 'd', 'z'], ['p', 'k', 'k']]

    
result = [list((Counter(a) & Counter(b)).elements()) for a in my_list for b in sec_list]
result = [x for x in result if len(x) > 0]
    
print(result)

Output:

[['a', 'd'], ['d'], ['c', 'e'], ['x', 'q'], ['f', 'd'], ['d'], ['k', 'k', 'p']]

Update based on comments.
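
To see why `Counter` keeps duplicates where a plain set would not, here is a small standalone sketch. The variable names `left`, `right` and `common` are illustrative only. The `&` operator on two counters keeps, for each element, the smaller of its two counts:

```python
from collections import Counter

left = Counter(['p', 'k', 'k'])
right = Counter(['k', 'k', 'k', 'p'])

# For each element, & keeps the minimum of the two counts.
common = left & right
print(sorted(common.elements()))  # ['k', 'k', 'p']

# With plain sets, the repeated 'k' collapses to a single entry.
print(sorted(set(['p', 'k', 'k']) & set(['k', 'k', 'k', 'p'])))  # ['k', 'p']
```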

Mady Daby
  • Similar to the answer above, it works great but does not give ALL the values. For example, the code doesn't output `['f', 'd']`. This is a valid output because 'f' and 'd' appear together in the 4th sublist of my_list and the first sublist of sec_list. Sorry I didn't make this clear in the question – ALI_000678 Apr 23 '21 at 18:17
  • Sorry, I thought that was a typo. I updated my answer based on the comments. – Mady Daby Apr 23 '21 at 19:21
0

Based on the comments:

my_list = [["a", "d", "l"], ["c", "e", "t"], ["q", "x"], ["p", "f", "d", "k"]]
sec_list = [
    ["f", "d", "w", "a"],
    ["c", "e", "u", "h"],
    ["q", "x", "d", "z"],
    ["p", "k"],
]

# to speed up, convert the sublists to sets
tmp1 = [set(i) for i in my_list]
tmp2 = [set(i) for i in sec_list]

out = []
for s1 in tmp1:
    for s2 in tmp2:
        m = s1 & s2
        if m:
            out.append(list(m))
print(out)

Prints:

[['d', 'a'], ['d'], ['e', 'c'], ['x', 'q'], ['d', 'f'], ['d'], ['k', 'p']]
Andrej Kesely