5

I need to compare two lists of lists and find the sublists that are present in one list but not in the other. The order of the elements within a sublist does not matter, i.e. ['a','b'] = ['b','a']. The two lists are

List_1 = [['T_1','T_2'],['T_2','T_3'],['T_1','T_3']]
List_2 = [['T_1','T_2'],['T_3','T_1']]

The output list should be

out_list = [['T_2','T_3']]
Pratik Dutta

6 Answers

4

For two element sublists, this should suffice:

[x for x in List_1 if x not in List_2 and x[::-1] not in List_2]

Code:

List_1 = [['T_1','T_2'],['T_2','T_3'],['T_1','T_3']]
List_2 = [['T_1','T_2'],['T_3','T_1']]

print([x for x in List_1 if x not in List_2 and x[::-1] not in List_2])
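The reversal check above only covers sublists of exactly two elements. As a rough sketch for sublists of any length, assuming the elements can be sorted and hashed, one option is to compare sorted tuples instead. The helper name `unordered_difference` and the three-element sublist in the sample data are made up for illustration:

```python
def unordered_difference(first, second):
    """Return the sublists of `first` that have no reordering in `second`."""
    # Canonical form: a sorted tuple, so ['a', 'b'] and ['b', 'a'] match.
    seen = {tuple(sorted(sub)) for sub in second}
    return [sub for sub in first if tuple(sorted(sub)) not in seen]


# Sample data with a three-element sublist, added for illustration only.
List_1 = [['T_1', 'T_2', 'T_3'], ['T_2', 'T_3'], ['T_1', 'T_3']]
List_2 = [['T_3', 'T_1', 'T_2'], ['T_3', 'T_1']]

result = unordered_difference(List_1, List_2)
print(result)  # [['T_2', 'T_3']]
```

This keeps the original sublists and their order from the first list, at the cost of sorting every sublist once.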
Austin
3

Here's a slightly messy functional solution that uses sets and tuples (sets because what you're trying to calculate is the symmetric difference, and tuples because, unlike lists, they're hashable and can be used as set elements):

List_1 = [['T_1','T_2'],['T_2','T_3'],['T_1','T_3']]
List_2 = [['T_1','T_2'],['T_3','T_1']]

f = lambda l : tuple(sorted(l))

out_list = list(map(list, set(map(f, List_1)).symmetric_difference(map(f, List_2))))

print(out_list)

Output:

[['T_2', 'T_3']]
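One thing worth checking: a symmetric difference reports sublists missing from either list, while the expected output in the question only shows what List_1 has and List_2 lacks. With the question's data the two coincide, because List_2 contains nothing extra. The sketch below adds a hypothetical extra sublist `['T_4', 'T_5']` to List_2 to show where they diverge; the helper name `canon` is also just for illustration:

```python
List_1 = [['T_1', 'T_2'], ['T_2', 'T_3'], ['T_1', 'T_3']]
# ['T_4', 'T_5'] is an extra sublist added here purely for illustration.
List_2 = [['T_1', 'T_2'], ['T_3', 'T_1'], ['T_4', 'T_5']]


def canon(sub):
    # Sorted tuple: hashable and independent of element order.
    return tuple(sorted(sub))


set_1 = set(map(canon, List_1))
set_2 = set(map(canon, List_2))

one_way = sorted(set_1 - set_2)    # only in List_1
both_ways = sorted(set_1 ^ set_2)  # in exactly one of the two lists

print(one_way)    # [('T_2', 'T_3')]
print(both_ways)  # [('T_2', 'T_3'), ('T_4', 'T_5')]
```

Which of the two is wanted depends on how "present in one list but not the other" is meant.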
DjaouadNM
3

I'd say frozensets are more appropriate for such a task:

fs2 = set(map(frozenset,List_2))
out = set(map(frozenset,List_1)).symmetric_difference(fs2)

print(out)                                 
# {frozenset({'T_2', 'T_3'})}

The advantage of using frozensets here is that they can be hashed, hence you can simply map both lists and take the set.symmetric_difference.


If you want a nested list from the output, you can simply do:

list(map(list, out))

Note that the elements of some sublists might come back in a different order, though given the task that should not be a problem.
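If a reproducible result is needed anyway (for example in tests), one possible approach is to sort at both levels. This is only a sketch and assumes the elements are mutually comparable; the name `stable` is made up for illustration:

```python
List_1 = [['T_1', 'T_2'], ['T_2', 'T_3'], ['T_1', 'T_3']]
List_2 = [['T_1', 'T_2'], ['T_3', 'T_1']]

out = set(map(frozenset, List_1)).symmetric_difference(map(frozenset, List_2))

# Sort the elements inside each sublist, then sort the sublists themselves,
# so the result does not depend on set iteration order.
stable = sorted(sorted(sub) for sub in out)

print(stable)  # [['T_2', 'T_3']]
```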

yatu
  • Hmm yeah i think ur right @MrGeek – yatu Aug 31 '19 at 11:02
  • The issue now is that you don't get a list output, so one way or another, it's gonna look like MrGeek's answer if you follow it through to its entirety – roganjosh Aug 31 '19 at 11:08
  • Yes, my whole point here is that using `sets` or `frozensets` might simply be more appropiate @roganjosh. So not sure what OP wants to do from here, but depending on the task this might be more useful – yatu Aug 31 '19 at 11:09
  • Note that if order is interchangeable as it seems, why not use sets @roganjosh but yeah, if the output has to be a nested lists totally agree – yatu Aug 31 '19 at 11:11
  • Wait, I'm not criticising the use of sets at all. I already posted on Austin's answer before this about using sets. My comment is merely that the approach you have posted may appear more elegant than MrGeek's, but it does not give the `list` output. I don't like the look of his answer, but it does actually get the job done and I think it's necessarily ugly because of the problem itself, not bad coding – roganjosh Aug 31 '19 at 11:14
  • 1
    Yes yes I know you aren't, just exposing my point of view, agreed :) @roganjosh – yatu Aug 31 '19 at 11:14
2

You can convert the sublists to sets for an order-independent equality comparison and use any() to keep only the items that don't exist in the second list:

List_1 = [['T_1', 'T_2'], ['T_2', 'T_3'], ['T_1', 'T_3']]
List_2 = [['T_1', 'T_2'], ['T_3', 'T_1']]
out_list = [l1 for l1 in List_1 if not any(set(l1) == set(l2) for l2 in List_2)]
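The any() version compares every sublist of List_1 against every sublist of List_2, so the work grows with the product of the two lengths. A possible variant, sketched here with a hypothetical `lookup` set, converts List_2 once and then does a hash lookup per sublist. Like the set() comparison above, it treats repeated elements inside a sublist as one:

```python
List_1 = [['T_1', 'T_2'], ['T_2', 'T_3'], ['T_1', 'T_3']]
List_2 = [['T_1', 'T_2'], ['T_3', 'T_1']]

# Build the lookup once; frozensets are hashable, plain sets are not.
lookup = {frozenset(l2) for l2 in List_2}

# Keep the original sublists (and their order) from List_1.
out_list = [l1 for l1 in List_1 if frozenset(l1) not in lookup]

print(out_list)  # [['T_2', 'T_3']]
```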

To get a better picture of the resource consumption and efficiency of each answer, I've run some tests. I hope they help in choosing the best one.

Results on the data from the question: (benchmark image not preserved)

Results on bigger data: (benchmark image not preserved)

Olvin Roght
  • what about testing with 1k-10k-1000k, I'm sure that the top will not be the same, with 3 elements the tests are not so relevant – kederrac Aug 31 '19 at 12:42
  • 1
    @rusu_ro1, of course. I've added tests with bigger data. – Olvin Roght Aug 31 '19 at 13:22
  • The timings are interesting but it's not a level playing field. You can see my comments under yatu's answer; it doesn't give back a list. – roganjosh Aug 31 '19 at 13:25
  • Also, Austin's answer flat-out is not extensible. I suspect you have learned something from the timing of your own answer, though :) – roganjosh Aug 31 '19 at 13:27
  • @roganjosh, I've updated functions to give back same results. And about my answer - It's not a surprise for me. Actually, it's one of reasons why I've added tests, cause my solution looks most compact but definitely not most efficient ;) – Olvin Roght Aug 31 '19 at 13:29
  • @roganjosh, also I have not updated my answer cause somehow it will be similar with other answers. – Olvin Roght Aug 31 '19 at 13:32
  • Well that is somewhat admirable. There's always a debate in science about how people not publishing things that don't work is harmful. You did publish. – roganjosh Aug 31 '19 at 13:35
  • @OlvinRoght I've provided a more compact variant of the `frozenset` solution. – GZ0 Aug 31 '19 at 14:22
  • @GZ0, that's great, but in fact there's not so much difference between yours and [this](https://stackoverflow.com/a/57737398/10824407) answer. So I expect result of tests to be really close. Anyway, you show another way to convert list to set using unpacking which can be useful. – Olvin Roght Aug 31 '19 at 14:45
  • @OlvinRoght The last statement of mine can make some difference when the `List_2` is large enough because it avoids converting `map(frozenset,List_2)` into a set. Meanwhile, unpacking iterables into a list is slightly faster than an explicit `list()` call. However, the difference is very small and only becomes significant when the output list size is very small. – GZ0 Aug 31 '19 at 14:50
  • @GZ0, I've added result of tests of your solution and (as expected) they're very close to solution I've pointed you before. – Olvin Roght Aug 31 '19 at 15:22
1

If you do not have duplicates in your lists, you can use:

output = set(frozenset(e) for e in List_1).symmetric_difference({frozenset(e) for e in List_2})

Output:

{frozenset({'T_2', 'T_3'})}

If you need a list of lists as output, you can use:

[list(o) for o in output]

Output:

[['T_2', 'T_3']]
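The "no duplicates" condition matters because a set collapses repeated sublists into one. If repeats should be counted, one option is a multiset built with collections.Counter. The following is a sketch under that assumption, with a repeated sublist added to the sample data for illustration:

```python
from collections import Counter

# ['T_2', 'T_3'] occurs twice in List_1 (once reversed); the repeat and the
# extra entry in List_2 are added here for illustration only.
List_1 = [['T_1', 'T_2'], ['T_2', 'T_3'], ['T_3', 'T_2'], ['T_1', 'T_3']]
List_2 = [['T_1', 'T_2'], ['T_3', 'T_1'], ['T_2', 'T_3']]

count_1 = Counter(frozenset(e) for e in List_1)
count_2 = Counter(frozenset(e) for e in List_2)

# Counter subtraction keeps only the entries with a positive remaining count.
remaining = count_1 - count_2

result = [sorted(fs) for fs in remaining.elements()]
print(result)  # [['T_2', 'T_3']]
```

A plain set difference would report nothing for this data, since every distinct sublist of List_1 also occurs in List_2.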
kederrac
0

Here is a one-liner variant of the frozenset solutions from @yatu and @rusu_ro1 for those who prefer a more concise syntax:

out = [*map(list,{*map(frozenset,List_1)}^{*map(frozenset,List_2)})]

If it is not required to convert the output into a nested list, just do

out = {*map(frozenset,List_1)}^{*map(frozenset,List_2)}

Meanwhile, one advantage of using the symmetric_difference method rather than the ^ operator is that the method accepts any iterable as its argument. This avoids converting map(frozenset,List_2) into a set first and therefore saves a little work.

out = [*map(list,{*map(frozenset,List_1)}.symmetric_difference(map(frozenset,List_2)))]
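To illustrate the difference between the method and the operator, here is a small sketch (the variable names `left`, `via_method` and `operator_ok` are made up for this example). The method consumes a lazy map object directly, while the ^ operator raises a TypeError unless both operands are sets or frozensets:

```python
List_1 = [['T_1', 'T_2'], ['T_2', 'T_3'], ['T_1', 'T_3']]
List_2 = [['T_1', 'T_2'], ['T_3', 'T_1']]

left = set(map(frozenset, List_1))

# The method form accepts any iterable, including a lazy map object.
via_method = left.symmetric_difference(map(frozenset, List_2))

# The operator form requires a set (or frozenset) on both sides.
try:
    left ^ map(frozenset, List_2)
    operator_ok = True
except TypeError:
    operator_ok = False

print(via_method == {frozenset({'T_2', 'T_3'})})  # True
print(operator_ok)  # False
```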
GZ0