I have a dict like this:

from collections import defaultdict

d = defaultdict(list, {'a': [['1', '2', 'A', 'cat'],
                             ['1', '3', 'A', 'dog']],
                       'b': [['1', '2', 'A', 'cat'],
                             ['1', '3', 'A', 'dog']],
                       'c': [['1', '2', 'A', 'cat'],
                             ['2', '2', 'A', 'snake'],
                             ['2', '2', 'A', 'bird']]})

I'd like to find which keys share each value, comparing the full inner list each time: every position in an inner list must match for it to count as a match between keys.

Since a and b share ['1', '3', 'A', 'dog'] and c doesn't, a/b: ['1', '3', 'A', 'dog'].

a, b, and c all share ['1', '2', 'A', 'cat'], so a/b/c: ['1', '2', 'A', 'cat'].

Only c has ['2', '2', 'A', 'snake'] and ['2', '2', 'A', 'bird'], so c: [['2', '2', 'A', 'snake'], ['2', '2', 'A', 'bird']].

The preferred output is a dictionary combining the above, something like:

combine_dict = {'a/b': ['1', '3', 'A', 'dog'], 'a/b/c': ['1', '2', 'A', 'cat'], 'c': [['2', '2', 'A', 'snake'], ['2', '2', 'A', 'bird']]}
Liquidity

1 Answer

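You can use collections.defaultdict: first map each inner list (converted to a tuple so it can be a dictionary key) to the keys that contain it, then group those lists under the joined key names: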

import collections

d = {'a': [['1', '2', 'A', 'cat'], ['1', '3', 'A', 'dog']],
     'b': [['1', '2', 'A', 'cat'], ['1', '3', 'A', 'dog']],
     'c': [['1', '2', 'A', 'cat'], ['2', '2', 'A', 'snake'], ['2', '2', 'A', 'bird']]}

# Map each inner list (as a hashable tuple) to the keys that contain it.
new_d = collections.defaultdict(list)
for a, b in d.items():
    for i in b:
        new_d[tuple(i)].append(a)

# Group the inner lists by the '/'-joined names of the keys that share them.
new_r = collections.defaultdict(list)
for a, b in new_d.items():
    new_r['/'.join(b)].append(list(a))

# Unwrap groups holding a single inner list; keep larger groups as lists of lists.
new_result = {a: b[0] if len(b) == 1 else b for a, b in new_r.items()}

Output:

{'a/b/c': ['1', '2', 'A', 'cat'], 'a/b': ['1', '3', 'A', 'dog'], 'c': [['2', '2', 'A', 'snake'], ['2', '2', 'A', 'bird']]}
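One thing to be aware of: the final comprehension gives a flat list when a key group has one entry and a list of lists when it has several, so callers have to check which shape they got. A possible variant, offered only as a sketch, wraps the same two passes in a function and always returns a list of lists. The name `group_shared_values` is made up for illustration, and the duplicate check is an added assumption (it guards against a key that lists the same inner list twice, which the example data never does).

```python
from collections import defaultdict


def group_shared_values(d):
    """Group inner lists by the '/'-joined keys that contain them.

    Hypothetical helper: every value in the result is a list of lists,
    even when a group holds only one inner list.
    """
    # Pass 1: inner list (as a tuple) -> keys that contain it, in dict order.
    owners = defaultdict(list)
    for key, rows in d.items():
        for row in rows:
            row_key = tuple(row)
            if key not in owners[row_key]:  # assumption: ignore repeated rows
                owners[row_key].append(key)

    # Pass 2: joined key names -> all inner lists shared by exactly those keys.
    grouped = defaultdict(list)
    for row_key, keys in owners.items():
        grouped['/'.join(keys)].append(list(row_key))
    return dict(grouped)


d = {'a': [['1', '2', 'A', 'cat'], ['1', '3', 'A', 'dog']],
     'b': [['1', '2', 'A', 'cat'], ['1', '3', 'A', 'dog']],
     'c': [['1', '2', 'A', 'cat'], ['2', '2', 'A', 'snake'], ['2', '2', 'A', 'bird']]}

result = group_shared_values(d)
print(result['a/b'])  # [['1', '3', 'A', 'dog']]
print(result['c'])    # [['2', '2', 'A', 'snake'], ['2', '2', 'A', 'bird']]
```

The key names are joined in the insertion order of `d`, so the same data inserted in a different order would give, for example, 'b/a' instead of 'a/b'.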
Ajax1234
  • Could you explain `new_result = {'/'.join(b):list(a) for a, b in new_d.items()}` a little more? I tried to write it out as `for a, b in combination_dict.items(): combination_list = '/'.join(b):list(a)` to understand what's going on but that code has invalid syntax. – Liquidity Jun 18 '19 at 02:30
  • @Liquidity `{'/'.join(b):list(a) for a, b in new_d.items()}` is a [dictionary comprehension](https://stackoverflow.com/questions/14507591/python-dictionary-comprehension). Please see my recent edit, as I added code for the creation of `new_result` in a more readable way – Ajax1234 Jun 18 '19 at 02:34
  • Thank you for the clarification! I think I've found a problem - if there are more than two values, this only saves the last. I updated my example so that `c` should output `'c': ['2', '2', 'A', 'snake'], ['2', '2', 'A', 'bird']`, but using your code only outputs `'c': ['2', '2', 'A', 'bird']`. You answered the original example fine, I should've been clearer that I wanted all values saved even if more than one was shared or not shared! Should I open that as a new question? – Liquidity Jun 18 '19 at 02:39
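For readers following the comment thread, here is a rough sketch of how the comprehension from the comments unrolls into a loop, and why it keeps only the last value for `c`. The `key: value` form is only valid inside the braces of a dictionary display, which is why it gave a syntax error as a standalone statement; in a loop it becomes an assignment to `result[key]`. `new_d` below is written out by hand to match the intermediate mapping built in the answer; `comp`, `loop`, and `kept_all` are names chosen only for this illustration.

```python
# The intermediate mapping from the answer: inner list (as tuple) -> keys containing it.
new_d = {
    ('1', '2', 'A', 'cat'): ['a', 'b', 'c'],
    ('1', '3', 'A', 'dog'): ['a', 'b'],
    ('2', '2', 'A', 'snake'): ['c'],
    ('2', '2', 'A', 'bird'): ['c'],
}

# The dictionary comprehension discussed in the comments.
comp = {'/'.join(b): list(a) for a, b in new_d.items()}

# The same thing written as an explicit loop.
loop = {}
for a, b in new_d.items():
    # Plain assignment: a second entry with the same joined key
    # replaces the first, so 'snake' is overwritten by 'bird'.
    loop['/'.join(b)] = list(a)

print(comp == loop)  # True
print(comp['c'])     # ['2', '2', 'A', 'bird']

# Appending instead of assigning keeps every value for a repeated key.
kept_all = {}
for a, b in new_d.items():
    kept_all.setdefault('/'.join(b), []).append(list(a))

print(kept_all['c'])  # [['2', '2', 'A', 'snake'], ['2', '2', 'A', 'bird']]
```

The appending loop does the same job as the `new_r` defaultdict step in the answer; `setdefault` is used here only to avoid a second import.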