-1

I have an arbitrary Python dictionary x in which the value for each key is itself a list. Here is an example:

x = {"first_name": ["Habib", "Wen-lao"], "second_name": ["Khan", "Chen"]}

Given x, I would like to write a method that computes a list of dictionaries such that each dictionary has the same keys as x, but each value is a single element of the corresponding list, covering every combination of elements.

Additionally, I would like to see all combinations that omit one or more of the keys entirely.

So in this case, the result should be:

[{"first_name": "Habib", "second_name": "Khan"},
 {"first_name": "Habib", "second_name": "Chen"},
 {"first_name": "Habib"},
 {"first_name": "Wen-lao", "second_name": "Khan"},
 {"first_name": "Wen-lao", "second_name": "Chen"},
 {"first_name": "Wen-lao"},
 {"second_name": "Khan"},
 {"second_name": "Chen"},
 {}]

How can I do it? The dictionary x may have any arbitrary number of keys with arbitrary names. The ordering of the resulting list is irrelevant to me.

Currently I have this:

>>> from collections import OrderedDict
>>> from itertools import product
>>>
>>> def looper(in_dict):
...     order_of_keys = in_dict.keys()
...     list_of_tuples = [(key, in_dict[key]) for key in order_of_keys]
...     ordered_dict = OrderedDict(list_of_tuples)
...     return [dict(zip(ordered_dict.keys(), t)) for t in product(*ordered_dict.values())]
...
>>> x = {"first_name": ["Habib", "Wen-lao"], "second_name": ["Khan", "Chen"]}
>>> print(looper(in_dict=x))
[{'first_name': 'Habib', 'second_name': 'Khan'}, 
 {'first_name': 'Habib', 'second_name': 'Chen'}, 
 {'first_name': 'Wen-lao', 'second_name': 'Khan'}, 
 {'first_name': 'Wen-lao', 'second_name': 'Chen'}]

But it doesn't show me the combinations with the keys omitted. How can I do that?

EDIT: This question is related, but is substantially different. There I wanted to know how to create a simple combination of all lists. Here I want to know how to also include combinations with the keys omitted.

Saqib Ali
  • Didn't you post this question a few hours ago? FWIW, the system _will_ penalize you if you keep deleting & re-posting your questions. Did you look at the itertools powerset recipe I linked you to last time? – PM 2Ring Nov 08 '17 at 16:29
  • Possible duplicate of [How can I write a python method to compute all combinations of a dict of lists?](https://stackoverflow.com/questions/47175020/how-can-i-write-a-python-method-to-compute-all-combinations-of-a-dict-of-lists) – Ajax1234 Nov 08 '17 at 16:36
  • Suggesting that I look at a library is not a solution to my question. Stack Overflow told me to create a new question explaining why my question was different, so I'm doing that. – Saqib Ali Nov 08 '17 at 16:36
  • Ajax, this question is substantially different than that question. See that part I bolded. – Saqib Ali Nov 08 '17 at 16:37
  • If you don't care about the ordering, I don't understand why you are using `OrderedDict`. I admit that this task is a little tricky, it's unusual to want to put the combined first+last names with the separate first names and last names. – PM 2Ring Nov 08 '17 at 17:05
  • @PM2Ring: Not only that, but converting a dict to an OrderedDict is useless: the order has already been lost. – Eric Duminil Nov 08 '17 at 17:26

3 Answers

2

itertools has no notion of an "epsilon" (empty) element in your lists, so you will need to add one yourself. To each list, append a marker element such as None that stands for "key omitted"; in your comprehension expressions, add an if clause that skips any such element rather than including it in the output. Note that a plain dict(zip(...)) will no longer serve, because you want to generate entries of different lengths.

Another possibility is to zip the lists, including the epsilon elements, but write an expression to exclude those elements from the zipped list's members.

Can you take it from there?
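
One way to take it from there, as a minimal sketch rather than a definitive implementation: the names EPSILON and combos_with_omissions are hypothetical, and the sketch assumes every value in x is a list (or other iterable). It uses a unique sentinel object instead of None, so real values such as None, 0 or "" are not mistaken for the marker.

```python
from itertools import product

# Hypothetical sentinel: a unique object that cannot collide with real values,
# unlike None, 0 or "".
EPSILON = object()

def combos_with_omissions(in_dict):
    """Return every combination of values, allowing any key to be left out."""
    keys = list(in_dict)
    # Append the sentinel to each value list to stand for "key omitted".
    pools = [list(in_dict[k]) + [EPSILON] for k in keys]
    return [
        # Keep only the pairs whose value is not the sentinel.
        {k: v for k, v in zip(keys, choice) if v is not EPSILON}
        for choice in product(*pools)
    ]

x = {"first_name": ["Habib", "Wen-lao"], "second_name": ["Khan", "Chen"]}
result = combos_with_omissions(x)
print(len(result))  # 9, i.e. (2 + 1) * (2 + 1)
```

Each key contributes its list length plus one choices, so the result always has the product of (len(values) + 1) entries, including the empty dict.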

Prune
  • Indeed. Now that I wrote an answer, I think that I understand your explanation and that your description fits my code. – Eric Duminil Nov 08 '17 at 17:30
2

We first combine each name in the value lists of your dictionary with its key, saving those results to the y list. Next we create a list z of the desired pairs using itertools.product. Then we extend z with the individual names from y. Finally we add an empty dict to z.

from itertools import product

x = {"first_name": ["Habib", "Wen-lao"], "second_name": ["Khan", "Chen"]}
y = [[(k, u) for u in v] for k, v in x.items()]
z = [dict(t) for t in product(*y)]
z.extend({k: v} for u in y for k, v in u)
z.append({})

for row in z:
    print(row)

output

{'first_name': 'Habib', 'second_name': 'Khan'}
{'first_name': 'Habib', 'second_name': 'Chen'}
{'first_name': 'Wen-lao', 'second_name': 'Khan'}
{'first_name': 'Wen-lao', 'second_name': 'Chen'}
{'first_name': 'Habib'}
{'first_name': 'Wen-lao'}
{'second_name': 'Khan'}
{'second_name': 'Chen'}
{}

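This code also runs if the x dict contains more than 2 keys, and if the value lists in x have more than 2 items. Note, however, that it only produces the full combinations, the single-key dicts, and the empty dict; with 3 or more keys it does not produce the dicts that omit just some of the keys.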
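
A stricter reading of the question allows any subset of the keys to be omitted. A minimal sketch of that variant follows; it is not the code above but a different technique, iterating over key subsets with itertools.combinations (the idea behind the itertools powerset recipe) and taking the product of the value lists for each subset. The function name all_partial_combos is hypothetical.

```python
from itertools import combinations, product

def all_partial_combos(in_dict):
    """Return one dict per combination of values, for every subset of keys."""
    keys = list(in_dict)
    result = []
    # r is the number of keys kept; r == 0 yields the single empty dict.
    for r in range(len(keys) + 1):
        for subset in combinations(keys, r):
            # Cartesian product of the value lists of the keys that are kept.
            for values in product(*(in_dict[k] for k in subset)):
                result.append(dict(zip(subset, values)))
    return result

x3 = {
    "first_name": ["Habib", "Wen-lao"],
    "second_name": ["Khan", "Chen"],
    "middle_name": ["K."],
}
combos = all_partial_combos(x3)
print(len(combos))  # 18, i.e. (2 + 1) * (2 + 1) * (1 + 1)
```

Unlike a None marker, this approach never needs to filter values afterwards, so it also works when the lists themselves contain None or other falsy values.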

PM 2Ring
  • It's my turn to comment ;) Your method doesn't work anymore with 3 keys. There are 18 combinations with `{"first_name": ["Habib", "Wen-lao"], "second_name": ["Khan", "Chen"], "middle_name" : ['K.']}`, your method only outputs 10. – Eric Duminil Nov 08 '17 at 17:27
  • @EricDuminil I see what you mean. But I guess that depends on how you interpret the question. ;) – PM 2Ring Nov 08 '17 at 17:33
  • I don't think so : `I would like to see all combinations that omit the keys entirely. ` and `The dictionary x may have any arbitrary number of keys with arbitrary names.`. ;) – Eric Duminil Nov 08 '17 at 17:40
  • @EricDuminil Fair enough. But I think I'll let my answer stand, just in case. :) Besides, if I change it, it'll just be a duplicate of yours. – PM 2Ring Nov 08 '17 at 17:43
2

I tried to make as few changes as possible. Since you start with an unordered dict, converting it to an OrderedDict gains you nothing. You can be sure that dict.keys() and dict.values() are in the same order, though.
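
The ordering guarantee can be checked directly. A small sketch, assuming the dict is not modified between the two calls (the variable name pairs is only for illustration):

```python
x = {"first_name": ["Habib", "Wen-lao"], "second_name": ["Khan", "Chen"]}

# keys() and values() iterate in corresponding order, so zipping them
# reproduces exactly the (key, value) pairs that items() yields.
pairs = list(zip(x.keys(), x.values()))
assert pairs == list(x.items())
```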

You just need to add None to each values list and remove the pairs for which the value is None:

from itertools import product

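def looper(in_dict):
    keys = in_dict.keys()
    # Append None to each list as a marker meaning "omit this key".
    values = [l + [None] for l in in_dict.values()]
    # Test against None explicitly so falsy values such as "" or 0 are kept.
    return [{k: v for k, v in zip(keys, t) if v is not None} for t in product(*values)]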

x = {"first_name": ["Habib", "Wen-lao"], "second_name": ["Khan", "Chen"]} 
for d in looper(x):
    print(d)

It outputs:

{'first_name': 'Habib', 'second_name': 'Khan'}
{'first_name': 'Habib', 'second_name': 'Chen'}
{'first_name': 'Habib'}
{'first_name': 'Wen-lao', 'second_name': 'Khan'}
{'first_name': 'Wen-lao', 'second_name': 'Chen'}
{'first_name': 'Wen-lao'}
{'second_name': 'Khan'}
{'second_name': 'Chen'}
{}

Eric Duminil