Specially i am unsure about...
Updating Python dicts inside comprehensions is a bit awkward, because dicts are mutable and dict.update() modifies the dict in place and returns None. In "Why doesn't a python dict.update() return the object?" the best answer suggests your current solution. Personally, I'd probably go with a regular for-loop here to keep the code legible.
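For illustration, here is a minimal sketch of the for-loop variant, assuming the goal is to build one dict per set of overrides on top of a shared template. The names build_dicts, template and overrides are hypothetical and not taken from your code:

```python
def build_dicts(template, overrides):
    """Return one new dict per override, each based on a copy of template."""
    results = []
    for override in overrides:
        item = dict(template)  # shallow copy, so the template itself is never mutated
        item.update(override)  # update() returns None, so it gets its own statement
        results.append(item)
    return results


template = {"x": "asdf"}
print(build_dicts(template, [{"a": 1}, {"a": 2}]))
# [{'x': 'asdf', 'a': 1}, {'x': 'asdf', 'a': 2}]
```

It is a few lines longer than a comprehension, but the copy and the in-place update are each visible as a separate step.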
Is this correct way to prepare testing data?
- Usually in unit tests you test both edge cases and regular cases (without repeating yourself, though). You usually want to split the tests so that each has its own name explaining why it's there, and possibly some other data that helps an outsider understand why it's important that this scenario works correctly. Putting all scenarios in one list and running the test for each of them without any additional context (at least a test case name) makes it harder for the reader to distinguish between the cases and judge whether they are all really needed.
- Putting each of the scenarios in a separate test case may seem a bit tedious at times, but if any of the tests fails, you can immediately tell which part of the software is failing. If you feel like you are writing way too many unit tests, then perhaps some of them cover the same kinds of scenarios.
- When dealing with unit tests, performance is rarely the top priority. What usually counts more is keeping the number of tests minimal yet sufficient to ensure the software works correctly. The other priority is making the tests easy to understand. See below for another take on this (not necessarily more performant, yet hopefully more legible).
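To make the points above concrete, here is a rough sketch using the standard unittest module. The function clamp and all test names are made up for illustration, so treat this as one possible layout rather than a prescription:

```python
import unittest


def clamp(value, low, high):
    """Hypothetical function under test: limit value to the range [low, high]."""
    return max(low, min(value, high))


class ClampTest(unittest.TestCase):
    def test_value_inside_range_is_unchanged(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    def test_value_below_range_is_raised_to_lower_bound(self):
        self.assertEqual(clamp(-3, 0, 10), 0)

    def test_value_on_upper_bound_is_kept(self):
        # Edge case: the bound itself belongs to the range.
        self.assertEqual(clamp(10, 0, 10), 10)

    def test_generated_inputs_stay_in_range(self):
        # If the cases must live in one list, subTest still labels each of them.
        for value in (-100, 0, 7, 10, 100):
            with self.subTest(value=value):
                self.assertTrue(0 <= clamp(value, 0, 10) <= 10)
```

Running it with python -m unittest reports a failing scenario by its method name, and for the looped test also by the subTest parameters, so the reader does not have to guess which input broke.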
Alternative solution
You could use itertools.product in order to simplify your code. The template parameter can be removed (since you can pass the template variable names and their possible values in **kwargs):
from pprint import pprint
import itertools


def _f(**kwargs):
    keys, values = zip(*(kwargs.items()))  # 1.
    subsets = [subset for subset in itertools.product(*values)]  # 2.
    return [
        {key: value for key, value in zip(keys, subset)} for subset in subsets
    ]  # 3.


r = _f(a=[1, 2], b=[11, 22], x=['asdf'])
pprint(r)
Now what's happening in each of these steps:
Step 1.
You split the keyword dict into keys and values. This matters because it fixes the order in which you iterate through the arguments, so each key stays aligned with its list of values. The keys and values look like this at this point:
keys = ('a', 'b', 'x')
values = ([1, 2], [11, 22], ['asdf'])
Step 2.
You compute the cartesian product of the values, which means you get all the possible combinations of taking one value from each of the values lists. The result of this operation is as follows:
subsets = [(1, 11, 'asdf'), (1, 22, 'asdf'), (2, 11, 'asdf'), (2, 22, 'asdf')]
Step 3.
Now you need to map each of the keys to its corresponding value in each of the subsets, hence the list and dict comprehensions. The result should be exactly what you computed using your previous method:
[{'a': 1, 'b': 11, 'x': 'asdf'},
{'a': 1, 'b': 22, 'x': 'asdf'},
{'a': 2, 'b': 11, 'x': 'asdf'},
{'a': 2, 'b': 22, 'x': 'asdf'}]
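If the three steps feel like too much ceremony, they can be folded together. The following is only a sketch of an equivalent variant; the name combinations_as_dicts is hypothetical, and it relies on keyword arguments keeping the order in which they were passed:

```python
import itertools


def combinations_as_dicts(**kwargs):
    """Hypothetical compact variant of _f: one dict per combination of values."""
    keys = list(kwargs)  # keyword arguments keep the order in which they were passed
    return [
        dict(zip(keys, combo))  # pair every key with its value in this combination
        for combo in itertools.product(*kwargs.values())
    ]


result = combinations_as_dicts(a=[1, 2], b=[11, 22], x=["asdf"])
assert result[0] == {"a": 1, "b": 11, "x": "asdf"}
assert len(result) == 2 * 2 * 1  # one dict per element of the cartesian product
```

One difference in behaviour: called without any arguments, this variant returns a list holding a single empty dict, whereas _f() fails while unpacking the empty zip in step 1.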