
I want to put several data files through two modules to process them, using every combination of several settings each on several parameters for each module. The obvious way to do this is with a nested for loop, but by the time you get to 7+ nested for loops, no. I want to make this more elegant than that.

I've already read several very similar questions, but while this one reveals that I probably want to use itertools, it only iterates through number sequences, while I want to iterate through lists of strings that are contained as values within dictionaries; this other one reveals that what I want is called a Cartesian product, but not how to make that out of dictionary values; and while this one combines dictionaries of lists in a Cartesian product, I want the output to be a list of lists as in the previous linked question, not a list of dictionaries.

In:

video = ["It's Friday.mp4",'Hot Koolaid.mov','The Water Buffalo Song.mp4']
CC = {'size':['6','10','14'],'font':['Courier New'],'color':['black','white'],'language':['English']}
Noise = {'CRT':['speckles','rising stripes','no signal'],'sound':['white','crackle']}

Out:

[["It's Friday.mp4",'6','Courier New','black','English','speckles','white'],
 ['Hot Koolaid.mov','6','Courier New','black','English','speckles','white'],
 ...
 ['The Water Buffalo Song.mp4','14','Courier New','white','English','no signal','crackle']]

I'm pretty sure I want to use itertools, and that what I want to make is a Cartesian product of lists. I think the hardest thing at the moment is to draw those lists out of the dictionaries and put the combinations of their elements into lists.

_________Edited:____________

In the process of checking out the answer I then accepted, I found that it's important (for my purposes here) for all parameters to be in lists, even if there's only one value being considered; a string without square brackets will be iterated over one character at a time.
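A minimal sketch of that pitfall, using only itertools.product and a shortened `sizes` list chosen for illustration: a bare string is itself an iterable of characters, so product treats every character as a separate option, while a one-element list is treated as a single option.

```python
from itertools import product

sizes = ['6', '10']

# Bare string: product sees 11 one-character options ('C', 'o', 'u', ...).
bare = list(product(sizes, 'Courier New'))

# One-element list: product sees a single option, the whole string.
wrapped = list(product(sizes, ['Courier New']))

print(len(bare))  # 22
print(wrapped)    # [('6', 'Courier New'), ('10', 'Courier New')]
```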

The ugly nested for loop looks like:

for vid in video:
    for siz in CC['size']:
        for fon in CC['font']:
            for col in CC['color']:
                for lan in CC['language']:
                    for crt in Noise['CRT']:
                        for sou in Noise['sound']:
                            some_function(vid,siz,fon,col,lan,crt,sou)
Post169
  • With your example input how do you know ['black', 'white'] means you have two 'black' items and one 'white'? – CMMCD Aug 12 '19 at 20:00
  • @CMMCD It doesn't mean two black and one white; there are many, many lists represented by the ellipsis. The Cartesian product might also be called the factorial; we're making a list of every possible combination of the open options. – Post169 Aug 12 '19 at 21:19

2 Answers


The dictionaries are small enough that it's simplest to just hard-code the seven arguments to itertools.product: one "independent" list, four lists from CC, and two lists from Noise.

from itertools import product
result = list(product(
                  video,
                  CC['size'],
                  CC['font'],
                  CC['color'],
                  CC['language'],
                  Noise['CRT'],
                  Noise['sound']
         ))

You can simplify it a little using operator.itemgetter, which eliminates repeated mentions of the two dictionaries.

from operator import itemgetter
result = list(product(
                  video,
                  *itemgetter('size', 'font', 'color', 'language')(CC),
                  *itemgetter('CRT', 'sound')(Noise)
         ))

You can shorten it further if you are certain about the order in which the values of the dictionaries will be produced:

result = list(product(video, *CC.values(), *Noise.values()))
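Note that product yields tuples, while the question asks for a list of lists. A minimal sketch of the conversion, reusing the question's data and assuming Python 3.7+ so that `.values()` follows insertion order; the name `rows` is only illustrative.

```python
from itertools import product

video = ["It's Friday.mp4", 'Hot Koolaid.mov', 'The Water Buffalo Song.mp4']
CC = {'size': ['6', '10', '14'], 'font': ['Courier New'],
      'color': ['black', 'white'], 'language': ['English']}
Noise = {'CRT': ['speckles', 'rising stripes', 'no signal'],
         'sound': ['white', 'crackle']}

# Convert each tuple produced by product into a list.
rows = [list(combo) for combo in product(video, *CC.values(), *Noise.values())]

print(len(rows))  # 108, i.e. 3 * (3*1*2*1) * (3*2)
print(rows[0])    # ["It's Friday.mp4", '6', 'Courier New', 'black', 'English', 'speckles', 'white']
```

The rightmost argument varies fastest, so the video changes only after every combination of settings for the previous video has been produced.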
chepner
  • last option is a reason why to love python – Olvin Roght Aug 12 '19 at 20:06
  • @chepner You say use of the third, most succinct option is contingent on being certain of the order that the dictionaries will give me their values in. I have read in the past that there is no certainty of the order, yet whenever I print a dictionary (or part of one) I always/almost always see its elements in the order I put them in. Do you know what that is contingent on? – Post169 Aug 13 '19 at 14:02
  • Prior to Python 3.6, the iteration order was arbitrary. In Python 3.6, the order is determined by the order in which keys are added to the `dict`, but only as an implementation detail in CPython. That implementation detail was made a language requirement in Python 3.7. – chepner Aug 13 '19 at 14:05
  • So, basically, because I'm using Python 3.7, the elegant form is an option for me? – Post169 Aug 13 '19 at 14:07
  • Assuming you know the order in which the keys were added and that order is the one you want, yes. – chepner Aug 13 '19 at 14:07
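The ordering point in these comments can be checked directly. A small sketch, assuming Python 3.7+; the dictionary is deliberately built with its keys in the opposite order from the question, and `wanted` is a hypothetical name for an explicit key list.

```python
from itertools import product
from operator import itemgetter

# Same data as the question's Noise, but with 'sound' inserted first.
Noise = {'sound': ['white', 'crackle'],
         'CRT': ['speckles', 'rising stripes', 'no signal']}

# .values() follows insertion order, so 'sound' becomes the first column.
by_insertion = list(product(*Noise.values()))

# Naming the keys explicitly fixes the column order regardless of insertion order.
wanted = ('CRT', 'sound')
by_key = list(product(*itemgetter(*wanted)(Noise)))

print(by_insertion[0])  # ('white', 'speckles')
print(by_key[0])        # ('speckles', 'white')
```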

If you can adjust your data slightly so that each key has either a single value (like {'language': 'English'}) or a list of values corresponding to each video (like {'color':['black','black','white']}), you can also make a nice table of data using pandas. An example would be something like:

import pandas as pd

video = ["It's Friday.mp4",'Hot Koolaid.mov','The Water Buffalo Song.mp4']
CC = {'size':['6','10','14'],'font':'Courier New','color':['black','black','white'],'language':'English'}
Noise = {'CRT':['speckles','rising stripes','no signal'],'sound':['white','white','crackle']}

video_df = pd.DataFrame()
video_df['video'] = video
for key in CC.keys():
    video_df[key] = CC[key]
for key in Noise.keys():
    video_df[key] = Noise[key]

video_df.values.tolist()

This would yield a list of lists that looks like:

[["It's Friday.mp4",
  '6',
  'Courier New',
  'black',
  'English',
  'speckles',
  'white'],
 ['Hot Koolaid.mov',
  '10',
  'Courier New',
  'black',
  'English',
  'rising stripes',
  'white'],
 ['The Water Buffalo Song.mp4',
  '14',
  'Courier New',
  'white',
  'English',
  'no signal',
  'crackle']]
Z. Shaffer