Avoid duplicates while generating a list of items from a set of patterns

Question

I'm trying to create a list of items from specific patterns. So far my design looks like this:

patterns = [
   '{column1}{column2}',
   '{column1}@{column3}',
   '{column2}#{column4}',
   '{column3}!@#'
]

result = []
for c1 in possible_column1:
    for c2 in possible_column2:
        for c3 in possible_column3:
            for c4 in possible_column4:
                data = {
                    'column1': c1,
                    'column2': c2,
                    'column3': c3,
                    'column4': c4,
                }
                for pattern in patterns:
                    result.append(pattern.format(**data))

This design has several problems:

It creates duplicate values, so I have to call list(set(result)) to deduplicate the list.
It's slow.
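
A minimal sketch to make the duplication concrete, assuming two sample values per column (the values and the `columns` dict are illustrative, not real data). It does the same work as the nested loops and compares the total number of generated strings with the number of distinct ones:

```python
from itertools import product

# Hypothetical sample data: two values per column.
columns = {
    'column1': ['1', '2'],
    'column2': ['3', '4'],
    'column3': ['5', '6'],
    'column4': ['7', '8'],
}
patterns = [
    '{column1}{column2}',
    '{column1}@{column3}',
    '{column2}#{column4}',
    '{column3}!@#',
]

# Same work as the four nested loops: every pattern is formatted for every
# combination of all four columns, even the columns it never uses.
result = [
    pattern.format(**dict(zip(columns, combo)))
    for combo in product(*columns.values())
    for pattern in patterns
]

total, unique = len(result), len(set(result))
print(total, unique)
```

Each pattern only reads one or two columns, so varying the other columns regenerates the same string again and again.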

What's the common way to write such an algorithm?

The patterns list varies and will change frequently, and new types of columns may be added as well.

P.S.
In my opinion, this differs from permutations.

@daisy If it is not a permutation, I do not understand your question. — Ma0, Jul 13 '17 at 11:44
You can use `itertools.product()` instead of the 4 nested loops. I agree that it is not a permutation, it is a cartesian product. — Gribouillis, Jul 13 '17 at 11:44
The fact that the pattern is *funky* does not make it **not** a permutation — Ma0, Jul 13 '17 at 11:44
I wanna see if I understand, you want to always get the values that are inside `{}` no matter the separator? If that's what you're looking for it can be done with regex (`re` package), and then you can still `list(set(result))` at the end to get unique values — Ofer Sadan, Jul 13 '17 at 11:47

score 2 · Accepted Answer · answered Jul 13 '17 at 11:57

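You could use another data structure (a dict) to keep the patterns, mapping each pattern to only the columns it uses. The possible_column lists must already be defined when this dict is created; sample values are given in the next snippet: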

patterns = {
   '{0}{1}':  (possible_column1, possible_column2),
   '{0}@{1}': (possible_column1, possible_column3),
   '{0}#{1}': (possible_column2, possible_column4),
   '{0}!@#':  (possible_column3, )
}

With that dictionary you can use itertools.product on the "values" of that dictionary:

from itertools import product

# Just some data for the possible columns...
possible_column1 = list('12')
possible_column2 = list('34')
possible_column3 = list('56')
possible_column4 = list('78')

result = []
for pattern, cols in patterns.items():
    for prod in product(*cols):
        result.append(pattern.format(*prod))

# or if you like it shorter:
# result = [pattern.format(*prod) for pattern, cols in patterns.items() for prod in product(*cols)]

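That way each pattern is combined only with the columns it actually uses, so you don't create the duplicated entries (assuming the values within each column are unique and no two patterns happen to produce the same string).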
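
If you would rather keep the named placeholders from the question, so that a new pattern does not need a hand-written tuple of columns, one possible variant is to read the field names out of each pattern with `string.Formatter().parse` and build the product over just those columns. This is only a sketch under assumptions: the `columns` dict, its sample values, and the helper name `expand` are hypothetical and not part of the original code.

```python
from itertools import product
from string import Formatter

# Hypothetical sample data, keyed by the placeholder names used in the patterns.
columns = {
    'column1': ['1', '2'],
    'column2': ['3', '4'],
    'column3': ['5', '6'],
    'column4': ['7', '8'],
}
patterns = [
    '{column1}{column2}',
    '{column1}@{column3}',
    '{column2}#{column4}',
    '{column3}!@#',
]


def expand(pattern, columns):
    """Yield the pattern filled with every combination of the columns it names."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for a trailing piece of literal text.
    names = []
    for _, field_name, _, _ in Formatter().parse(pattern):
        if field_name is not None and field_name not in names:
            names.append(field_name)
    # Only the columns referenced by this pattern take part in the product.
    for combo in product(*(columns[name] for name in names)):
        yield pattern.format(**dict(zip(names, combo)))


result = [item for pattern in patterns for item in expand(pattern, columns)]
print(result)
```

With this layout, adding a pattern or a column means editing only `patterns` or `columns`. If the column values can repeat, or two patterns can yield the same string, duplicates are still possible; `list(dict.fromkeys(result))` removes them while keeping the order.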
