1

I'm trying to generate all multiple cartesian products of a set S with itself of a fixed total target string length n, with the set containing strings of different lengths.

Eg. S = ['1', '22', '333'] with target length n=4 should output

['1111', '1122', '1221', '1333', '2211', '2222', '3331']

Also the algorithm needs to be somewhat efficient, because my final application will be of much bigger scale.

Currently my only solution is generating S^n, cutting the results to length n and removing duplicates, which will contain my wanted results, as well as many additional entries with the last string truncated. This algorithm doesn't scale up well, and gets unusable quickly.

I'm currently working in python, but I'm thankful for any help in pseudo code you can offer!

  • What is the cartesian product of one set? what is the other set? – ymmx Jun 03 '22 at 13:01
  • I mean the cartesian product of the set with itself, multiple times. So any element of the set, followed by any element of the set (which can also be the same one multiple times) – graustufenwinfried Jun 03 '22 at 13:05

1 Answers1

1

I suggest to use itertools to perform the cartesian product

import itertools

S = ['1', '22', '333']
R=[]
for element in itertools.product(S,S):
    print(element)
    concat = element[0]+element[1]
    if len(concat) == 4:
        R.append(concat)
print(R)

But I'm not sure the output is what you wanted because the '1111' is an expected output but it is not one of the cartesian product output

If is it possible to have 4 '1', you should probably perform the Cartesian production with 4 time the same set with an empty string in it. and remove duplicate

import itertools

import itertools

S = ['','1', '22', '333']
R=[]
for element in itertools.product(S,S,S,S):
    concat = ''.join(element)
    print(element,concat)
    if len(concat) == 4:
        R.append(concat)
print(R)
R2 = list(set(R)) #remove duplicate
print(R2)
ymmx
  • 4,769
  • 5
  • 32
  • 64
  • Thanks for your answer, Ive actually been using itertools.product, but the results do indeed differ from the wanted output. In this case the problem is that I dont know the number of iterations needed, as it does vary due to the different lengths of strings. So in some cases I'd need 4 products ('1111'), in others 2, as realized in your example – graustufenwinfried Jun 03 '22 at 13:14
  • The revised code has the wanted functionality, thanks! Kudos for the idea with the empty string. I will try if it's working efficiently enough on my project. Also, as a side note, you can use itertools.product(S, repeat=n) for larger quantities. Thanks for your help! – graustufenwinfried Jun 03 '22 at 13:20