
I am processing the Cartesian product of a list of entities.

For example, given a_list = ['a', 'b']

The expected output is:

"a";"a"&"a";"b"&"b";"a"&"b";"b"

The two entities in a pair are separated by a semicolon ";" and the pairs are separated by "&".

I used the following nested for loop to produce the output.

entity_set = ['a','b']

domain_text = ''
count = 0
for entity1 in entity_set:
    for entity2 in entity_set:
        count += 1
        domain_text += '"' + entity1 + '"' + ';' + '"' + entity2 + '"'
        if count < (len(entity_set)*len(entity_set)):
            domain_text += '&'
print(domain_text)

However, the process gets too slow as the size of a_list increases to thousands of entities.

Is there a more elegant solution that I could use instead?

Anish
  • You've basically reimplemented `itertools.product(entity_set, repeat=2)`. You'll get slight performance gains by using f-strings but it's still going to be O(n**2). – wim Apr 15 '18 at 22:17
  • This is very close to https://stackoverflow.com/questions/533905/get-the-cartesian-product-of-a-series-of-lists. Use itertools.product to generate the pairs, then add a step that joins each pair with its separators. Should this be flagged as a duplicate? – Pavel Savine Apr 15 '18 at 22:17

1 Answer

Sure. itertools.product() can do the product for you, and then a string join operation can paste all the pieces together efficiently (which is more likely than not the real source of the sloth: incrementally building the result string one little piece at a time).

from itertools import product
entity_set = ['a', 'b']
result = "&".join('"%s";"%s"' % pair
                  for pair in product(entity_set, repeat=2))

Then result prints as

"a";"a"&"a";"b"&"b";"a"&"b";"b"
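
If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: the name format_pairs and its pair_sep / item_sep parameters are made up for illustration, and it assumes the entities are plain strings that contain no double quotes. It quotes each entity once up front rather than once per pair.

```python
from itertools import product


def format_pairs(entities, pair_sep="&", item_sep=";"):
    """Join every ordered pair from entities x entities into one string.

    Hypothetical helper for illustration; assumes the entities are
    strings that do not themselves contain double quotes.
    """
    # Quote each entity once, instead of twice for every pair.
    quoted = ['"%s"' % entity for entity in entities]
    # str.join builds the final string in one pass over the pairs.
    return pair_sep.join(
        left + item_sep + right
        for left, right in product(quoted, repeat=2)
    )


print(format_pairs(['a', 'b']))
# "a";"a"&"a";"b"&"b";"a"&"b";"b"
```

Note that the result still contains n**2 pairs, so for thousands of entities the string itself runs to millions of pairs no matter how it is assembled.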
Tim Peters