
I have a class whose __iter__() method returns an itertools.product() over dynamically generated data. This data comes from a Cartesian product of arbitrarily nested dicts, and now I need to flatten the result, but in a streaming fashion, aggregating the intermediate iterators.

I'm trying either of these:

  1. Modify __iter__() to handle the internal tuples:

     class Explosion:
         ...
         def __iter__(self):
             return product(*self.fragments)
    
  2. Encapsulate it in another object to handle the conversion, but this is less desirable:

     class CleanOutput:
         def __init__(self, it):
             self.it = it
    
         def next(self):
             for x in self.it:
                 yield ?
    
     class Explosion:
         ...
         def __iter__(self):
             return CleanOutput(product(*self.fragments))
    

Well, the algorithm does work, but the problem is the nesting left in the final output, for example:

(11, ..., (10.7, 104.75, ('N', True, False, 'B2B'), 99.01, ...), 1, 'SP', 7).

Look at all the nesting! How can I remove it in real time, while it is being generated? I'm looking for a way to retrieve:

(11, ..., 10.7, 104.75, 'N', True, False, 'B2B', 99.01, ..., 1, 'SP', 7).

What is the best and fastest way to do it? Thank you!

EDIT

Actually, what I'd really like is a list comprehension, a generator expression, or even another generator, because I need to include it in a callable that intercepts the output of itertools.product() itself. I don't simply need a way to clean these tuples, so this isn't a duplicate.

rsalmei
  • Can you show the code you used to get the first result? – Mazdak Apr 24 '16 at 20:18
  • Sorry @Kasramvd, I can't. But it isn't that complex: it is a simple _flatten dict_ algorithm with a few twists. When it detects a sequence, it takes one element and inspects it; if the elements are primitives, they are stored to be used in itertools.product. A nested dict is harder, as you have to create another explosion object, like the root one, for each of the entries. Then you combine these products (the nested tuples) by concatenating the lists, and apply the outer product to the result. – rsalmei Apr 24 '16 at 20:50
  • To be honest, I don't understand what's stopping you from using the highest voted answer from the linked page. `product` returns an iterator, which you may send to the function and return the resulting generator like `return flatten(product(...))` – vaultah Apr 24 '16 at 21:14
  • Thanks @vaultah, I understand, but I did try that function. Besides using recursion, which I do not want here, it yields ints and other primitives, not tuples like the original structure. I need to hook into and process that iterator, because I'd like the client to receive the same tuples, which represent the exploded data, not a stream of their contents. Anyway, I've used that one and it doesn't work: all I got were dozens of separate ints or strings, without the rows. I can't rely on the client creating tuples or lists again. – rsalmei Apr 24 '16 at 21:33
  • Yes, maybe I can't, but I wanted to try. I've already abused recursion in the explosion engine, so I'd like to avoid it now, as I'm worried about performance and spikes in memory usage. Thank you @Natecat. – rsalmei Apr 24 '16 at 21:49
  • Please @vaultah, check my last answer. See how it was much more complex than initially anticipated... – rsalmei Apr 26 '16 at 05:46

2 Answers

def gen(data):
    for item in data:
        if isinstance(item, tuple):
            for nested in gen(item):
                yield nested
        else:
            yield item

Untested, but should work.
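
A minimal sketch of how this generator could be applied one combination at a time, so that each row from itertools.product() stays a tuple and only the inner tuples are removed. The name flat_rows and the sample input are assumptions made for illustration; they are not part of the original code.

```python
from itertools import product


def gen(data):
    # Recursively yield the leaves of a (possibly nested) tuple.
    for item in data:
        if isinstance(item, tuple):
            for nested in gen(item):
                yield nested
        else:
            yield item


def flat_rows(rows):
    # Hypothetical helper: flatten each row separately, so the
    # boundaries of the combinations are preserved.
    return (tuple(gen(row)) for row in rows)


rows = list(flat_rows(product([1, 2], [3, (4, (5, 6))])))
print(rows)
```

Because flat_rows returns a generator expression, nothing is computed until the caller iterates, and only one row is held in memory at a time.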

Amit Gold
  • Nope, but thank you. Remember that itertools.product() returns tuples as the different combinations, and I DO WANT these combinations, only without the inner tuples. With this, all I got were dozens of separate ints or strings, without the boundaries of the combinations. – rsalmei Apr 24 '16 at 21:42
  • @rsalmei So you are saying you don't want a generator? – Natecat Apr 24 '16 at 21:43
  • No, I do want a generator! That's what I've asked for: list comprehension or a generator expression... I only need to remove the inner tuples, not the combinations. – rsalmei Apr 24 '16 at 21:53
  • @rsalmei What? There's no product in there... It's pretty much recursion. – Amit Gold Apr 25 '16 at 07:27

That wasn't easy. Recursion has to be used, but it can be kept separate from the main __iter__ method. This is what I ended up doing, now with a recursive generator, _merge, called by another generator, _flatten:

from itertools import product


class Explosion:
    # ...

    def __iter__(self):
        def _flatten(container):
            def _merge(t):
                # Recursively yield the leaves of a nested tuple.
                for te in t:
                    if isinstance(te, tuple):
                        for ite in _merge(te):
                            yield ite
                    else:
                        yield te

            # Rebuild one flat tuple per combination.
            for t in container:
                yield tuple(_merge(t))

        return _flatten(product(*self.fragments))

Here is an example of what the _flatten() function produces (with the helper defined at module level for the demonstration):

>>> list(itertools.product([1,2],[3,(4,(5,6))]))
[(1, 3), (1, (4, (5, 6))), (2, 3), (2, (4, (5, 6)))]
>>> list(_flatten(itertools.product([1,2],[3,(4,(5,6))])))
[(1, 3), (1, 4, 5, 6), (2, 3), (2, 4, 5, 6)]
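
Since the question asks to avoid recursion, here is a minimal sketch of the same per-row flattening done with an explicit stack of iterators instead of a recursive generator. The names flatten_row and flatten_product are hypothetical, and this is only one possible approach, not the code used in the answer above.

```python
from itertools import product


def flatten_row(row):
    # Hypothetical helper: flatten one nested tuple without recursion.
    stack = [iter(row)]  # iterators still to be consumed, innermost last
    out = []
    while stack:
        for item in stack[-1]:
            if isinstance(item, tuple):
                # Descend into the nested tuple; the parent iterator
                # stays on the stack and is resumed afterwards.
                stack.append(iter(item))
                break
            out.append(item)
        else:
            # The innermost iterator is exhausted.
            stack.pop()
    return tuple(out)


def flatten_product(*fragments):
    # Hypothetical helper: lazily yield one flat tuple per combination.
    return (flatten_row(row) for row in product(*fragments))


result = list(flatten_product([1, 2], [3, (4, (5, 6))]))
print(result)
```

The stack depth here grows with the nesting depth of a single row, not with the number of rows, and each row is still produced lazily.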
rsalmei
  • If you want, you could also nest the `_flatten` helper method *inside* your `__iter__` method in the case you don't need it anywhere else. – Byte Commander Apr 25 '16 at 05:53
  • @ByteCommander, I've tried to make the `merge` helper method a second generator, inside the _flatten generator, but it doesn't work, the external tuples (the ones I want to maintain) came out empty. Do you know why? I'd rather not use a temp sequence to move tons of data, but... – rsalmei Apr 26 '16 at 04:52
  • I can't comment on that without seeing your code. Ask a new question and give me a link to it in a comment here. – Byte Commander Apr 26 '16 at 05:31
  • I did it!!! The problem was a recursive generator, that needs an enclosing for loop! See my updated answer @ByteCommander... – rsalmei Apr 26 '16 at 05:41