-2

I have looked around in other posts and haven't been able to come to a solution for this. I have a function that needs to be recursively called using an itertools expression in order to return a tuple that has unique elements with it's order preserved.

for instance:

def function(list):
    return list and (list[0],) + function(some_itertools_expression)

given example: function([1, 7, 7, 9, 0, 1]) should return (1, 7, 9, 0)

I've tried using:

return list and (list[0],) + function(tuple([itertools.groupby(list)][:len(list)]))

but I end up running into RecursionError: maximum recursion depth exceeded. How can I solve this without getting the max recursion depth error?

Mechanic Pig
  • 6,756
  • 3
  • 10
  • 31
D-Money
  • 137
  • 1
  • 12
  • I don't understand why recursion is supposed to be useful to solve the problem. Just create the result directly from the `itertools.groupby` results, by *iterating over* them. – Karl Knechtel Oct 05 '22 at 02:18
  • @KarlKnechtel groupby is not even particularly useful here, because you still have to maintain a "seen" collection (the data is not ordered) – wim Oct 05 '22 at 02:22

2 Answers2

1

You can do this fairly easily, without needing recursion, by making a tuple via dictionary keys. The dict must have unique keys, and will preserve the order of the original input sequence.

>>> data = [1, 7, 7, 9, 0, 1]
>>> (*{}.fromkeys(data),)
(1, 7, 9, 0)
wim
  • 338,267
  • 99
  • 616
  • 750
  • Thanks for your input, that is understandable but the requirements are that it is recursive and makes use of only the itertools module – D-Money Oct 05 '22 at 01:45
  • Why is that a requirement? This is really not a good use-case for recursion. – wim Oct 05 '22 at 01:46
  • the goal is just to practice with both of these concepts. I already understand how this could easily be solved by other means. – D-Money Oct 05 '22 at 01:49
  • `dict.fromkeys` is a classmethod. It is not necessary to create an empty dictionary here. – Mechanic Pig Oct 05 '22 at 01:57
  • 1
    @MechanicPig `{}.fromkeys(...)` is actually faster than `dict.fromkeys(...)` because it doesn't need to check local/enclosing/global/built-in scopes for the "dict" name. And it's faster to type too. – wim Oct 05 '22 at 02:06
  • 1
    @wim OK, I have no objection to these, but I am afraid that this will mislead OP, because it is very similar to the use of `''.join(...)`, and it is easy to mistake empty dictionary for special meanings here. – Mechanic Pig Oct 05 '22 at 15:28
  • @MechanicPig I see no problem using classmethod from an instance. That's intentional for classmethods, if it was to be prohibited then it would be a staticmethod instead. There is no special meaning in `''.join(...)` either, it could be `str.join('', ....)` just the same – wim Oct 05 '22 at 17:17
  • @wim Please correct me if I'm wrong: In `dict.fromkeys`, Python doesn't look at local or enclosing namespace. For instance if it was a local variable, then it would already marked as local variable in the "creation" phase of a function(LOAD_FAST). So I think only global and built-in namespace is checked. For the next part, `fromkeys` is a key in class's namespace, so just one stop. But for instances it has to check both instance's namespace and class's namespace although I guess it's the behavior of the user defined classes not built-in. – S.B Oct 10 '22 at 20:37
  • 1
    @S.B Hmm, whether it uses LOAD_FAST or LOAD_DEREF (closure) or LOAD_GLOBAL depends on the scope. This part just concerns the lookup of the `dict` name, and for the dict literal "instance" `{}` there is no name lookup at all. It does appear as though the subsequent lookup of `fromkeys` method is faster on `dict` than on `{}`, but timing of the entire thing reliably shows that using `{}.fromkeys(...)` is the winner in total. Of course this is absolutely a micro-optimization that nobody should care about except for curiosity's sake :) – wim Oct 10 '22 at 20:55
  • @wim totally agree, just asked out of curiosity. – S.B Oct 10 '22 at 20:58
1

If you must use a function from itertools in a recursive call, I would grab the first item of the sequence in each recursion and use itertools.filterfalse to filter items equal to the first from the sequence returned by a recursive call with the rest of the items:

from itertools import filterfalse

def unique(lst):
    if not lst:
        return ()
    first, *rest = lst
    return first, *filterfalse(lambda i: i == first, unique(rest))

print(unique([1, 7, 7, 9, 0, 1]))

This outputs:

(1, 7, 9, 0)

Demo: https://replit.com/@blhsing/WelloffPlainAutomaticparallelization

blhsing
  • 91,368
  • 6
  • 71
  • 106
  • 2
    It is not correct to use `first.__eq__` directly like that. For example `unique([0, "1", 2])` will return the incorrect result of `[0]`. – wim Oct 05 '22 at 02:19