1

I am creating a class that inherits from collections.UserList that has some functionality very similar to NumPy's ndarray (just for exercise purposes). I've run into a bit of a roadblock regarding recursive functions involving the modification of class attributes:

Let's take the flatten method, for example:

class Array(UserList):
    def __init__(self, initlist):
        self.data = initlist

    def flatten(self):
        # recursive function
        ...

Above, you can see that there is a singular parameter in the flatten method, being the required self parameter. Ideally, a recursive function should take a parameter which is passed recursively through the function. So, for example, it might take a lst parameter, making the signature:

Array.flatten(self, lst)

This solves the problem of having to set lst to self.data, which consequently will not work recursively, because self.data won't be changed. However, having that parameter in the function is going to be ugly in use and hinder the user experience of an end user who may be using the function.

So, this is the solution I've come up with:

def flatten(self):
    self.data = self.__flatten(self.data)

def __flatten(self, lst):
    ...
    return result

Another solution could be to nest __flatten in flatten, like so:

def flatten(self):
    def __flatten(lst):
        ...
        return result
    self.data = __flatten(self.data)

However, I'm not sure if nesting would be the most readable as flatten is not the only recursive function in my class, so it could get messy pretty quickly.

Does anyone have any other suggestions? I'd love to know your thoughts, thank you!

gmdev
  • 2,725
  • 2
  • 13
  • 28
  • If `flatten` calls `x.flatten()` for some `x` of the same class, then it is recursive. There is no need to add an extra parameter if the method doesn't logically have one from the caller's perspective. – kaya3 Feb 24 '21 at 13:28
  • @kaya3 yes, but if there isn't a parameter to begin with, I'd have to use `self.data` within that recursive function, which wouldn't work because the function would not have a new list to work with. Am I correct or just confused? Lol. I'm open to any suggestions :) – gmdev Feb 24 '21 at 13:36
  • @gmdev What type of data is `self.data`? Why don't you just flatten that data? Can you show an example? – a_guest Feb 24 '21 at 13:44
  • @a_guest It is just an n-dimensional uniform list (not ragged). So, something like `[[1,2,3],[4,5,6]]`. – gmdev Feb 24 '21 at 13:47
  • @gmdev So what exactly is recursive about that? It seems you just need a helper function to flatten a nested list. You can add this as a `staticmethod` for example (it seems like that's what your `__flatten` function is doing already). – a_guest Feb 24 '21 at 13:49
  • @a_guest So I'd need to recurse through it because if the shape is something like `(3, 4, 7, 2, 8)`, I can't do that with a simple list comprehension. Furthermore, the `flatten` method was just an example, I have other methods that require recursion as well. Does that make sense? – gmdev Feb 24 '21 at 13:52
  • @gmdev If you recursively call `x.flatten()` on a different object, then `self.data` refers to a *different* `self` within the recursive call; it will refer to the same object as `x` does. You should think of `x.flatten()` as being just like `Array.flatten(x)`, and those are actually equivalent. – kaya3 Feb 24 '21 at 13:55

4 Answers4

1

A recursive method need not take any extra parameters that are logically unnecessary for the method to work from the caller's perspective; the self parameter is enough for recursion on a "child" element to work, because when you call the method on the child, the child is bound to self in the recursive call. Here is an example:

from itertools import chain

class MyArray:
    def __init__(self, data):
        self.data = [
            MyArray(x) if isinstance(x, list) else x
            for x in data]
    def flatten(self):
        return chain.from_iterable(
            x.flatten() if isinstance(x, MyArray) else (x,)
            for x in self.data)

Usage:

>>> a = MyArray([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
>>> list(a.flatten())
[1, 2, 3, 4, 5, 6, 7, 8]
kaya3
  • 47,440
  • 4
  • 68
  • 97
  • Are there any inherent advantages in using this over a simple helper method? – gmdev Feb 24 '21 at 16:25
  • @gmdev What is the advantage of using a helper method? You could ask why any method doesn't call an extra helper method to do its actual work. The point is that the assumption in the question (that the method needs an extra parameter in order to be recursive) is a misconception. – kaya3 Feb 24 '21 at 18:41
1

Since UserList is an iterable, you can use a helper function to flatten nested iterables, which can deal likewise with lists and Array objects:

from collections import UserList
from collections.abc import Iterable


def flatten_iterable(iterable):
    for item in iterable:
        if isinstance(item, Iterable):
            yield from flatten_iterable(item)
        else:
            yield item


class Array(UserList):
    def __init__(self, initlist):
        self.data = initlist

    def flatten(self):
        self.data = list(flatten_iterable(self.data))


a = Array([[1, 2], [3, 4]])
a.flatten(); print(a)  # prints [1, 2, 3, 4]

b = Array([Array([1, 2]), Array([3, 4])])
b.flatten(); print(b)  # prints [1, 2, 3, 4]
a_guest
  • 34,165
  • 12
  • 64
  • 118
  • This is exactly the answer I came here to write. Do as a_guest do. Don't tangle generic functions up in your class methods. As demonstrated here, write a separate plain function and use a simple wrapper in your object interface. This gives the understated benefit of allowing the function to be used in a functional _and_ an object-oriented way. To see this technique in other contexts, please see [this recent Q&A](https://stackoverflow.com/a/66340109/633183). – Mulan Feb 24 '21 at 15:51
  • @Thankyou but these functions won't be contained within the class, I'll have to define them outside of the class? – gmdev Feb 24 '21 at 16:41
  • 1
    @gmdev exactly. The class is only a simple wrapper around plain functions. With this one practice, you are 1) unraveling complexity, 2) making it easier to write and debug your program, 3) promoting code reuse through flexible and generic functions. You can and should define the functions and the class wrapper in its own module (file). This abstraction barrier gives you the added benefits of expanding and modifying the module's capabilities without risking breaking other parts of your program. This is explained in detail in the link provided. – Mulan Feb 24 '21 at 17:14
  • @Thankyou Gotcha, thank you for the explanation. While I do understand the reasoning for something like this, could you elaborate *why* this might be a better approach than the other answer? I'm not challenging you, just wanting to fully understand the use of one over the other. I really appreciate your time and effort – gmdev Feb 24 '21 at 18:25
  • 1
    @gmdev this is an open forum, we should feel free to discuss and challenge one another :D the points I gave above are the reasons why I think the approach is better. The answer from Kaya doesn't make sense, imo. First, it is reshaping the data upon construction which is not necessary; your custom type can store internal state in any shape you want and the choice is unexplained and unclear to me. Second, the `flatten` method *could* be generic but is instead tangled up with this specific class, so any time we need similar functionality in another part of the program, it will be duplicated. – Mulan Feb 24 '21 at 21:49
  • 1
    (continued) Generic functions that can be reused in many ways are always better than overly specialized ones that offer fewer usage scenarios. Take python's [built-in functions](https://docs.python.org/3/library/functions.html) as a great example of this. Finally, the use of `chain.from_iterable` is unnecessary and I don't know why it was chosen other than it is _a_ way which works. I'll post an answer with details after dinner. – Mulan Feb 24 '21 at 21:53
0

barrier of abstraction

Write array as a separate module. flatten can be generic like the example implementation here. This differs from a_guest's answer in that only lists are flattened, not all iterables. This is a choice you get to make as the module author -

# array.py

from collections import UserList

def flatten(t):                     # generic function
  if isinstance(t, list):
    for v in t:
      yield from flatten(v)
  else:
    yield t

class array(UserList):
  def flatten(self):
    return list(flatten(self.data)) # specialization of generic function

why modules are important

Don't forget you are the module user too! You get to reap the benefits from both sides of the abstraction barrier created by the module -

  1. As the author, you can easily expand, modify, and test your module without worrying about breaking other parts of your program
  2. As the user, you can rely on the module's features without having to think about how the module is written or what the underlying data structures might be
# main.py

from array import array

t = array([1,[2,3],4,[5,[6,[7]]]])  # <- what is "array"?

print(t.flatten())
[1, 2, 3, 4, 5, 6, 7]

As the user, we don't have to answer "what is array?" anymore than you have to answer "what is dict?" or "what is iter?" We use these features without having to understand their implementation details. Their internals may change over time, but if the interface stays the same, our programs will continue to work without requiring change.

reusability

Good programs are reusable in many ways. See python's built-in functions for proof of this, or see the the guiding principles of the Unix philosophy -

  1. Write programs that do one thing and do it well.
  2. Write programs to work together.

If you wanted to use flatten in other areas of our program, we can reuse it easily -

# otherscript.py

from array import flatten

result = flatten(something)
Mulan
  • 129,518
  • 31
  • 228
  • 259
  • Awesome, this is great. So maybe I could have a separate submodule like `util` or something with these functions in it. And then in `array.py` I could just `from modulename.util import function`. Thanks! – gmdev Feb 24 '21 at 22:24
  • Now you are thinking in modules :D – Mulan Feb 24 '21 at 22:35
0

Typically, all methods of a class have at least one argument which is called self in order to be able to reference the actual object this method is called on.

If you don't need self in your function, but you still want to include it in a class, you can use @staticmethod and just include a normal function like this:

class Array(UserList):
    def __init__(self, initlist):
        self.data = initlist

    @staticmethod
    def flatten():
        # recursive function
        ...

Basically, @staticmethod allows you to make any function a method that can be called on a class or an instance of a class (object). So you can do this:

arr = Array()
arr.flatten()

as well as this:

Array.flatten()

Here is some further reference from Pyhon docs: https://docs.python.org/3/library/functions.html#staticmethod

rhymefororange
  • 341
  • 1
  • 7