
I am a research scientist writing a custom class I'm calling MyList() in Python 3.7, intended to add some additional methods for use on list type objects. I want these methods to be able to modify the list object in place, without having to redefine it or assign it to a new object name. For example, let's say I have already declared foo = MyList() and populated it with some arbitrary data. My goal is to be able to do something like this:

>>> foo
[1, 1, '', 3, [2, 4, '', 4, 5], [], [6] ]

>>> type(foo)
__main__.MyList

>>> foo.flatten_the_list()

>>> foo.remove_empty_items_from_list()

>>> foo.remove_duplicate_items_from_list()

>>> foo.convert_list_items_to_strings()

>>> foo
['1', '2', '3', '4', '5', '6']

I have posted my code below, and so far some of the methods are working correctly. I've lumped in removing empty items from the list object, removing whitespace from items in the list object, and removing duplicate items from the list object into one method called MyList.cleanup() that works well.

However, the same list comprehension approach that works for MyList.cleanup() doesn't work for flattening the list, nor for converting each item in the list from an integer to a string. My code so far:

class MyList(list):

    # No def __init__() statement needed - inherits from list object

    # MyList.convert_v2() WORKS without list comprehension
    def convert_v2(self):
        for i in range(len(self)):
            self[i] = str(self[i])
        return self

    # MyList.convert_v1() DOESN'T WORK with list comprehension
    def convert_v1(self):
        self = [str(item) for item in self]
        return self

    # MyList.cleanup() WORKS with list comprehension
    def cleanup(self):
        # Remove empty items
        self = [item for item in self if item != ""]
        # Remove duplicate items
        self = list(dict.fromkeys(self))
        # Remove whitespace (including \t and \n)
        self = ["".join(str(item).split()) for item in self]
        return self

    # MyList.flatten() DOESN'T WORK with list comprehension
    def flatten(self):
        self = [item for sublist in self for item in sublist]
        return self

This is what I get when I use the MyList.convert_v1() method (my first attempt at a method that converts the contents of a list to strings):

>>> bar
[1, 2, 3, 4, 5, 6]

>>> type(bar[0])
int

>>> bar.convert_v1()
['1', '2', '3', '4', '5', '6']

>>> bar
[1, 2, 3, 4, 5, 6]

>>> type(bar[0])
int

However, I had to stop using list comprehension to get the desired effect with MyList.convert_v2():

>>> bar
[1, 2, 3, 4, 5, 6]

>>> type(bar[0])
int

>>> bar.convert_v2()
['1', '2', '3', '4', '5', '6']

>>> bar
['1', '2', '3', '4', '5', '6']

>>> type(bar[0])
str

Why does MyList.convert_v2() work as expected, when MyList.convert_v1() does not? Outside of a class, I wouldn't expect either function to behave differently, but inside the class they do behave differently.

On a similar note, this is what I get for the MyList.flatten() method:

>>> baz
[[1, 2, 3, 4], [5, 6, 7, 8]]

>>> baz.flatten()
[1, 2, 3, 4, 5, 6, 7, 8]

>>> baz
[[1, 2, 3, 4], [5, 6, 7, 8]]

While the desired outcome is printed to the output as shown, the list object baz isn't actually flattened. The list remains unchanged after the method is called. I need it to do this instead:

>>> baz
[[1, 2, 3, 4], [5, 6, 7, 8]]

>>> baz.flatten()
[1, 2, 3, 4, 5, 6, 7, 8]

>>> baz
[1, 2, 3, 4, 5, 6, 7, 8]

Why does list comprehension work just fine in the MyList.cleanup() method, but not in the MyList.convert_v1() or MyList.flatten() methods? I recognize that I am new to OOP and to writing classes in general, so if I'm completely off base here, I look forward to learning what I could be doing differently.

Evan
  • 1
    Do you want to modify the existing list or do you want to make a new list? Because your functions do both of these things at different times. And if you want to make a new list, do you want to ensure it's another `MyList`, or is a regular `list` fine? – Silvio Mayolo Aug 06 '21 at 23:41
  • @SilvioMayolo I want to modify the existing list, and I want to ensure that it remains MyList, so that I can use these and any additional methods on it moving forward. – Evan Aug 06 '21 at 23:44
  • 4
    `self` is just a local variable within your methods. Assigning a new value to it is *utterly pointless*, this has no effect outside of the method. To change the underlying list value of your instances, you have to assign to an element or slice of `self`; for example, `self[:] = [...whatever...]` would completely replace the existing contents. – jasonharper Aug 06 '21 at 23:55
  • 2
    Your flatten method needs to be recursive if you want it to handle arbitrary nesting depths. As it is now, it will only handle lists whose members are *all* flat sublists. It will fail on `[1, 2]`, or `[1, [2, 3]]`, or `[[[1, 2], [3, 4]], [4, 5]]`. – Tom Karzes Aug 07 '21 at 00:18
  • 3
    I suggest writing your class as a wrapper around a list as opposed to inheriting from one. If you want it to have most of the same features, you can delegate to `__getattr__`. As jasonharper said, when you use `self =` it's nothing but a very confusing variable name. – pcauthorn Aug 07 '21 at 00:30

1 Answer


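As per @jasonharper's comment, assigning to `self` only rebinds a local name inside the method, so it has no effect on the instance itself. To replace the whole list in place, assign to the full slice instead: `self[:] = ...`.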
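To make the difference concrete, here is a minimal sketch using a hypothetical `Demo` subclass (the class and method names are illustrative and not taken from the question). It assumes nothing beyond plain `list` behaviour: one method rebinds `self`, the other assigns to the full slice.

```python
class Demo(list):
    def rebind(self):
        # Rebinds the local name only; the caller's object is untouched.
        self = [str(item) for item in self]
        return self  # a new, plain list

    def in_place(self):
        # Slice assignment replaces the contents of the existing object.
        self[:] = [str(item) for item in self]
        return self  # the same Demo instance


a = Demo([1, 2, 3])
result_a = a.rebind()
print(a, type(result_a).__name__)   # [1, 2, 3] list

b = Demo([1, 2, 3])
result_b = b.in_place()
print(b, result_b is b)             # ['1', '2', '3'] True
```

This is also why `cleanup()` in the question only appears to work: the interpreter echoes the new list that the method returns, but the original object is left unchanged, just as with `convert_v1()` and `flatten()`.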

For flattening a list, this answer uses a generator with recursion so that a list of unknown depth can still be flattened.


Combining both of the above to accomplish what you wanted in your first example could look something like this:

from collections.abc import Iterable
from re import sub


class MyList(list):
    def _flattener(self, _list):
        """Generator used to flatten lists of undefined depth."""
        for el in _list:
            if isinstance(el, Iterable) and not isinstance(el, (str, bytes)):
                yield from self._flattener(el)
            else:
                yield el

    def flatten(self):
        self[:] = self._flattener(self)
        return self

    def remove_empty(self):
        self[:] = [item for item in self if item or item == 0]
        return self

    def convert_to_strings(self):
        self[:] = list(map(str, self))
        return self

    def remove_duplicates(self):
        self[:] = list(dict.fromkeys(self))
        return self

    def remove_whitespace(self):
        self.convert_to_strings()
        self[:] = [sub(r"\s+", "", item) for item in self]
        return self

Output

>>> bar = MyList()
>>> bar.extend([0, 1, 1, "", 3, [2, 4, "", 4, 5], [], None, {}, [6], "Hello\n World"])

>>> bar.flatten()
>>> bar.remove_empty()
>>> bar.convert_to_strings()
>>> bar.remove_duplicates()
>>> bar.remove_whitespace()
>>> bar
['0', '1', '3', '2', '4', '5', '6', 'HelloWorld']
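
Since every method in the class above returns `self`, the calls can also be chained. The sketch below uses a trimmed-down, hypothetical `ChainList` with only two of the methods so that it runs on its own; it is an illustration of the pattern, not a replacement for the full class.

```python
class ChainList(list):
    def remove_duplicates(self):
        # dict.fromkeys keeps the first occurrence of each item, in order.
        self[:] = list(dict.fromkeys(self))
        return self

    def convert_to_strings(self):
        self[:] = list(map(str, self))
        return self


nums = ChainList([1, 1, 2, 3, 3])
out = nums.remove_duplicates().convert_to_strings()
print(nums)         # ['1', '2', '3']
print(out is nums)  # True
```

Returning `self` from a mutating method is a design choice: it enables chaining, but it also means that in an interactive session each call echoes the list unless the result is assigned or discarded.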
Rolv Apneseth