Note: there is a similar question here, but I don't believe it's an exact duplicate given the specifics.

Below, I have two classes, one inheriting from the other. Please note these are just illustrative in nature.

In _Pandas.array(), I want to simply wrap a pandas DataFrame around the NumPy array returned from _Numpy.array(). I'm aware of what is wrong with my current code (_Pandas.array() gets redefined, attempts to call itself, and undergoes infinite recursion), but not how to fix it without name mangling or quasi-private methods on the parent class.

import numpy as np
import pandas as pd

class _Numpy(object):
    def __init__(self, x):
        self.x = x
    def array(self):
        return np.array(self.x)

class _Pandas(_Numpy):
    def __init__(self, x):
        super(_Pandas, self).__init__(x)
    def array(self):
        return pd.DataFrame(self.array())

a = [[1, 2], [3, 4]]
_Pandas(a).array()    # Intended result - pd.DataFrame(np.array(a))
                      # Infinite recursion as method shuffles back & forth
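
To isolate the failure, here is a minimal sketch that uses hypothetical stand-in classes (`Base` and `Child` are illustrative names, not part of my real code) and needs neither NumPy nor pandas. Inside the child, `self.array()` is looked up on the instance's own class first, so it resolves to the child's method again:

```python
class Base:
    def array(self):
        return [1, 2]


class Child(Base):
    def array(self):
        # self.array is resolved through type(self).__mro__, which lists
        # Child before Base, so this calls Child.array again, not Base.array.
        return tuple(self.array())


try:
    Child().array()
    outcome = "returned"
except RecursionError:
    outcome = "RecursionError"

print(outcome)                                   # RecursionError
print([cls.__name__ for cls in Child.__mro__])   # ['Child', 'Base', 'object']
```

The lookup order printed on the last line is what any fix has to work around: the parent's method is still there, it is just shadowed by the child's.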

I'm aware that I could do something like this:

class _Numpy(object):
    def __init__(self, x):
        self.x = x
    def _array(self):            # Changed to leading underscore
        return np.array(self.x)

class _Pandas(_Numpy):
    def __init__(self, x):
        super().__init__(x)
    def array(self):
        return pd.DataFrame(self._array())

But this seems suboptimal. In reality, I use _Numpy frequently on its own (it's not just a generic parent class), and I'd prefer not to prefix all of its methods with a single underscore. How else can I go about this?

Brad Solomon
  • So, your `.array` method returns either an `np.ndarray` or a `pd.DataFrame`? This would break the Liskov substitution principle, no? – juanpa.arrivillaga Oct 31 '17 at 19:04
  • @juanpa.arrivillaga `_Numpy.array()` returns `np.ndarray`, `_Pandas.array()` returns `pd.DataFrame` (or at least I'd like it to) – Brad Solomon Oct 31 '17 at 19:50
  • Right. Essentially, that means any property that is true of a given type should also be true of its subtypes. In this case, the property that `array` returns an `ndarray` is being violated. Generally, you want methods to return the same types, or at least the return types should be covariant. – juanpa.arrivillaga Oct 31 '17 at 19:54
  • @juanpa.arrivillaga See my answer below, mainly what I'm interested in is a way to go about this without needing a private and public version of every method. But that actually seems to be a commonly used route. – Brad Solomon Oct 31 '17 at 20:03

1 Answer

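Is there a reason you don't call super() directly in the _Pandas class? super(_Pandas, self).array() starts the method lookup after _Pandas, so it reaches _Numpy.array() instead of calling itself: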

class _Pandas(_Numpy):
    def __init__(self, x):
        super(_Pandas,self).__init__(x)
    def array(self):
        return pd.DataFrame(super(_Pandas,self).array())

I tried that and got the result below. I'm not sure whether it's what you wanted or whether I've missed something:

a = [[1, 2], [3, 4]]
_Pandas(a).array()
   0  1
0  1  2
1  3  4
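
For completeness, here is a self-contained sketch of the same idea, assuming Python 3, where the zero-argument form `super()` is available. Dropping the redundant `__init__` from `_Pandas` is my own simplification, not something the question requires. The point it tries to show is that `_Numpy.array()` keeps its public name and still works on its own:

```python
import numpy as np
import pandas as pd


class _Numpy:
    def __init__(self, x):
        self.x = x

    def array(self):
        return np.array(self.x)


class _Pandas(_Numpy):
    # No __init__ here: it is inherited unchanged from _Numpy.
    def array(self):
        # super() skips _Pandas in the method lookup, so this runs
        # _Numpy.array and does not recurse.
        return pd.DataFrame(super().array())


a = [[1, 2], [3, 4]]
parent_result = _Numpy(a).array()
child_result = _Pandas(a).array()

print(type(parent_result).__name__)   # ndarray
print(type(child_result).__name__)    # DataFrame
```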
Nguyen Pham
  • Think of it like calling a normal method. Many times you may want to explicitly call your super method without doing anything in your child method. – Nguyen Pham Nov 16 '17 at 02:49