6

[Python 3]

I like ndarray but I find it annoying to use.

Here's one problem I face. I want to write class Array that will inherit much of the functionality of ndarray, but has only one way to be instantiated: as a zero-filled array of a certain size. I was hoping to write:

class Array(numpy.ndarray):
  def __init__(size):
    # What do here?

I'd like to call super().__init__ with some parameters to create a zero-filled array, but it won't work since ndarray uses a global function numpy.zeros (rather than a constructor) to create a zero-filled array.

Questions:

  1. Why does ndarray use global (module) functions instead of constructors in many cases? It is a big annoyance if I'm trying to reuse them in an object-oriented setting.

  2. What's the best way to define class Array that I need? Should I just manually populate ndarray with zeroes, or is there any way to reuse the zeros function?

Philipp
  • 48,066
  • 12
  • 84
  • 109
max
  • 49,282
  • 56
  • 208
  • 355
  • 2
    I don't see why you want to create your own class at all. What is the advantage over the `numpy.zeros()` factory function? If you don't like the name, just rename it, like `create_array = numpy.zeros`. – Sven Marnach Jan 21 '11 at 15:17
  • I am no longer sure there is an advantage. Maybe I am just not used to factory function. I'll have to think about it. – max Jan 21 '11 at 19:54
  • @Sven Marnach: I found several links on this general topic: http://stackoverflow.com/questions/628950/constructors-vs-factory-methods http://stackoverflow.com/questions/2959871/factory-vs-instance-constructors http://stackoverflow.com/questions/4617311/creation-of-objects-constructors-or-static-factory-methods Nothing specific to Python, but general comments from other languages seem to apply. And as far as I can tell, there's really no disadvantage to factory functions (other than my personal preference). – max Jan 21 '11 at 20:05

3 Answers3

8

Why does ndarray use global (module) functions instead of constructors in many cases?

  1. To be compatible/similar to Matlab, where functions like zeros or ones originally came from.
  2. Global factory functions are quick to write and easy to understand. What should the semantics of a constructor be, e.g. how would you express a simple zeros or empty or ones with one single constructor? In fact, such factory functions are quite common, also in other programming languages.

What's the best way to define class Array that I need?

import numpy

class Array(numpy.ndarray):
    def __new__(cls, size):
        result = numpy.ndarray.__new__(Array, size)
        result.fill(0)
        return result
arr = Array(5)

def test(a):
    print type(a), a

test(arr)
test(arr[2:4])
test(arr.view(int))

arr[2:4] = 5.5

test(arr)
test(arr[2:4])
test(arr.view(int))

Note that this is Python 2, but it would require only small modifications to work with Python 3.

Philipp
  • 48,066
  • 12
  • 84
  • 109
  • 1
    Some remarks: 1. The `shape` parameter to the `ndarray` constructor may simply be an integer instead of a tuple. 2. `dtype=float` is the default. 3. In Python 3, you can omit the arguments to `super()` in this case. Combining these remarks, the first two lines of the constructor reduce to `super().__init__(size)`. – Sven Marnach Jan 21 '11 at 15:20
  • Oh, and I just notice that your `super()` call is wrong -- should be `super(Array, self)` in Python 2.x. – Sven Marnach Jan 21 '11 at 16:43
  • You're right, I've edited my answer. (The constructor signature is undocumented, as usual.) – Philipp Jan 21 '11 at 18:52
  • @Philipp: It's documented in the docstring of the class, not of the constructor itself. But this is actually where the doc belongs. – Sven Marnach Jan 21 '11 at 20:09
  • @Sven: What I mean is that the doc says `shape` has to be a tuple, and doesn't say `dtype` defaults to `float`. – Philipp Jan 21 '11 at 21:39
  • @Philipp: My docstring *does* give me the default value for `dtype`. (Maybe different version?) And a shape can *always* be a tuple or an integer, but the doc indeed says "tuple of ints" in this case. – Sven Marnach Jan 21 '11 at 21:43
  • @Sven: Indeed I missed the `dtype=float` declaration. Regarding `shape`, I was puzzled because the last example uses a one-element tuple instead of an integer. – Philipp Jan 21 '11 at 21:50
  • @Philipp: it won't work, since there is no `__init__` method in `ndarray`. I know that in Python 3, it attempts to call `object.__init__` (and fails because of the wrong number of parameters). Thank you for the comments about the `zeros` though - it helps me understand where it came from. – max Jan 22 '11 at 21:11
  • @max: It *does* work, I've tested it, no need to downvote! The call to the non-existing `ndarray` constructor is wrong and I've removed it. The array is correctly initialized by the `__new__` method. – Philipp Jan 22 '11 at 21:43
6

If you don't like ndarray interface then don't inherit it. You can define your own interface and delegate the rest to ndarray and numpy.

import functools
import numpy as np


class Array(object):

    def __init__(self, size):
        self._array = np.zeros(size)

    def __getattr__(self, attr):
        try: return getattr(self._array, attr)
        except AttributeError:
            # extend interface to all functions from numpy
            f = getattr(np, attr, None)
            if hasattr(f, '__call__'):
                return functools.partial(f, self._array)
            else:
                raise AttributeError(attr)

    def allzero(self):
        return np.allclose(self._array, 0)


a = Array(10)
# ndarray doesn't have 'sometrue()' that is the same as 'any()' that it has.
assert a.sometrue() == a.any() == False
assert a.allzero()

try: a.non_existent
except AttributeError:
    pass
else:
    assert 0
jfs
  • 399,953
  • 195
  • 994
  • 1,670
  • Very neat. I assume that all `numpy` global functions have the same semantics when called with `ndarray` argument as the same-name instance method of `ndarray`. – max Jan 22 '11 at 21:53
  • To change to Python 3, I just remove `(object)` from the class definition, correct? [Not that leaving it would cause problems.] – max Jan 22 '11 at 22:17
  • @max: the code works without changes on both 2.x and Python 3. – jfs Jan 22 '11 at 23:35
  • I seem to be missing something here. You don't derive from `ndarray`, but if any attribute is requested, you first try to get the attribute from the `ndarray` instance. Wouldn't it be much easier to derive from `ndarray` in the first place? You still can add the magic to add the module level functions in numpy as methods, but I don't think this is a good idea either, because most of them **are** already methods, and the rest does not make sense as a method (with very few exceptions). – Sven Marnach Jan 22 '11 at 23:52
  • 2
    @Sven Marnach: The premise is that the OP finds `ndarray` annoying to use. Delegation is a more flexible approach to change class interface than inheritance. The code is not a final solution but it is just a mere skeleton example that allows to extend it to use black-, white- list approaches, etc. I agree about module level functions. The code shows *how* you can do it; it doesn't advocate that it is a sensible thing to do. – jfs Jan 23 '11 at 02:28
  • @J.F.Sebastian: Thanks for the clarification -- I completely agree. – Sven Marnach Jan 23 '11 at 03:38
4

Inheritance of ndarray is little bit tricky. ndarray does not even have method __init(self, )___, so it can't be called from subclass, but there are reasons for that. Please see numpy documentation of subclassing.

By the way could you be more specific of your particular needs? It's still quite easy to cook up a class (utilizing ndarray) for your own needs, but a subclass of ndarray to pass all the numpy machinery is quite different issue.

It seems that I can't comment my own post, odd
@Philipp: It will be called by Python, but not by numpy. There are three ways to instantiate ndarray, and the guidelines how to handle all cases is given on that doc.

eat
  • 7,440
  • 1
  • 19
  • 27
  • 1
    I seem to have missed this document. But why is the `__init__` method in my example called even though `ndarray` had `__new__`? – Philipp Jan 21 '11 at 21:47
  • @Philipp: see my comment to your answer. – max Jan 22 '11 at 21:05
  • I just wanted my own class that I fully control how it's instantiated. You're right subclassing ndarray is too hard. I'll follow the container approach in @J.F. Sebastian. – max Jan 22 '11 at 21:20