Some NumPy functions (e.g. argmax or cumsum) can take an array as an optional out parameter and store the result in that array. Please excuse my less than perfect grasp of the terminology here (which is what prevents me from googling for an answer), but it seems that these functions somehow act on variables that are beyond their scope.

How would I transform this simple function so that it can take an out parameter like the functions mentioned above?

import numpy as np

def add_two(a):
    return a + 2

a = np.arange(5)

a = add_two(a)

From my understanding, a rewritten version of add_two() would allow for the last line above to be replaced with

add_two(a, out=a)
Mad Physicist
Fredrik P

2 Answers

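In my opinion, the best and most explicit approach is to do what you're currently doing. Python does not pass names into a function: the parameter is a new local name bound to the same object, so rebinding it inside the function has no effect on the caller. You can only change the caller's data by mutating a mutable object in place.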
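To make the difference between rebinding a name and mutating an object concrete, here is a minimal sketch. The function names rebind and mutate are hypothetical and only serve the illustration.

```python
import numpy as np

def rebind(x):
    # Hypothetical helper: binds the local name x to a brand-new array.
    # The array the caller passed in is left untouched.
    x = x + 2
    return x

def mutate(x):
    # Hypothetical helper: writes into the existing array's buffer,
    # so the caller sees the change.
    x[:] = x + 2

a = np.arange(5)

rebind(a)                      # return value discarded on purpose
after_rebind = a.tolist()

mutate(a)
after_mutate = a.tolist()

print(after_rebind)  # [0, 1, 2, 3, 4]
print(after_mutate)  # [2, 3, 4, 5, 6]
```

Only the second call changes a, because it is the only one that modifies the object rather than a local name.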

One way would be to do:

import numpy as np

def add_two(a, out):
    # Slice assignment writes into the existing buffer of out
    # instead of rebinding the local name.
    out[:] = a + 2

a = np.arange(5)
add_two(a, out=a)
a

Output:

array([2, 3, 4, 5, 6])

NB. Unlike your current solution, this requires that the object passed as the out parameter already exists and is an array whose shape can hold the result.
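As a rough illustration of that requirement, the sketch below passes a separately pre-allocated buffer (the name buf is an assumption for this example) and then a buffer of the wrong shape. It assumes the slice-assignment version of add_two shown above.

```python
import numpy as np

def add_two(a, out):
    # Same slice-assignment version as above.
    out[:] = a + 2

a = np.arange(5)

# out does not have to be a itself; any existing array of a
# compatible shape works.
buf = np.empty_like(a)
add_two(a, out=buf)
print(a.tolist())    # [0, 1, 2, 3, 4]
print(buf.tolist())  # [2, 3, 4, 5, 6]

# A buffer of the wrong shape cannot hold the result.
try:
    add_two(a, out=np.empty(3, dtype=a.dtype))
    shape_error = False
except ValueError:
    shape_error = True
print(shape_error)   # True
```

Note that slice assignment casts the result to the dtype of out, so an integer buffer would silently truncate a floating-point result.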

mozway
  • Python only passes the references. It *never* passes by value. But, yes, you can only modify mutable objects, by the very definition of mutability. – Mad Physicist Dec 20 '21 at 21:38
  • @Mad looks like an issue of semantics. Check [this answer](https://stackoverflow.com/a/8140747/16343464), the reference is not passed, a new one is created to the object. This was the sense of my remark, you don't pass the **name**. – mozway Dec 20 '21 at 23:45
  • Fair enough. I think that clears it up. – Mad Physicist Dec 21 '21 at 03:39

The naive solution would be to fill in the buffer of the output array with the result of your computation:

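def add_two(a, out=None):
    # The result is always computed into a fresh temporary array first.
    result = a + 2
    if out is None:
        # No buffer supplied: hand back the temporary itself.
        out = result
    else:
        # Copy the temporary into the caller's buffer.
        out[:] = result
    return out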

The problem (if you could call it that) is that you are still generating the intermediate array, which effectively bypasses the benefit of pre-allocating the result in the first place. A more nuanced approach would be to use the out parameters of the functions in your NumPy pipeline:

def add_two(a, out=None):
    return np.add(a, 2, out=out)

Unfortunately, as with general vectorization, this can only be done on a case-by-case basis depending on what the desired set of operations is.
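As a sketch of what "case by case" can look like, consider a hypothetical two-step pipeline that computes (a + 2) * 3. The function name add_two_then_triple and the buffer name buf are assumptions made for this example; np.add and np.multiply are ordinary NumPy ufuncs that accept out.

```python
import numpy as np

def add_two_then_triple(a, out=None):
    # Hypothetical pipeline: (a + 2) * 3.
    # The first step allocates a new array only when out is None;
    # the second step then reuses that same array.
    out = np.add(a, 2, out=out)
    return np.multiply(out, 3, out=out)

a = np.arange(5)
buf = np.empty_like(a)

res = add_two_then_triple(a, out=buf)
print(res is buf)     # True
print(res.tolist())   # [6, 9, 12, 15, 18]

# Without out, a fresh array is returned and a is left alone.
fresh = add_two_then_triple(a)
print(fresh is a)     # False
print(a.tolist())     # [0, 1, 2, 3, 4]
```

Each step writes directly into the same buffer, so no temporary array is created when out is supplied.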

As an aside, this has nothing to do with scope. Python objects are not tied to any one namespace (though the names that refer to them are). If a mutable argument is modified in a function, the changes will always be visible outside the function. See for example "Least Astonishment" and the Mutable Default Argument.

Mad Physicist