
I would like to implement a function that works on arrays the way one expects the numpy.sum function to, e.g. np.sum([2,3], 1) = [3,4] and np.sum([1,2], [3,4]) = [4,6].
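
(For reference, the two expected results above are exactly what NumPy's broadcasting addition already produces; what I am after is the same element-wise behaviour for my own functions. A minimal illustration using plain np.add:)

import numpy as np

np.add([2, 3], 1)        # array([3, 4])
np.add([1, 2], [3, 4])   # array([4, 6])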

Yet a trivial test implementation already behaves somewhat awkwardly:

import numpy as np

def triv(a, b): return a, b

triv_vec = np.vectorize(triv, otypes = [np.int])
triv_vec([1,2],[3,4])  

with result:

array([0, 0])

rather than the desired result:

array([[1,3], [2,4]])

Any ideas what is going on here? Thanks.

donbunkito

2 Answers


You need otypes=[np.int,np.int]:

triv_vec = np.vectorize(triv, otypes=[np.int,np.int])
print triv_vec([1,2],[3,4])
(array([1, 2]), array([3, 4]))

otypes : str or list of dtypes, optional

The output data type. It must be specified as either a string of typecode characters or a list of data type specifiers. There should be one data type specifier for each output.
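
If the goal is the paired layout from the question rather than a tuple of two arrays, the two outputs can simply be stacked column-wise afterwards. A minimal sketch (not part of the original answer; it uses the built-in int for otypes, since np.int is just an older alias for it):

import numpy as np

def triv(a, b): return a, b

triv_vec = np.vectorize(triv, otypes=[int, int])
a, b = triv_vec([1, 2], [3, 4])   # (array([1, 2]), array([3, 4]))
np.column_stack((a, b))           # array([[1, 3], [2, 4]])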

Padraic Cunningham
  • Thanks first! -- OK, now the data type is correct, but the order still tells me that it is not vectorized as desired: ([1, 2], [3, 4]) != ([1, 3], [2, 4]) – donbunkito May 15 '15 at 09:56
  • Notice that the answer is a tuple of arrays, something that could be used as `a, b = triv_vec([1,2], [3,4])`. That tuple has the same length as `otypes`, and the same length as the tuple returned by `triv`. – hpaulj Feb 01 '16 at 08:00

Part of my original question was about the fact that the vectorization does an internal type-cast and runs an internally optimized loop, and how much this affects performance. So here is the answer:

It does, but with a difference of less than 23% the effect is not as considerable as I supposed.

import numpy as np

def make_tuple(a, b): return tuple([a, b])

make_tuple_vec = np.vectorize(make_tuple, otypes = [np.int, np.int])

v1 = np.random.random_integers(-5, high = 5, size = 100000)
v2 = np.random.random_integers(-5, high = 5, size = 100000)

%timeit [tuple([i,j]) for i,j in zip(v1,v2)] # ~ 596 µs per loop

%timeit make_tuple_vec(v1, v2) # ~ 544 µs per loop

Furthermore, even though the tuple-generating function doesn't vectorize as expected, it still clearly beats e.g. the map function map(make_tuple, v1, v2), which is the clear loser of the competition with an execution time roughly 100 times slower:

%timeit map(make_tuple, v1, v2) # ~ 64.4 ms per loop 
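
If all that is needed is an array of pairs, the per-element Python call can be avoided entirely by stacking the two input arrays; that stays in compiled NumPy code and is typically much faster than np.vectorize, whose documentation describes it as essentially a for loop. A sketch (using np.random.randint instead of the deprecated random_integers; timings not measured here):

import numpy as np

v1 = np.random.randint(-5, 6, size=100000)   # same value range as random_integers(-5, 5)
v2 = np.random.randint(-5, 6, size=100000)

pairs = np.column_stack((v1, v2))            # shape (100000, 2); row i holds (v1[i], v2[i])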
donbunkito