The correct way to use np.vectorize is to not use it, unless you are dealing with a function that only accepts scalar values and you don't care about speed. Whenever I've tested it, explicit Python iteration has been faster.
At least that's the case when working with numpy arrays. With DataFrames things become more complicated, since extracting Series and recreating frames can skew the timings substantially.
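For example, here is a minimal sketch with a hypothetical scalar-only function (the Python-level conditional keeps it from accepting whole arrays): np.vectorize and a plain list comprehension produce the same result, and in my tests the comprehension is at least as fast.

import numpy as np

def scale_positive(x, k):
    # scalar-only: the conditional rejects whole arrays
    return x * k if x > 0 else x

arr = np.arange(-3, 4)
via_vectorize = np.vectorize(scale_positive)(arr, 2)
via_loop = np.array([scale_positive(x, 2) for x in arr])
assert (via_vectorize == via_loop).all()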
But let's look at your example in some detail.
Your sample frame:
In [177]: test_df = pd.DataFrame({'a': np.arange(5), 'b': np.arange(5), 'c': np.arange(5)})
...:
In [178]: test_df
Out[178]:
   a  b  c
0  0  0  0
1  1  1  1
2  2  2  2
3  3  3  3
4  4  4  4
In [179]: def myfunc(a, b):
     ...:     return a+b
     ...:
Your apply:
In [180]: test_df.apply(lambda x: x+3, raw=True)
Out[180]:
   a  b  c
0  3  3  3
1  4  4  4
2  5  5  5
3  6  6  6
4  7  7  7
In [181]: timeit test_df.apply(lambda x: x+3, raw=True)
186 µs ± 524 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
# 1.23 ms ± 13.9 µs per loop without raw
I get the same thing by simply using the frame's own addition operator, and it is faster. Granted, that won't work for a more general function. Your use of apply with the default axis and raw=True implies you have a function that only works on one column at a time; a sketch of such a function follows the timings below.
In [182]: test_df+3
Out[182]:
   a  b  c
0  3  3  3
1  4  4  4
2  5  5  5
3  6  6  6
4  7  7  7
In [183]: timeit test_df+3
114 µs ± 3.02 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
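For a function that genuinely needs a whole column at a time (the demeaning function below is a hypothetical stand-in), apply with raw=True passes each column as a 1-D ndarray:

def demean(col):
    # needs the full column, not one scalar at a time
    return col - col.mean()

test_df.apply(demean, raw=True)   # each column arrives as a numpy array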
With raw you are passing numpy arrays to the lambda. The array for the whole frame is:
In [184]: test_df.to_numpy()
Out[184]:
array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2],
       [3, 3, 3],
       [4, 4, 4]])
In [185]: test_df.to_numpy()+3
Out[185]:
array([[3, 3, 3],
       [4, 4, 4],
       [5, 5, 5],
       [6, 6, 6],
       [7, 7, 7]])
In [186]: timeit test_df.to_numpy()+3
13.1 µs ± 119 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
That's much faster. But wrapping the result back into a frame takes time.
In [188]: timeit pd.DataFrame(test_df.to_numpy()+3, columns=test_df.columns)
91.1 µs ± 769 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
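A variant of the same reconstruction that also carries over the original index (a detail the timing above skips; a minimal sketch):

pd.DataFrame(test_df.to_numpy() + 3,
             columns=test_df.columns,
             index=test_df.index)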
Testing vectorize:
In [218]: f=np.vectorize(myfunc)
f applies myfunc to each element of the input array iteratively. Its documentation has a clear performance disclaimer (quoted at the end of this answer).
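Roughly speaking, that element-wise application is a Python-level loop. A minimal sketch of what it amounts to in this case (rough_vectorize is a name I'm introducing for illustration; np.vectorize also handles broadcasting and output dtypes, which this skips):

def rough_vectorize(func, arr, b):
    # call func once per element, then rebuild an array of the same shape
    flat = [func(x, b) for x in arr.ravel()]
    return np.array(flat).reshape(arr.shape)

rough_vectorize(myfunc, test_df.to_numpy(), 3)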
Even for this small array it is slow compared to direct application of the function to the array:
In [219]: timeit f(test_df.to_numpy(),3)
42.3 µs ± 123 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Passing the frame itself:
In [221]: timeit f(test_df,3)
69.8 µs ± 1.98 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [223]: timeit pd.DataFrame(f(test_df,3), columns=test_df.columns)
154 µs ± 2.11 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
And applying it iteratively to the columns is much slower:
In [226]: [f(test_df.iloc[:,i], 3) for i in range(test_df.shape[1])]
Out[226]: [array([3, 4, 5, 6, 7]), array([3, 4, 5, 6, 7]), array([3, 4, 5, 6, 7])]
In [227]: timeit [f(test_df.iloc[:,i], 3) for i in range(test_df.shape[1])]
477 µs ± 17.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
But a lot of that extra time comes from "extracting" the columns:
In [228]: timeit [f(test_df.to_numpy()[:,i], 3) for i in range(test_df.shape[1])]
127 µs ± 357 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
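Converting to the array once, outside the comprehension, avoids paying that extraction per column (a minimal sketch; exact timings will vary):

arr = test_df.to_numpy()                       # extract once
[f(arr[:, i], 3) for i in range(arr.shape[1])]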
Quoting the np.vectorize docs:
Notes
-----
The `vectorize` function is provided primarily for convenience, not for
performance. The implementation is essentially a for loop.
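If you want to reproduce these comparisons outside IPython, something along these lines works (a sketch; the absolute numbers will differ from the ones above):

import timeit
import numpy as np
import pandas as pd

test_df = pd.DataFrame({'a': np.arange(5), 'b': np.arange(5), 'c': np.arange(5)})
f = np.vectorize(lambda a, b: a + b)

candidates = {
    'apply raw': lambda: test_df.apply(lambda x: x + 3, raw=True),
    'frame + 3': lambda: test_df + 3,
    'array + 3': lambda: test_df.to_numpy() + 3,
    'vectorize': lambda: f(test_df.to_numpy(), 3),
}
for name, fn in candidates.items():
    print(name, timeit.timeit(fn, number=10_000))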