I am reading through a NumPy tutorial, and it says that sample code like this:
>>> X = np.ones(10, dtype=int)
>>> Y = np.ones(10, dtype=int)
>>> A = 2*X + 2*Y
is slow because it creates three different intermediate arrays in order to hold the values of A, 2*X, and 2*Y.
Instead, it suggests that if speed is an issue, the same calculation should be performed like this:
>>> X = np.ones(10, dtype=int)
>>> Y = np.ones(10, dtype=int)
>>> np.multiply(X, 2, out=X)
>>> np.multiply(Y, 2, out=Y)
>>> np.add(X, Y, out=X)
Yet I don't see where the speed difference would be. In the second version, X and Y still appear to be created as intermediate arrays. Is the difference rather in the speed of np.multiply compared with 2*X?
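To make the question concrete, here is a minimal sketch of how I would check which version allocates new arrays. It assumes that np.shares_memory and the buffer address from X.ctypes.data are reasonable ways to tell whether an existing array is being reused; the names expr_allocates, x_buf, and inplace_reuses are my own and do not come from the tutorial.

```python
import numpy as np

X = np.ones(10, dtype=int)
Y = np.ones(10, dtype=int)

# Expression version: 2*X, 2*Y and their sum each get a freshly allocated array,
# so the result A does not share memory with either input.
A = 2*X + 2*Y
expr_allocates = not np.shares_memory(A, X) and not np.shares_memory(A, Y)

# In-place version: out= asks the ufunc to write into an existing buffer.
x_buf = X.ctypes.data              # address of X's data before the operations
r = np.multiply(X, 2, out=X)       # the ufunc returns the array passed as out
np.multiply(Y, 2, out=Y)
np.add(X, Y, out=X)
inplace_reuses = (r is X) and (X.ctypes.data == x_buf)

print(expr_allocates, inplace_reuses)
print(np.array_equal(A, X))        # both versions should hold the same values
```

If this check is valid, the only allocations in the second version are the two np.ones calls that both versions share, which would suggest the difference is in the temporaries rather than in np.multiply itself.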