
interp1d works well for each individual dataset that I have; however, I have in excess of 5 million datasets that need to be interpolated.

I need the interpolation to be cubic and there should be one interpolation per subset.

Right now I am able to do this with a for loop; however, for 5 million sets this takes quite some time (about 15 minutes):

interpolants = []
for i in range(5000000):
    interpolants.append(interp1d(xArray[i], interpData[i], kind='cubic'))
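For reference, here is the loop as a self-contained snippet on synthetic stand-in data (array names from the question; the sizes are shrunk from the 5-million-set case so it is quick to try):

```python
import numpy as np
from scipy.interpolate import interp1d

# Synthetic stand-in data: m datasets of n points each.
rng = np.random.default_rng(0)
m, n = 1000, 10
xArray = np.sort(rng.random((m, n)), axis=1)  # each row sorted, as interp1d expects
interpData = rng.random((m, n))

interpolants = []
for i in range(m):
    interpolants.append(interp1d(xArray[i], interpData[i], kind='cubic'))
```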

What I'd like to do would maybe look something like this:

interpolants = interp1d(xArray, interpData, kind='cubic')

This, however, fails with the error:

ValueError: x and y arrays must be equal in length along interpolation axis.

Both my x array (xArray) and my y array (interpData) have identical dimensions...

I could parallelize the for loop, but that would only give me a small increase in speed; I'd greatly prefer to vectorize the operation.


1 Answer


I have also been trying to do something similar over the past few days. I finally managed to do it with np.vectorize, using function signatures. Try the code snippet below:

import numpy as np
from scipy import interpolate

fn_vectorized = np.vectorize(interpolate.interp1d,
                             signature='(n),(n)->()')
interp_fn_array = fn_vectorized(x[np.newaxis, :, :], y)

x and y are arrays of shape (m, n). The objective was to generate an array of interpolation functions, one for each pair of row i of x and row i of y. The array interp_fn_array contains the interpolation functions (its shape is (1, m)).
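To illustrate on synthetic data (the sizes and the lambda wrapper are my additions; wrapping interp1d in a lambda is one way to keep the interpolation cubic, since np.vectorize broadcasts only the signature arguments):

```python
import numpy as np
from scipy import interpolate

rng = np.random.default_rng(0)
m, n = 4, 8                                    # small demo sizes
x = np.sort(rng.random((m, n)), axis=1)
y = rng.random((m, n))

# Vectorize a cubic interp1d over matching rows of x and y.
make_cubic = lambda xi, yi: interpolate.interp1d(xi, yi, kind='cubic')
fn_vectorized = np.vectorize(make_cubic, signature='(n),(n)->()')
interp_fn_array = fn_vectorized(x, y)          # shape (m,), one function per row

# Evaluate each row's interpolant at that row's midpoint.
mids = (x[:, 0] + x[:, -1]) / 2
vals = np.array([float(f(t)) for f, t in zip(interp_fn_array, mids)])
```

Note that np.vectorize still constructs the interp1d objects one row at a time internally, so this mainly tidies the code rather than removing the per-dataset construction cost.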
