I'm trying to perform Fitted Value Iteration (FVI) in Python, which involves approximating a 5-dimensional function using piecewise linear interpolation.
scipy.interpolate.griddata works perfectly for this. However, I need to call the interpolation routine several thousand times, since FVI is a Monte Carlo (MC) based algorithm.
So basically, the set of points where the function is known is static (and large, say 32k), but the set of points I need to approximate (which are small perturbations of the original set) is very large (say 32k x 5000).
Is there an implementation of what scipy.interpolate.griddata does that has been ported to CUDA? Alternatively, is there a way to speed up the calculation somehow?
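
For reference, here is a minimal sketch of one possible CPU-side speed-up. It is not a CUDA port. Because the known points are static, the Delaunay triangulation that griddata rebuilds on every call could be computed once and reused through scipy.interpolate.LinearNDInterpolator. The sizes, the test function, and the perturbation scale below are made-up placeholders, much smaller than the real problem, and I have not checked how this scales to 32k points in 5 dimensions.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator
from scipy.spatial import Delaunay

rng = np.random.default_rng(0)

# Placeholder sizes (assumptions); the real problem is ~32k points and ~5000 rounds.
n_known, dim, n_rounds = 200, 5, 3

# Static set of points where the function is known.
points = rng.random((n_known, dim))
# Placeholder function: linear, so piecewise linear interpolation reproduces it exactly.
values = points.sum(axis=1)

# Triangulate the static points once, outside the Monte Carlo loop.
tri = Delaunay(points)
# LinearNDInterpolator accepts a precomputed Delaunay object in place of raw points.
interp = LinearNDInterpolator(tri, values)

for _ in range(n_rounds):
    # Small perturbations of the original set (the scale 0.01 is an assumption).
    queries = points + 0.01 * rng.standard_normal(points.shape)
    # Queries outside the convex hull of the known points come back as NaN.
    approx = interp(queries)
```

In FVI the function values change between iterations while the points stay fixed, so the same tri object can be passed to a new LinearNDInterpolator each time the values are updated, and the triangulation cost is paid only once.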
Thanks.