
I am calculating a spatial KDE using scipy.stats.gaussian_kde. However, its evaluation takes quite a lot of time: about 70% of my script's run time, which is 26 seconds for 10,000 rows. I'd like to make it faster. Here is my original code:

from scipy.stats import gaussian_kde

# values: (d, n) array of data points; positions: (d, m) evaluation points
kernel = gaussian_kde(values, bw_method=.05)
result = kernel(positions)

Based on Speed up sampling of kernel estimate, I've implemented multiprocessing:

import multiprocessing as mp
import numpy as np
from scipy.stats import gaussian_kde

SKERNEL = None

# sets the global kernel
# - multiprocessing requires my function to be a top-level module function
def setKernel(values):
    global SKERNEL
    SKERNEL = gaussian_kde(values, bw_method=.05)

def calc_kernel(sample):
    return SKERNEL(sample)

def genKernel(elements):
    cores = mp.cpu_count()
    torun = np.array_split(elements, cores, axis=1)

    pool = mp.Pool(processes=cores)
    r = pool.map(calc_kernel, torun)
    return np.concatenate(r)

However, on the same dataset this implementation takes 36 seconds to run. Using cProfile, I can see that most of the time is spent waiting. What am I doing wrong, and how can this be modified to run faster?


1 Answer


The cost of evaluating the kernel at each position depends on the density of the value array near that position. That is, splitting the positions into equal-sized arrays will not result in equal evaluation times for those subproblems; and that has been true for every KDE-type problem I have ever worked with.
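
One way to mitigate this, sketched below under the assumption that calc_kernel and SKERNEL are set up as in the question: shuffle the evaluation positions and split them into many more chunks than cores, so that dense regions are spread out and the pool can hand fresh work to idle workers (both ideas come up in the comments below).

import multiprocessing as mp
import numpy as np

def genKernelBalanced(elements, chunks_per_core=8):
    # hypothetical variant of genKernel; elements is a (d, n) array of
    # evaluation positions, as in the question. Shuffling spreads dense
    # regions across chunks, and many small chunks let pool.map
    # balance the load dynamically.
    cores = mp.cpu_count()
    n = elements.shape[1]

    perm = np.random.permutation(n)
    torun = np.array_split(elements[:, perm], cores * chunks_per_core, axis=1)

    pool = mp.Pool(processes=cores)
    r = pool.map(calc_kernel, torun)  # calc_kernel as defined in the question
    pool.close()
    pool.join()

    # undo the shuffle so results line up with the input positions
    result = np.concatenate(r)
    unshuffled = np.empty_like(result)
    unshuffled[perm] = result
    return unshuffled

The chunks_per_core factor is a made-up tunable: larger values give finer load balancing at the price of more inter-process traffic.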

Eelco Hoogendoorn
  • thanks. I thought the same; I guess I need to produce many more and smaller chunks, so that the distribution of work is more balanced – Philipp_Kats Jun 05 '16 at 20:46
  • that could help, but if the mode of your distribution is a fairly small cluster, you might still be out of luck; computational requirements scale quadratically with the point density. Perhaps randomizing your split could help as well, so that the mode doesn't end up in a single chunk by construction – Eelco Hoogendoorn Jun 06 '16 at 05:40
  • unfortunately, it still makes everything slower, not faster; not sure why that is – Philipp_Kats Jun 10 '16 at 23:08
  • I guess my bad results were due to the relatively large overhead of the function: on large data, the code now works ~10x faster (see the small-input fallback sketched below) – Philipp_Kats Jun 13 '16 at 16:55
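
Following up on the last comment: for small inputs, process startup and pickling overhead can outweigh any parallel speedup. A minimal sketch of a guard, assuming genKernel and SKERNEL from the question (the threshold is a made-up tunable):

def genKernelAdaptive(elements, min_parallel=5000):
    # hypothetical helper: below the (assumed) threshold, serial
    # evaluation avoids process startup and pickling costs entirely
    if elements.shape[1] < min_parallel:
        return SKERNEL(elements)   # serial evaluation
    return genKernel(elements)     # parallel path from the question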