
Recently I have been tuning some of my machine learning pipelines. To take advantage of my multicore processor, I ran cross-validation with the parameter n_jobs=-1. I also profiled the run and, to my surprise, the top function was:

{method 'acquire' of 'thread.lock' objects}

I was not sure whether this was caused by the operations I do in my Pipeline, so I set up a small experiment:

pp = Pipeline([('svc', SVC())])
cv = GridSearchCV(pp, {'svc__C': [1, 100, 200]}, n_jobs=-1, cv=2, refit=True)
%prun cv.fit(np.random.rand(10000, 100), np.random.randint(0, 5, 10000))

The output is:

2691 function calls (2655 primitive calls) in 74.005 seconds
Ordered by: internal time

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   83   43.819    0.528   43.819    0.528 {method 'acquire' of 'thread.lock' objects}
    1   30.112   30.112   30.112   30.112 {sklearn.svm.libsvm.fit}

I wonder what the cause of this behavior is, and whether it is possible to speed it up a bit.

Michal

1 Answer


The profiler only tells you what the main process is doing, while its child processes are doing all the work: with n_jobs=-1 the fits run in worker processes, and the time reported under thread.lock.acquire is just the parent waiting for their results. Setting verbose=2 on GridSearchCV may give more useful output than %prun in this case.
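
Concretely, the suggestion amounts to something like the following. This is a minimal sketch, not from the original answer; it assumes the current sklearn.model_selection import path and simply reuses the question's toy data and parameter grid:

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

X = np.random.rand(10000, 100)
y = np.random.randint(0, 5, 10000)

pp = Pipeline([('svc', SVC())])
# verbose=2 makes GridSearchCV print a progress line, including elapsed wall time,
# for each cross-validation fit -- information the parent-process profiler never sees.
cv = GridSearchCV(pp, {'svc__C': [1, 100, 200]}, n_jobs=-1, cv=2, verbose=2)
cv.fit(X, y)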

Fred Foo
  • I don't really understand this answer. Try what you suggest on a practically sized data set and the verbosity level will not show you anything about where the time is spent. In my case, the verbose output shows ~30 sec of computation, which is how long a single cross-validation score took to compute, but it shows nothing about the >4 minute gap between two cross-validation scores, where nothing is printed regardless of the verbosity level. – Kai Oct 17 '17 at 18:07
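
One way to see where the time actually goes, not suggested in the thread itself, is to profile a single-process run: with n_jobs=1 all fitting happens in the parent process, so %prun can attribute the time to sklearn.svm.libsvm.fit and friends instead of a lock wait. A minimal sketch along those lines, reusing the question's toy setup:

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

pp = Pipeline([('svc', SVC())])
# n_jobs=1 keeps all the work in this process, so the profiler can see it
cv = GridSearchCV(pp, {'svc__C': [1, 100, 200]}, n_jobs=1, cv=2, refit=True)
%prun -s cumulative cv.fit(np.random.rand(10000, 100), np.random.randint(0, 5, 10000))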