
Recently I have been tuning some of my machine learning pipelines. To take advantage of my multicore processor, I ran cross-validation with the parameter n_jobs=-1. I also profiled the run and, to my surprise, the top function was:

{method 'acquire' of 'thread.lock' objects}

I was not sure whether this was caused by the operations I do in my Pipeline, so I set up a small experiment:

pp = Pipeline([('svc', SVC())])
cv = GridSearchCV(pp, {'svc__C': [1, 100, 200]}, n_jobs=-1, cv=2, refit=True)
%prun cv.fit(np.random.rand(10000, 100), np.random.randint(0, 5, 10000))

The output is:

2691 function calls (2655 primitive calls) in 74.005 seconds
Ordered by: internal time

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   83   43.819    0.528   43.819    0.528 {method 'acquire' of 'thread.lock' objects}
    1   30.112   30.112   30.112   30.112 {sklearn.svm.libsvm.fit}

I wonder what the cause of this behavior is, and whether it is possible to speed it up a bit.

Michal

1 Answer


The profiler only tells you what the main process is doing, while its child processes are doing all the work: with n_jobs=-1 the fits run in worker processes, and the time reported under thread.lock.acquire is just the parent waiting for their results. Setting verbose=2 on GridSearchCV may give more useful output than %prun in this case.
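
Concretely, the suggestion amounts to something like the following. This is a minimal sketch, not from the original answer; it assumes the current sklearn.model_selection import path and simply reuses the question's toy data and parameter grid:

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

X = np.random.rand(10000, 100)
y = np.random.randint(0, 5, 10000)

pp = Pipeline([('svc', SVC())])
# verbose=2 makes GridSearchCV print a progress line, including elapsed wall time,
# for each cross-validation fit -- information the parent-process profiler never sees.
cv = GridSearchCV(pp, {'svc__C': [1, 100, 200]}, n_jobs=-1, cv=2, verbose=2)
cv.fit(X, y)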

Fred Foo
  • I don't really understand this answer. Try what you suggest on a practically sized data set and the verbosity level will not show you anything about where the time is spent. In my case, the verbose output shows ~30 sec of computation, which is how long a single cross-validation score took to compute, but it shows nothing about the >4 minute gap between two cross-validation scores, where nothing is printed regardless of the verbosity level. – Kai Oct 17 '17 at 18:07
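
One way to see where the time actually goes, not suggested in the thread itself, is to profile a single-process run: with n_jobs=1 all fitting happens in the parent process, so %prun can attribute the time to sklearn.svm.libsvm.fit and friends instead of a lock wait. A minimal sketch along those lines, reusing the question's toy setup:

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

pp = Pipeline([('svc', SVC())])
# n_jobs=1 keeps all the work in this process, so the profiler can see it
cv = GridSearchCV(pp, {'svc__C': [1, 100, 200]}, n_jobs=1, cv=2, refit=True)
%prun -s cumulative cv.fit(np.random.rand(10000, 100), np.random.randint(0, 5, 10000))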