While trying to get multiprocessing to work (and to understand it) in Python 3.3, I quickly turned to joblib to make my life easier. But I am seeing something that looks very strange to me. When I run this code (just to test whether it works):

Parallel(n_jobs=1)(delayed(sqrt)(i**2) for i in range(200000))

It takes about 9 seconds, but increasing n_jobs actually makes it slower: with n_jobs=2 it takes 25 seconds and with n_jobs=4 it takes 27 seconds.
Correct me if I'm wrong, but shouldn't it get much faster as n_jobs increases? I have an Intel i7-3770K, so I don't think my CPU is the problem.

Perhaps describing my original problem will make an answer or solution more likely.
I have a list of 30k+ strings, data, and I need to do something with each string, independently of the other strings. This takes about 14 seconds. This is only a test case to see if my code works; in real applications it will probably be 100k+ entries, so multiprocessing is needed, since this is only a small part of the entire calculation. This is what needs to be done in this part of the calculation:

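from nltk.corpus import wordnet

data_syno = []
for entry in data:
    w = wordnet.synsets(entry)
    # Use the first lemma name of the first synset, else keep the original string
    if len(w) > 0:
        data_syno.append(w[0].lemma_names[0])
    else:
        data_syno.append(entry)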
  • If you are using Ubuntu, check this thread: http://stackoverflow.com/questions/15639779/python-what-determines-whether-different-processes-are-assigned-to-the-same-or – oteCortes Mar 26 '13 at 15:23
  • No, I'm on Windows 7 64 bit – Tim Mar 26 '13 at 15:35

1 Answer

The n_jobs parameter is counterintuitive: the maximum number of cores is used at -1, while at 1 only one core is used. At -2 it uses max-1 cores, at -3 it uses max-2 cores, and so on. That's how I read it.

From the docs:

n_jobs: int :

The number of jobs to use for the computation. If -1 all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one are used.
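
As a rough sketch of the rule quoted above, the arithmetic can be written out in a few lines. The helper name resolve_n_jobs and its n_cpus argument are made up for illustration and are not part of joblib; the clamp to a minimum of one worker is also my own assumption, not something the quoted docs state.

```python
import os


def resolve_n_jobs(n_jobs, n_cpus=None):
    """Hypothetical helper (not a joblib function): worker count implied by n_jobs."""
    if n_cpus is None:
        # os.cpu_count() can return None, so fall back to 1
        n_cpus = os.cpu_count() or 1
    if n_jobs == 0:
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        # Rule from the quoted docs: n_cpus + 1 + n_jobs
        # Assumption: never go below one worker
        return max(n_cpus + 1 + n_jobs, 1)
    # Positive values are taken as the number of jobs directly
    return n_jobs


if __name__ == "__main__":
    # Assuming a machine that reports 8 CPUs
    for n in (1, 2, -1, -2, -3):
        print(n, "->", resolve_n_jobs(n, n_cpus=8))
```

So on a machine reporting 8 CPUs, n_jobs=-1 maps to all 8, n_jobs=-2 to 7, and n_jobs=1 stays at a single job with no parallel code involved.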