I run parallel dataframe processing on Windows 10 in the following fashion:

    from joblib import Parallel, delayed

    Parallel(n_jobs=28)(
        delayed(function)(group) for name, group in grouped_data)

SMT is disabled in Ryzen Master, and every monitoring tool shows only 32 physical cores. While the script is running, only 16 cores are loaded at 80-90% and the other 16 sit idle. If I enable SMT (32 physical cores and 64 logical) and launch it with

    n_jobs = 60

then I see the same picture: only the first 16 physical cores (the first 32 logical cores) are loaded, and the others are idle.

Because of some business limitations I can't install Linux and check it there. What is the problem?

Ivan Sudos
  • What happens if you set `n_jobs=-1`? What if you use the `backend` parameter with the value `multiprocessing`? – Grzegorz Bokota Dec 04 '19 at 22:49
  • With SMT enabled it causes `ValueError: need at most 63 handles, got a sequence of length 65`. With SMT disabled it doesn't change the situation. The backend is `loky`. The `multiprocessing` backend causes a system reboot soon after the calculations start. – Ivan Sudos Dec 05 '19 at 00:16
  • The computation time for your target function might be relatively short compared to the parallelization overhead. AFAIK both of `joblib`'s multiprocessing backends feed their workers over a single queue (like `multiprocessing.Pool`), which can become the bottleneck if you also transfer relatively large amounts of data. Basically the scenario I describe for `multiprocessing.Pool` here in [Chapter 8 Reality Check. 3rd RUN](https://stackoverflow.com/a/54813527/9059420). It's just one possible scenario; we don't know what you actually do in `function`, so it could also be due to long I/O operations. – Darkonaut Dec 05 '19 at 00:48
  • The mean job computation time is about 15 seconds, but it looks suspicious that the load stops at exactly half of the cores (0-15) and never uses cores 16-31. – Ivan Sudos Dec 05 '19 at 08:34
  • Try making the dataframe half the size for comparison. Still the same load? – Darkonaut Dec 05 '19 at 13:18
  • Yeah, same problem. I see 100% usage of all 32 cores when doing gradient boosting, for example, or rendering, or multithreading in C, but this doesn't work for Python multiprocessing. – Ivan Sudos Dec 05 '19 at 14:42
  • Do you actually provide more than 16 groups in the first place? – Darkonaut Dec 05 '19 at 15:43
  • The code is actually the following: `grouped_data = Parallel(n_jobs=28)(delayed(function)(group) for name, group in grouped_data)` – Ivan Sudos Dec 05 '19 at 17:25
  • I'm asking for the number of `group`s in `grouped_data`, the input. What does `len(grouped_data)` give you before you try to parallelize? – Darkonaut Dec 05 '19 at 17:40
  • the length is 1121 – Ivan Sudos Dec 05 '19 at 17:55
  • Okay, so you would have enough work in there. You can set `verbose=40` within `Parallel` for some logging. How many "concurrent workers" does it report? – Darkonaut Dec 05 '19 at 18:08
  • It reports 16. Interestingly, if I launch this script twice simultaneously, I see all cores in use and 100% CPU load. – Ivan Sudos Dec 09 '19 at 17:52
  • Yeah, running two copies of this script leads to 100% CPU usage with all cores loaded, both with SMT off and on. I think it is a Windows scheduler problem; however, I still need to fix it. – Ivan Sudos Dec 09 '19 at 18:05
  • Иван Судос, from a different perspective (administrative), are all cores licensed? I once spent a long time looking at a threaded/distributed Java app I developed, only to discover that the VM server it was running on was not licensed on all cores. Granted, this was a virtual machine running on Windows 2012. Are you running on a VM? Is it possible for you to move the code over to another box to see if it behaves differently, or the same? – plditallo Dec 11 '19 at 22:07
  • I think they are all legit. I'm working on a personal workstation with a 32-core AMD Threadripper on board, so it's not a VM. I would like to reproduce it on a VM, but the highest number of cores available on our VMs is 16. I'm also almost sure that on Linux I would have all the cores working, but unfortunately I need to parallelize this code on Windows. – Ivan Sudos Dec 11 '19 at 22:48
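
A side note on the `ValueError: need at most 63 handles` mentioned in the comments: with 64 logical CPUs, `n_jobs=-1` asks for 64 workers, which is more than a single wait call on Windows can handle. Below is a minimal sketch of capping the worker count before passing it to `Parallel`. The helper name `cap_n_jobs` is hypothetical, and the value 61 is an assumption borrowed from the documented Windows limit on `max_workers` for `concurrent.futures.ProcessPoolExecutor`; whether a given joblib version applies the same cap itself is not confirmed here. It only avoids the error and does not explain the idle cores.

```python
import os
import sys

# Assumption: 61 mirrors the documented Windows limit on max_workers
# for concurrent.futures.ProcessPoolExecutor.
WINDOWS_MAX_WORKERS = 61


def cap_n_jobs(requested, logical_cpus=None, platform=None):
    """Return a worker count that stays under the Windows handle limit.

    Hypothetical helper. Only two forms of `requested` are handled:
    a positive integer, or -1 meaning "all logical CPUs".
    """
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1
    if platform is None:
        platform = sys.platform

    n = logical_cpus if requested == -1 else requested
    if n < 1:
        raise ValueError("requested must be -1 or a positive integer")

    # Only Windows needs the cap; other platforms keep the full count.
    if platform == "win32":
        n = min(n, WINDOWS_MAX_WORKERS)
    return n
```

It could then be used as `Parallel(n_jobs=cap_n_jobs(-1))(...)`. With SMT enabled on this machine that would request 61 workers instead of 64, which is close to the `n_jobs = 60` run from the question.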
