This is not a bug report, but I'd like to understand the behaviour:
- running scikit-learn from the Anaconda distribution on a Windows 7 machine with 4 cores and 8 GB of RAM
- fitting a KMeans model on a 200,000-sample × 200-feature table
- running with n_jobs = -1 (after adding the
if __name__ == '__main__':
guard to my script): the script starts 4 processes with 10 threads each. Each process uses about 25% of the CPU (total: 100%). Seems to work as expected.
- running with n_jobs = 1: stays in a single process (no surprise), with 20 threads, and also uses 100% of the CPU.
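For reference, the setup above can be sketched roughly as follows. The data here is random stand-in data, deliberately smaller than the actual 200,000 × 200 table so it runs quickly; also note that the `n_jobs` parameter of `KMeans` was deprecated in scikit-learn 0.23 and later removed, so the sketch passes it only when the installed version still accepts it.

```python
import inspect

import numpy as np
from sklearn.cluster import KMeans


def fit_kmeans(X, n_jobs=None):
    """Fit KMeans, forwarding n_jobs only if this scikit-learn
    version still accepts it (it was removed in newer releases)."""
    kwargs = {}
    if n_jobs is not None and 'n_jobs' in inspect.signature(KMeans).parameters:
        kwargs['n_jobs'] = n_jobs
    return KMeans(n_clusters=8, n_init=10, random_state=0, **kwargs).fit(X)


if __name__ == '__main__':
    # Random stand-in for the 200,000 x 200 table (smaller for a quick run)
    X = np.random.RandomState(0).rand(20_000, 20)

    km_parallel = fit_kmeans(X, n_jobs=-1)  # one worker process per core
    km_single = fit_kmeans(X, n_jobs=1)     # single process, yet all cores busy
    print(km_parallel.cluster_centers_.shape)
```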
My question: what is the point of using n_jobs (and joblib) if the library uses all cores anyway? Am I missing something? Is it a Windows-specific behaviour?