
I need an algorithm that executes a file 'test.py' on different computers at the same time (with different parameters), and on each computer, vectorized methods, e.g. from the numpy package, should be able to use multiple cores.

A minimal (not working) example consists of the following two files.

file A: test.py

import numpy
import os     #to verify whether process is allowed to use all cores
os.system("taskset -p 0xff %d" % os.getpid())   
a = numpy.random.random((10000,10000))   #some random matrix
b = numpy.dot(a,a)                       #some parallelized calculation

and file B: control.py

import subprocess
import os
callstring = "python " + str(os.getcwd()) + "/test.py" # console command to start test.py
sshProcess = subprocess.Popen(str("ssh <pc-name>"),
                              stdin=subprocess.PIPE,
                              shell=True)
sshProcess.stdin.write(str(callstring))    # start file.py
sshProcess.stdin.close()
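
Eventually, I want to extend this minimal example roughly as follows; this is only a sketch, and the host names and parameters are placeholders:

import subprocess
import os

hosts = ["pc1", "pc2", "pc3"]          # placeholder host names
parameters = ["0.1", "0.2", "0.3"]     # placeholder parameters for test.py

processes = []
for host, param in zip(hosts, parameters):
    callstring = "python " + os.getcwd() + "/test.py " + param
    p = subprocess.Popen("ssh " + host, stdin=subprocess.PIPE, shell=True)
    p.stdin.write(callstring + "\n")   # start test.py remotely with its parameter
    p.stdin.close()
    processes.append(p)

for p in processes:                    # wait until all remote runs have finished
    p.wait()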

Now, when I run control.py, test.py is executed, but only on a single core.

If I run test.py directly from the console with python test.py (which I don't want to do), multiple cores are used.

Unfortunately, I am new to the subprocess module, and I am no expert on Linux systems either. However, I have found out the following so far:

  • Using subprocess.call("python test.py", shell=True) would work, i.e. multiple cores are used then. However, I need the ssh to address other computers.
  • Using the console manually, i.e. going via ssh to a different computer and running python test.py also gives the desired result: multiple cores are used. However, I need to automate this step, and hence, I would like to create multiple of these 'ssh-consoles' with python code.
  • Core affinity does not seem to be the problem (a typical numpy pitfall), since os.system("taskset -p 0xff %d" % os.getpid()) reports 'current affinity ff, new affinity ff', i.e. all 8 cores may be used.

Therefore, it seems to be an issue of Popen in combination with ssh!?

Do you have any ideas or advice? Thanks for your help!

EDIT / ADDENDUM: I found out that parallelized methods of the 'multiprocessing' package DO run on multiple cores. So it seems to be a numpy problem again. I apologize for not having tested this before!
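
For reference, this is roughly the kind of check I mean: a CPU-bound dummy task distributed with multiprocessing.Pool, whose workers do show up on separate cores in htop even when started via ssh/Popen.

import multiprocessing

def work(x):
    # CPU-bound dummy task, just to keep one core busy for a while
    s = 0
    for i in xrange(10 ** 7):
        s += (i % 7) * x
    return s

if __name__ == "__main__":
    pool = multiprocessing.Pool()      # one worker per available core
    print(pool.map(work, range(8)))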

I am using Python 2.7.12 and Numpy 1.11.1

newbie
  • How do you know that `numpy.dot(a,a)` does "some parallelized calculation"? – Flurin Apr 11 '17 at 10:07
  • Checkout [ParallelProgramming](https://scipy.github.io/old-wiki/pages/ParallelProgramming) ("Use parallel primitives") and more specifically check whether numpy on the machine you're connecting to was compiled using BLAS. – Flurin Apr 11 '17 at 10:10
  • @Flurin: from the numpy documentation, and observing the cores in htop – newbie Apr 11 '17 at 10:53
  • @Flurin: Thx for the answer. Yes, all the machines have BLAS. Even if I ssh to my own computer, numpy stops using multiple cores!!! – newbie Apr 11 '17 at 10:55

1 Answer


I was able to solve my problem:

For some reason I still don't understand (as I said, my Linux knowledge is a catastrophe), a different python is called within the above Popen environment.

In the console, `which python` gives '../anaconda2/bin/python', but after `ssh <the very same computer>`, `which python` gives '/usr/bin/python'.
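
This difference can also be checked directly from Python; a quick sketch (the host name is a placeholder):

import subprocess
# interpreter found by a local shell vs. by a non-interactive ssh shell
print(subprocess.check_output("which python", shell=True))
print(subprocess.check_output("ssh <pc-name> which python", shell=True))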

Anaconda seems to use the MKL libraries for parallel computations, which are not available to the python in /usr/bin.

So the origin of the problem was using ssh without knowing exactly what it does. The same issue has come up before: Why does an SSH remote command get fewer environment variables then when run manually?
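
One simple workaround, therefore, is to call the anaconda interpreter by its absolute path in the ssh command instead of relying on the remote PATH. A sketch, where the anaconda path is just how my installation looks and has to be adapted:

import subprocess
import os

remote_python = "/home/<user>/anaconda2/bin/python"   # adapt to the remote machine
callstring = remote_python + " " + os.getcwd() + "/test.py"

sshProcess = subprocess.Popen("ssh <pc-name>",
                              stdin=subprocess.PIPE,
                              shell=True)
sshProcess.stdin.write(callstring + "\n")   # the MKL-enabled numpy should now be picked up remotely
sshProcess.stdin.close()
sshProcess.wait()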

Thanks again for your interest.

newbie
  • Thanks so much for sharing the solution! You can mark it as the accepted answer so that the question is marked as solved. – Kos Apr 11 '17 at 13:13