
I have loaded an AI model into a class that has a function for making predictions, but if I run it with multiple cores it takes longer than running it linearly, because transferring the model to every single core takes a long time. So, is there a way to avoid transferring my class while still letting every core access it?

Right now I'm using Python's multiprocessing library to split up the processes like this:

import multiprocessing as mp

with mp.Pool() as pool:
    output = pool.map(
        parallel_process,
        [(
            link,
            question,
            48,
            crawler.webcrawler,
            self.translator,
            self.question_answering
        ) for link in links]
    )

def parallel_process(inputs):
    # Getting the variables
    link = inputs[0]
    question = inputs[1]
    n_sentences = inputs[2]

    # Modules
    webcrawler = inputs[3]
    translator = inputs[4]
    question_answering = inputs[5]

    # The rest of the processes
    # This is omitted, since it's the transfer that takes a long time.

If you can either tell me how to do this in general or give a code example in Python, that would be awesome.
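One general way to avoid transferring the model with every task is to load it once per worker process via the pool's `initializer`. The objects then live in each worker and are never pickled along with the tasks. A minimal sketch, where `load_model()` is a hypothetical stand-in for loading the real `webcrawler`/`translator`/`question_answering` objects:

```python
import multiprocessing as mp

def load_model():
    # Hypothetical stand-in for loading the heavy model; in the real code
    # this would build the webcrawler, translator, and QA objects.
    return lambda text: text.upper()

_model = None

def init_worker():
    # Runs once per worker process: the heavy objects are created here,
    # so they are never pickled and shipped with each task.
    global _model
    _model = load_model()

def parallel_process(inputs):
    link, question, n_sentences = inputs
    # _model is already resident in this worker process.
    return _model(question)[:n_sentences]

if __name__ == "__main__":
    links = ["a", "b", "c"]
    with mp.Pool(initializer=init_worker) as pool:
        output = pool.map(parallel_process, [(l, "hi", 2) for l in links])
    print(output)  # ['HI', 'HI', 'HI']
```

Each task now carries only the small per-link data; the one-time loading cost is paid when the pool starts, not on every `map` call.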

Marius Johan
  • Is this question about GPUs? – Stephen C May 31 '20 at 14:15
  • Nope, I'm using CPUs to do this – Marius Johan May 31 '20 at 14:16
  • In that case, your question makes no sense. It takes almost no time to transfer code or data to the cores in a conventional system with multiple cores running a multi-threaded application. They all share the same memory. When one core writes data or code into main memory, they all see it. – Stephen C May 31 '20 at 14:19
  • I'm quite sure it's the transfer, because I tried tracking the time spent: each call took ~5 sec from when the function started running to right before it returned its result, but the entire for-loop took ~30 sec. – Marius Johan May 31 '20 at 14:29
  • OK. So you are not using multi-threading. You are doing process-based parallelism. The bottleneck you are encountering is not "transferring the class". The bottleneck is actually the cost of starting new python processes and/or transferring data between the master and the worker processes. If you need to eliminate that, you need to switch to using (real) multi-threading. – Stephen C May 31 '20 at 14:29
  • You should be able to find Python tutorials and examples on how to do (real) multi-threading. – Stephen C May 31 '20 at 14:33
  • Spin up your processes once; design the worker to continually run and accept (wait for) data from a queue. You will only incur the cost of spinning up the processes once and subsequent *predictions* will only incur the cost of transferring the data. If the bottleneck for the *linear* approach you mentioned is I/O consider using threading or asyncio instead of multiprocessing. – wwii May 31 '20 at 14:57
  • Stephen C, can I use (real) multithreading even though I'm running a heavy computational process? – Marius Johan May 31 '20 at 16:59
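The persistent-worker pattern wwii describes in the comments can be sketched like this (`load_model()` is a hypothetical stand-in for the heavy model; each worker loads it once at startup and then keeps serving tasks from a queue):

```python
import multiprocessing as mp

def load_model():
    # Hypothetical stand-in for the expensive model load.
    return lambda question: question[::-1]

def worker(task_queue, result_queue):
    model = load_model()  # paid once, when the worker starts
    # Keep pulling tasks until the shutdown signal (None) arrives.
    for question in iter(task_queue.get, None):
        result_queue.put(model(question))

if __name__ == "__main__":
    tasks, results = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=worker, args=(tasks, results)) for _ in range(2)]
    for p in procs:
        p.start()

    for q in ["abc", "def"]:
        tasks.put(q)
    answers = [results.get() for _ in range(2)]

    for _ in procs:
        tasks.put(None)  # tell each worker to exit
    for p in procs:
        p.join()
    print(sorted(answers))  # ['cba', 'fed']
```

Only the per-question data crosses the process boundary after startup, so repeated predictions no longer pay the model-transfer cost. (Results may arrive out of order; attach an index to each task if ordering matters.)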

0 Answers