
I am trying to make two functions run in parallel. The first function makes an API call and returns the JSON output, and the second function queries a database and returns the captured data.

I have the following code -

import multiprocessing
ret = {'db': None, 'api':None}

def db_call(queue=None):
    engine = db.create_engine('mysql+pymysql://{}:{}@{}/{}'.format(user, password, host, database))
    dbConnection = engine.connect()
    df_aws_accounts = pd.read_sql(query, dbConnection)
    if queue:
        queue['db'] = df_aws_accounts
    return df_aws_accounts

def api_call(queue=None):
    data = requests.get(url, verify=False)
    df = pd.DataFrame(data)
    if queue:
        queue['api'] = df
    return df

def runInParallel(*fns):
    queue = multiprocessing.Queue()
    queue.put(ret)
    proc = []
    for fn in fns:
        p = Process(target=fn,args=((queue),))
        p.start()
        proc.append(p)
    print(queue.get())
    for p in proc:
        p.join()

l = [api_call, db_call]

runInParallel(l)

When I run the above code, I get the following error:

Process Process-2:
Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
TypeError: 'list' object is not callable

How do I get the output from runInParallel and assign it to variables?

Edit: Note that these functions work individually, but not when I run them via the runInParallel function.

Edit 2 - Updated the code based on suggestions.

  • That's the whole code, right? Besides the imports – andreis11 Apr 09 '20 at 20:33
  • @andreis11 This is the code apart from the imports and the intermediate steps for api_call and call_database. – Steve_Greenwood Apr 09 '20 at 20:50
  • To do this, you need to use some process communication tool, such as a `multiprocessing.Queue` or similar. If possible, it is a whole lot simpler if you can create a `multiprocessing.Pool` to use to execute your function calls, as it will automatically provide a lot of the process communication "magic" for you. – JohanL Apr 09 '20 at 21:15
  • @JohanL how do I make use of the multiprocessing.Queue with my function? – Steve_Greenwood Apr 09 '20 at 21:58
  • You need to create a queue and pass it to each process as an argument to your functions, as in e.g. this answer: https://stackoverflow.com/a/37736655/7738328 – JohanL Apr 10 '20 at 04:36
  • @JohanL - I tried it but that didn't work either. I have edited the question to reflect the code. – Steve_Greenwood Apr 10 '20 at 13:21
  • The `queue` is not a `dict` and cannot be indexed as you try. You need to do: `queue.put(...)` in your functions instead. – JohanL Apr 10 '20 at 15:21
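Putting JohanL's suggestions together, a minimal sketch of the `queue.put()` pattern might look like the following. The worker bodies are placeholders for the real API and database calls, and each result is tagged with a name so the parent can tell them apart:

```python
# Sketch of the pattern suggested in the comments: each worker calls
# queue.put() with a (name, result) tuple, and the parent collects one
# result per process. The worker bodies stand in for the real calls.
from multiprocessing import Process, Queue

def db_call(queue):
    df_aws_accounts = "db-result"  # stand-in for pd.read_sql(...)
    queue.put(('db', df_aws_accounts))

def api_call(queue):
    df = "api-result"  # stand-in for requests.get(...) + pd.DataFrame(...)
    queue.put(('api', df))

def runInParallel(*fns):
    queue = Queue()
    procs = [Process(target=fn, args=(queue,)) for fn in fns]
    for p in procs:
        p.start()
    # Drain the queue before joining, so a large result cannot block
    # a child process from exiting.
    results = dict(queue.get() for _ in procs)
    for p in procs:
        p.join()
    return results

if __name__ == '__main__':
    out = runInParallel(api_call, db_call)  # note: unpacked, not a list
    db_df, api_df = out['db'], out['api']
```

Note that the functions are passed unpacked (`runInParallel(api_call, db_call)`, or `runInParallel(*l)`), which also avoids the `'list' object is not callable` error in the traceback above.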

1 Answer


The main problem is that the function runInParallel does not return anything.

Also, if you want a function that uses multiprocessing to return something, you can create a global variable, such as output, and have the function assign a value to it:

def api_call():
  global output
  output = api_data
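
As the comments below point out, a plain global set in a child process is not visible to the parent. A simpler route, mentioned in the question's comments, is multiprocessing.Pool, which ships each function's return value back to the parent for you. A minimal sketch (the function bodies are placeholders for the real calls):

```python
# Sketch of the Pool-based alternative: apply_async returns an
# AsyncResult whose .get() yields the function's return value in the
# parent process, so no shared global is needed.
from multiprocessing import Pool

def db_call():
    return "db-result"   # would be the DataFrame from pd.read_sql

def api_call():
    return "api-result"  # would be the DataFrame built from the API JSON

def runInParallel(*fns):
    with Pool(processes=len(fns)) as pool:
        async_results = [pool.apply_async(fn) for fn in fns]
        return [r.get() for r in async_results]

if __name__ == '__main__':
    api_df, db_df = runInParallel(api_call, db_call)
```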
  • If I want to return the values from runInParallel, is that possible? If so, what should I return and then assign? – Steve_Greenwood Apr 09 '20 at 20:49
  • @Steve_Greenwood - values from the funcs, or values from the processes (runInParallel)? – andreis11 Apr 09 '20 at 20:56
  • Using a `global` to get the value from a different process does **not** work. The different processes will refer to different globals, at least if the processes are "spawned", as they are in Windows. – JohanL Apr 09 '20 at 21:12