I want to run two functions, each of which runs a query against a different database. Both functions take the same arguments, and it seems I could speed this up by using multiprocessing.
However, I am not quite sure which multiprocessing class I should use.
I initially looked at the Pool class, and it seems I could use map_async(). However, I also saw in this answer (https://stackoverflow.com/a/44155077/7468886) that map_async() in a similar use case does not actually let the functions run in parallel. Does anybody know why?
If this is the case, are there any other solutions that can be used?
Below are the two functions. After the results are returned, I consolidate them.
sql_manager = SQLManager()
sql_results = sql_manager.search_data(
    title=title,
    type=type,
    release_year=release_year)
mongo_manager = MongoManager()
mongo_results = mongo_manager.search_data(
    title=title,
    type=type,
    release_year=release_year)
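To make the goal concrete, here is a minimal sketch of the shape I am after, using concurrent.futures from the standard library. The functions search_sql and search_mongo are hypothetical stand-ins for the two search_data() methods, and search_both and executor_cls are names made up for this illustration. The executor class is a parameter, so the same code can be tried with threads or processes.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for SQLManager.search_data and MongoManager.search_data.
def search_sql(title, type, release_year):
    return [{"source": "sql", "title": title}]


def search_mongo(title, type, release_year):
    return [{"source": "mongo", "title": title}]


def search_both(executor_cls=ThreadPoolExecutor, **kwargs):
    # Submit both searches so they can overlap, then wait for each result.
    with executor_cls(max_workers=2) as executor:
        sql_future = executor.submit(search_sql, **kwargs)
        mongo_future = executor.submit(search_mongo, **kwargs)
        # Consolidate once both have returned.
        return sql_future.result() + mongo_future.result()


if __name__ == "__main__":
    print(search_both(title="Alien", type="movie", release_year=1979))
```

Passing ProcessPoolExecutor as executor_cls should be the only change needed to try processes instead, although the functions and their arguments then have to be picklable.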
Edit:
The search_data() functions do not just run a simple query; they actually do the following:
- Constructs a query string based on args
- Retrieves a class member database connection
- Initialises a cursor
- Executes the query
- Iterates through each row and creates a dict from each row
- Appends this dict into a list
- Returns the list
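For concreteness, the steps above look roughly like the sketch below. It is not the real code: SQLManagerSketch is a hypothetical stand-in that uses an in-memory SQLite database, and the titles table and its columns are assumptions made only so the example runs on its own.

```python
import sqlite3


class SQLManagerSketch:
    """Hypothetical stand-in for SQLManager, backed by in-memory SQLite."""

    def __init__(self):
        # Class-member database connection, with one assumed sample row.
        self.connection = sqlite3.connect(":memory:")
        self.connection.execute(
            "CREATE TABLE titles (title TEXT, type TEXT, release_year INTEGER)")
        self.connection.execute(
            "INSERT INTO titles VALUES ('Alien', 'movie', 1979)")

    def search_data(self, title, type, release_year):
        # Construct the query string based on the args (parameterised).
        query = ("SELECT title, type, release_year FROM titles "
                 "WHERE title = ? AND type = ? AND release_year = ?")
        # Retrieve the connection and initialise a cursor.
        cursor = self.connection.cursor()
        # Execute the query.
        cursor.execute(query, (title, type, release_year))
        columns = [description[0] for description in cursor.description]
        results = []
        # Create a dict from each row and append it to the list.
        for row in cursor:
            results.append(dict(zip(columns, row)))
        cursor.close()
        # Return the list.
        return results
```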
Since the function does all of this, would it still lend itself to multiprocessing rather than multithreading?