Is it possible to use Pool.map() on a function that takes an empty dictionary as one of its arguments? I am new to multiprocessing and want to parallelise a web-scraping function. I tried following the example from this site; however, it doesn't include a dictionary as one of the arguments. The multiprocessing version runs (it prints out the search result), but it does not add anything to the dictionary: after the pool finishes, the dictionary is still empty. It looks like I have to use Manager(), but I don't know how to implement it. Thanks for the help.
from functools import partial
from multiprocessing import Pool
from urllib.request import urlopen as ureq  # assumed: ureq is used below but was never imported
from bs4 import BeautifulSoup as soup

count = 1
outerDict = dict()
emptyList = []
lstOfItems = ['Valsartan', 'Estrace', 'Norvasc', 'Combivent',
              'Fluvirin', 'Kariva', 'Natrl', 'Foxamax', 'Vilanterol', 'Catapres']

def process_search(item, soupPage_, outerDict, count, emptyList):
    '''a function that scrapes a site; the outerDict and emptyList will
    become populated as it scrapes the site for each item'''

def callSrch(item, outerDict, emptyList, count):
    searchlink = 'http://www.asite.com'
    uClient = ureq(searchlink + item)
    pagehtml = uClient.read()
    soupPage_ = soup(pagehtml, 'html.parser')
    process_search(item, soupPage_, outerDict, count, emptyList)

if __name__ == '__main__':  # guard needed where worker processes are spawned, e.g. on Windows
    with Pool() as p:
        prfx = partial(callSrch, outerDict=outerDict, emptyList=emptyList, count=count)
        p.map(prfx, lstOfItems)
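
For reference, here is a minimal sketch of what the Manager() approach could look like. Each worker process gets its own copy of a plain dict or list, which is probably why outerDict stays empty in the parent; Manager().dict() and Manager().list() return proxy objects that forward updates to one shared server process. The names fake_fetch, call_search and scrape_all are hypothetical, and fake_fetch only stands in for the real download-and-parse step so the example runs without network access.

```python
from functools import partial
from multiprocessing import Manager, Pool


def fake_fetch(item):
    """Hypothetical stand-in for the real download-and-parse step."""
    return f"result for {item}"


def call_search(item, shared_dict, shared_list):
    """Worker: store one item's result in the shared containers."""
    result = fake_fetch(item)
    # Assign whole values by key; in-place changes to a nested object
    # stored inside a proxy are not sent back to the manager.
    shared_dict[item] = result
    shared_list.append(item)


def scrape_all(items, processes=2):
    """Run call_search over items in a pool and return plain copies."""
    with Manager() as manager:
        shared_dict = manager.dict()
        shared_list = manager.list()
        worker = partial(call_search,
                         shared_dict=shared_dict,
                         shared_list=shared_list)
        with Pool(processes) as pool:
            pool.map(worker, items)
        # Copy the results out before the manager shuts down.
        return dict(shared_dict), list(shared_list)


if __name__ == "__main__":
    found, seen = scrape_all(["Valsartan", "Estrace", "Norvasc"])
    print(found)
    print(sorted(seen))
```

The order of entries in the shared list depends on which worker finishes first, so it is sorted before printing. A plain integer such as count is also copied into each worker, so incrementing it there will not change the parent's value. If the workers only need to produce one result per item, returning that result from the worker and building the dictionary from the list that pool.map returns may be simpler than sharing state.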