
I'm trying to speed up API requests using multithreading. I don't understand why, but I often get the same API response for different calls (they should not have the same response). In the end I get a lot of duplicate rows in my output file and a lot of rows are missing.

Example: `requests.post("id=5555")` comes back with the response for `requests.post("id=444")` instead of the one for `requests.post("id=5555")`.

It looks like the workers pick up the wrong responses. Has anybody faced this issue?

```python
import concurrent.futures
import json
import time

import pandas as pd
import requests


def request_data(id, useragent):
    # -- add id to the request data and useragent to the headers --
    time.sleep(0.2)
    resp = requests.post(
        URL,  # actual endpoint omitted
        params=params,
        headers=headerstemp,
        cookies=cookies,
        data=datatemp,
    )
    return resp


df = pd.DataFrame(columns=["ID", "prenom", "nom", "adresse", "tel", "mail", "prem_dispo",
                           "capac_acc", "tarif_haut", "tarif_bas", "presentation", "agenda"])

ids = pd.read_csv('ids.csv')
ids.drop_duplicates(inplace=True)
ids = list(ids['0'].to_numpy())

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    future_to_url = {executor.submit(request_data, id, usera): id for id in ids}
    for future in concurrent.futures.as_completed(future_to_url):
        ok = False
        # retry until the request succeeds
        while not ok:
            try:
                resp = future.result()
                ok = True
            except Exception as e:
                print(e)
        df.loc[len(df)] = parse(json.loads(resp.text))
```

I also tried asyncio, following the first answer from "Multiple async requests simultaneously", but it returned the request and not the API response...
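
For reference, the pattern from that answer, as I understood it, adapted to my ids looks roughly like this (simplified sketch; `URL`, the payload and the session setup are placeholders, not my real values):

```python
# Simplified sketch of the asyncio attempt -- URL and payload are placeholders.
import asyncio
import aiohttp

URL = "https://example.com/api"  # placeholder endpoint


async def fetch_one(session, id):
    # Each task posts its own id and returns the parsed JSON body.
    async with session.post(URL, data={"id": id}) as resp:
        return await resp.json()


async def fetch_all(ids):
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_one(session, id) for id in ids]
        # gather() should give back the awaited JSON results, in order.
        return await asyncio.gather(*tasks)


results = asyncio.run(fetch_all(ids))
```

In my actual run, though, what came back were the request objects rather than the JSON bodies, so I may have adapted it incorrectly.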
