import pandas as pd
import requests
import time

t1 = time.time()
frames = []

# One request per pincode: 50,000 calls against the public API.
for i in range(110000, 160000):
    try:
        response = requests.get("https://api.postalpincode.in/pincode/{}".format(i))
        data = response.json()
        postOffices = pd.DataFrame(data[0]['PostOffice'])
        if not postOffices.empty:
            frames.append(postOffices)
    except requests.exceptions.ConnectionError:
        continue  # skip this pincode if the connection drops

df = pd.concat(frames, ignore_index=True)
print("elapsed: {:.1f}s".format(time.time() - t1))
    Have you tried profiling your code? 50,000 API calls is probably why you think it's "slow". Perhaps the API has some rate-limiting mechanism. – dspencer Apr 03 '20 at 07:12
  • The server will not allow you to make this many requests quickly, because that makes a lot of work for the server and makes it harder for other people to use the website. You should check the API documentation first to see if there is a way to *get the information you want* with fewer requests. – Karl Knechtel Apr 03 '20 at 07:20
  • [This answer](https://stackoverflow.com/questions/487258/what-is-a-plain-english-explanation-of-big-o-notation) might help. We call it Big-O Notation, but basically what @KarlKnechtel is suggesting is whether there's a way to parallelize your requests by grouping those 50,000 calls in a single SQL request (or the `pandas` equivalent). Unfortunately I don't know enough about `pandas` to help – Nathan majicvr.com Apr 03 '20 at 07:45
  • I have heard about asyncio and aiohttp, which can make a large number of API requests in parallel, but I don't know how to do it. – pankit Gujjar Apr 03 '20 at 08:41
  • Can you please help me make a large number of parallel requests? – pankit Gujjar Apr 03 '20 at 08:41

1 Answer


Before hammering a free API service for data scraping purposes, some simple arithmetic would serve you well.

1M requests @ 1 ms each = 1,000 s (~17 minutes)

1M requests @ 50 ms each = 50,000 s (~14 hours)

etc...
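
Applied to the 50,000 pincodes in the question, a minimal sketch of the same back-of-envelope estimate (the ~200 ms round trip is an assumption, not a measurement):

# Rough sequential runtime for 50,000 requests at an assumed 200 ms round trip.
n_requests = 160000 - 110000             # 50,000 pincodes
round_trip_s = 0.2                       # guessed per-request latency
print(n_requests * round_trip_s / 3600)  # ~2.8 hours, before any rate limiting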

I imagine you'll come up against rate limiting, so maybe you'd be better off scraping the data from the directory listing on their website.
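
If you do keep hitting the API, the asyncio/aiohttp approach mentioned in the comments at least overlaps the network waits. A minimal sketch, assuming a polite concurrency limit of 10 (the limit, the error handling, and the pincode range are assumptions, and the API may still throttle or block you):

import asyncio
import aiohttp
import pandas as pd

CONCURRENCY = 10  # assumption: keep the request rate polite on a free API

async def fetch(session, sem, pincode):
    url = "https://api.postalpincode.in/pincode/{}".format(pincode)
    async with sem:  # limit the number of in-flight requests
        try:
            async with session.get(url) as response:
                data = await response.json()
        except aiohttp.ClientError:
            return None  # skip pincodes that fail
    offices = data[0].get('PostOffice') or []
    return pd.DataFrame(offices) if offices else None

async def main():
    sem = asyncio.Semaphore(CONCURRENCY)
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, sem, i) for i in range(110000, 160000)]
        frames = [f for f in await asyncio.gather(*tasks) if f is not None]
    return pd.concat(frames, ignore_index=True)

df = asyncio.run(main())

The semaphore bounds how many requests are in flight at once; raising it speeds things up but makes it more likely the service blocks you.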

ben