
I've constructed the following little program for getting phone numbers using Google's Places API, but it's pretty slow. When I'm testing with 6 items it takes anywhere from 1.99s to 4.86s, and I'm not sure why the time varies so much. I'm very new to APIs, so I'm not even sure what sorts of things can or cannot be sped up, which things are left to the web server servicing the API, and what I can change myself.

import requests,json,time
searchTerms = input("input places separated by comma")

start_time = time.time() #timer
searchTerms = searchTerms.split(',')
for i in searchTerms:
    r1 = requests.get('https://maps.googleapis.com/maps/api/place/textsearch/json?query='+ i +'&key=MY_KEY')
    a = r1.json()
    pid = a['results'][0]['place_id']
    r2 = requests.get('https://maps.googleapis.com/maps/api/place/details/json?placeid='+pid+'&key=MY_KEY')
    b = r2.json()
    phone = b['result']['formatted_phone_number']
    name = b['result']['name']
    website = b['result']['website']
    print(phone+' '+name+' '+website)

print("--- %s seconds ---" % (time.time() - start_time))
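One way to see where the time goes is to time the network call and the JSON parsing separately. A minimal sketch (`timed_fetch` is a hypothetical helper; the URL is whatever query you're making):

```python
import time
import requests

def timed_fetch(url):
    # Time the HTTP round trip and the JSON parsing separately,
    # to see which one dominates.
    t0 = time.perf_counter()
    r = requests.get(url)
    t1 = time.perf_counter()
    data = r.json()
    t2 = time.perf_counter()
    print('network: %.3fs, parsing: %.3fs' % (t1 - t0, t2 - t1))
    return data
```

In practice the network time dominates by orders of magnitude, which is why the answers below focus on overlapping the waits rather than speeding up the parsing.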
click here
  • I think you have to consider various time factors here. First, the amount of time taken by your programme to retrieve the info from the mentioned URL (this will be affected by the internet speed and the time taken by the web server to send the response), plus the time taken by Python to analyse that information. I would suggest computing these two times separately and seeing which one takes longer and how much variation there is. – ρss Dec 29 '15 at 14:10
  • keep in mind that at some point you will hit Google maps' API rate limits ;) – Tommaso Barbugli Dec 29 '15 at 14:31

5 Answers


You may want to send the requests in parallel. Python provides the multiprocessing module, which is suitable for tasks like this.

Sample code:

from multiprocessing import Pool
import requests

def get_data(i):
    r1 = requests.get('https://maps.googleapis.com/maps/api/place/textsearch/json?query='+ i +'&key=MY_KEY')
    a = r1.json()
    pid = a['results'][0]['place_id']
    r2 = requests.get('https://maps.googleapis.com/maps/api/place/details/json?placeid='+pid+'&key=MY_KEY')
    b = r2.json()
    phone = b['result']['formatted_phone_number']
    name = b['result']['name']
    website = b['result']['website']
    return ' '.join((phone, name, website))

if __name__ == '__main__':
    terms = input("input places separated by comma").split(",")
    with Pool(5) as p:
        print(p.map(get_data, terms))
Łukasz Rogalski
    I meant to ask, what does everything contained within the `if` do? Like `Pool(5)` and `p.map`. – click here Dec 29 '15 at 15:26
  • 7
    I'll provide some explanation although it probably won't help you seeing as it is 2.5 years too late: `with Pool..` creates the `Pool` object under the control of a context manager, meaning that the object will be destroyed and cleanup code called when the program exits the scope of the `with` statement. `Pool(5)` creates a thread pool with 5 threads that are all capable of running independently. This means that the second HTTP request you make does not have to wait for the first HTTP request to be returned - so instead of 5 200ms operations in series, you do 5 200ms waits all at once. – Dagrooms May 22 '18 at 17:39
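The effect described in that comment (overlapping waits instead of serial ones) can be demonstrated with a thread-backed pool from `multiprocessing.dummy`, which shares the `Pool` API; `slow_task` here is just a stand-in for a 200ms HTTP request:

```python
import time
from multiprocessing.dummy import Pool  # thread-backed, same API as Pool

def slow_task(i):
    time.sleep(0.2)   # stand-in for a ~200ms HTTP request
    return i * i

start = time.time()
with Pool(5) as p:
    results = p.map(slow_task, range(5))
elapsed = time.time() - start
# Five 0.2s waits overlap across 5 threads, so elapsed is
# roughly 0.2s rather than roughly 1.0s
```

Because the work is I/O-bound waiting rather than CPU-bound computation, threads are sufficient here; full processes (plain `multiprocessing.Pool`) also work but carry more startup overhead.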

Use sessions to enable persistent HTTP connections, so you don't have to establish a new connection for every request.

Docs: Requests Advanced Usage - Session Objects
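A minimal sketch of what that looks like for the question's code (`place_details` is a hypothetical helper; the endpoint and parameter names are taken from the question's URLs):

```python
import requests

# One Session reuses the underlying TCP/TLS connection across requests
# instead of opening a new one for each call.
session = requests.Session()
session.params = {'key': 'MY_KEY'}  # merged into every request's query string

def place_details(place_id):
    # Passing params as a dict also handles URL encoding, instead of
    # concatenating strings into the URL by hand.
    r = session.get(
        'https://maps.googleapis.com/maps/api/place/details/json',
        params={'placeid': place_id},
    )
    return r.json()
```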

Sander van den Oord
Joe Heffer
    This gave me about a 33% speed increase! Thanks! (136s -> 91s, for reference) – wjandrea Aug 11 '21 at 21:09
  • 3
    Dead link. Tried to submit an edit but the edit queue is full? Here's the new [link](https://requests.readthedocs.io/en/latest/user/advanced/#session-objects). – Kevin M Jun 02 '22 at 16:27

Most of the time isn't spent computing your request. The time is spent in communication with the server. That is a thing you cannot control.

However, you may be able to speed it along using parallelization. Create a separate thread for each request as a start.

from threading import Thread
import requests

def request_search_terms(term):
    # your logic for a single request goes here
    pass

#...

threads = []
for st in searchTerms:
    t = Thread(target=request_search_terms, args=(st,))
    t.start()
    threads.append(t)

for t in threads:
    t.join()

Then use a thread pool as the number of requests grows; this will avoid the overhead of repeated thread creation.
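The standard library already ships such a pool in `concurrent.futures`. A sketch, where the body of `request_search_terms` is a placeholder for the two-request lookup above:

```python
from concurrent.futures import ThreadPoolExecutor

def request_search_terms(term):
    # placeholder for the real text-search + details lookup
    return term.strip().upper()

searchTerms = ['foo', ' bar']

# The executor keeps 5 worker threads alive and reuses them,
# rather than creating one thread per search term.
with ThreadPoolExecutor(max_workers=5) as executor:
    # map() runs the calls concurrently but yields results in input order
    results = list(executor.map(request_search_terms, searchTerms))
```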

StoryTeller - Unslander Monica

It's a matter of latency between the client and the server; you can't change anything about that unless you use multiple server locations (so the server nearest the client handles the request).

In terms of performance, you can build a multithreaded system that handles multiple requests at once.


There is no need to do multithreading yourself. grequests provides a quick drop-in replacement for requests.

qwr