I'm following the docs and yet it appears the requests are still being made synchronously.

https://cloud.google.com/appengine/docs/standard/python/issue-requests

Here is my code:

rpcs = []
for url in urls:
    rpc = urlfetch.create_rpc()
    urlfetch.make_fetch_call(rpc, url)
    rpcs.append(rpc)
result = []
for rpc in rpcs:
    result.append(rpc.get_result().content)
return result

I profiled this against plain requests.get calls, and both approaches take exactly the same amount of time.
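
A timing comparison of this kind can be sketched outside App Engine as follows. This is only an illustration under stated assumptions: `fake_fetch` is a hypothetical stand-in that sleeps instead of calling `urlfetch`, the 0.2 s latency and the `example.com` URLs are made up, and plain threads are used to model overlapping calls. It shows the gap one would expect to see if the calls really overlapped.

```python
import threading
import time


def fake_fetch(url, latency=0.2):
    """Hypothetical stand-in for a network call: sleeps instead of fetching."""
    time.sleep(latency)
    return "body of " + url


def fetch_sequential(urls, latency=0.2):
    # Each call blocks until the previous one has returned.
    return [fake_fetch(url, latency) for url in urls]


def fetch_overlapped(urls, latency=0.2):
    # Start every call first, then wait for all of them to finish.
    results = [None] * len(urls)

    def worker(index, url):
        results[index] = fake_fetch(url, latency)

    threads = [threading.Thread(target=worker, args=(i, u))
               for i, u in enumerate(urls)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def timed(fn, *args):
    # Return the function's result together with the elapsed wall-clock time.
    start = time.time()
    result = fn(*args)
    return result, time.time() - start


if __name__ == "__main__":
    urls = ["http://example.com/%d" % i for i in range(6)]
    _, seq = timed(fetch_sequential, urls)
    _, par = timed(fetch_overlapped, urls)
    print("sequential: %.2fs, overlapped: %.2fs" % (seq, par))
```

If the overlapped time is close to the sequential time, the calls are effectively being serialized somewhere.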

The URLs I'm fetching are on different sites, so I'm sure I'm not hitting a concurrency limit on the server side.

Running on GAE Standard, Python 2.7

Marco Yammine
  • I followed the [docs](https://cloud.google.com/appengine/docs/standard/python/issue-requests#issuing_an_asynchronous_request) myself and can confirm that the calls are made asynchronously. Why do you think the requests are still synchronous? I believe you are missing the point here. The [difference](https://stackoverflow.com/questions/16715380/what-is-difference-between-asynchronous-http-request-and-synchronous-http-reques) between async and sync is that a sync request blocks the client until the operation is complete, whereas an async request doesn't. – komarkovich Jul 01 '18 at 08:45
  • @komarkovich Thanks for taking the time to look into it. I fully understand async vs. sync. What I might not be getting right is whether the urlfetch service allows concurrent requests. I can tell from the logs that the urlfetch requests are not concurrent: it iterates through each URL and waits for the response before moving on to the next. I can also tell from the timing: one of these fetch calls takes 200 ms on average, so 6 such calls should take 200-300 ms at most if they run concurrently. Instead, the operation takes 1200 ms. :( – Marco Yammine Jul 02 '18 at 12:22
  • `urlfetch` requests are concurrent. Please post the logs and code that contradict this so I can look into it. – komarkovich Jul 05 '18 at 09:13
  • @komarkovich added working code in the answer. Thanks again for looking into it. – Marco Yammine Jul 05 '18 at 13:25

1 Answer

I got it working, but for some reason only with callbacks. It also only works in production, not in the local environment. :D Here is the working code:

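import functools
import json

from google.appengine.api import urlfetch


class ClassName(object):

    def __init__(self):
        # Per-instance list; a class-level list would be shared by all instances.
        self.responses = []

    def fetch_concurrent_callback(self, rpc):
        # Runs when the RPC completes: parse the JSON body and store it.
        response = rpc.get_result()
        json_response = json.loads(response.content)
        self.responses.append(json_response)

    def fetch_concurrent(self, urls):
        rpcs = []
        for url in urls:
            rpc = urlfetch.create_rpc()
            # Bind this rpc to the callback so each callback reads its own result.
            rpc.callback = functools.partial(self.fetch_concurrent_callback, rpc)
            urlfetch.make_fetch_call(rpc, url)
            rpcs.append(rpc)
        # Block until every RPC has finished and its callback has run.
        for rpc in rpcs:
            rpc.wait()
        return self.responses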
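
The answer does not say why `functools.partial` is used to attach the callback. One plausible reason is Python's late binding of loop variables in closures. The sketch below shows that effect in plain Python, with no App Engine calls; `handle` and `tags` are hypothetical names used only for illustration.

```python
import functools


def handle(tag):
    return "handled " + tag


tags = ["a", "b", "c"]

late_bound = []
bound_now = []
for tag in tags:
    # The lambda looks up `tag` only when it is called, after the loop has ended.
    late_bound.append(lambda: handle(tag))
    # functools.partial captures the current value of `tag` immediately.
    bound_now.append(functools.partial(handle, tag))

print([cb() for cb in late_bound])  # ['handled c', 'handled c', 'handled c']
print([cb() for cb in bound_now])   # ['handled a', 'handled b', 'handled c']
```

With a bare lambda in the loop, every callback would end up reading the last `rpc` created; binding the value at creation time avoids that.
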
Marco Yammine
  • Could you please edit the comment, for the community's future reference, with more details on what you did to get it working? Thank you. – komarkovich Jul 04 '18 at 09:56