
Google App Engine limits urlfetch.fetch() responses to 1MB. Is there any workaround for this (switching to the paid version, maybe)?

I'm using Python, so if it's possible to provide an example, that would be great.

David Underhill
Lipis

2 Answers


With the brand-new SDK 1.4.0 you can download up to 32 MB; keep in mind that you still have the 10-second deadline limit, though ;-). The `deadline` can be up to a maximum of 60 seconds for request handlers and 10 minutes for task queue and cron job handlers.

URLFetch allowed response size has been increased, up to 32 MB. Request size is still limited to 1 MB.

systempuntoout

No, you cannot fetch more than 1MB per URL fetch (even if you enable billing). However, you might be able to fetch portions of the target URL using the Range header and then combine these pieces. This might even be faster since you could fetch each 1MB chunk simultaneously (using asynchronous fetches).
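A minimal sketch of the Range-header approach described above, assuming the total size is already known (for example from a `Content-Length` header). The names `byte_ranges`, `range_header`, `fetch_in_chunks`, and the `fetch_range` callable are hypothetical helpers made up for illustration, not part of the App Engine API; on App Engine, `fetch_range` could wrap something like `urlfetch.fetch(url, headers={'Range': value}).content`.

```python
def byte_ranges(total_size, chunk_size=1024 * 1024):
    """Yield inclusive (start, end) byte offsets that cover total_size bytes."""
    for start in range(0, total_size, chunk_size):
        yield start, min(start + chunk_size, total_size) - 1


def range_header(start, end):
    """Format an HTTP Range header value for an inclusive byte range."""
    return 'bytes=%d-%d' % (start, end)


def fetch_in_chunks(fetch_range, total_size, chunk_size=1024 * 1024):
    """Fetch every range via fetch_range(header_value) and join the pieces.

    fetch_range is a hypothetical callable supplied by the caller. It takes
    the Range header value and returns the bytes of that range.
    """
    parts = [fetch_range(range_header(start, end))
             for start, end in byte_ranges(total_size, chunk_size)]
    return b''.join(parts)
```

This version fetches the chunks one after another to keep the sketch short. The simultaneous variant would issue the same ranges through asynchronous fetches (e.g. `urlfetch.create_rpc()` with `urlfetch.make_fetch_call()`) and join the results in order.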

David Underhill
  • +1 Great answer. Do you have any info on support for `Content-Range` on various public services like Flickr, DropBox, etc..? – Peter Knego Nov 05 '10 at 19:38
  • Sorry, I meant `Range` (not `Content-Range`). And yes, both Flickr and DropBox support it. I'm sure many others do too. – David Underhill Nov 05 '10 at 20:06
  • How can I fetch only the headers to at least get its size? I tried setting `allow_truncated=True`, but it didn't work on a really big file from Dropbox. With a smaller one I got a chunk of 1MB.. – Lipis Nov 05 '10 at 21:12
  • OK, I got only the headers via `method='HEAD'`, but there is nothing there to indicate the size :( Is there another way of setting all the ranges? Because I need to know the total size up front.. right? – Lipis Nov 05 '10 at 21:17
  • The server should set the `Content-Length` header whenever possible. If you don't know the size up front, then I suppose you could still try fetching 1MB chunks until you receive a response that contains less data than the maximum amount (presumably because you hit the end of the content). – David Underhill Nov 05 '10 at 21:55
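
The last comment's idea (keep requesting fixed-size chunks until a short one comes back) could be sketched roughly as follows. `fetch_until_short` and the `fetch_range` callable are hypothetical names for illustration only. The sketch assumes `fetch_range` returns an empty byte string when the requested range starts past the end of the content, so a real wrapper would need to map an HTTP 416 response to `b''`. `total_size_from_content_range` is a further assumption: it relies on the server answering a range request with a standard `Content-Range: bytes start-end/total` header, which not every server sends.

```python
def fetch_until_short(fetch_range, chunk_size=1024 * 1024):
    """Request consecutive chunks until one is shorter than chunk_size."""
    parts = []
    start = 0
    while True:
        # Ask for the next inclusive byte range of at most chunk_size bytes.
        chunk = fetch_range('bytes=%d-%d' % (start, start + chunk_size - 1))
        parts.append(chunk)
        if len(chunk) < chunk_size:
            break  # short (or empty) chunk: assume the end was reached
        start += chunk_size
    return b''.join(parts)


def total_size_from_content_range(value):
    """Return the total from e.g. 'bytes 0-1023/5000', or None if unknown."""
    total = value.rpartition('/')[2].strip()
    return int(total) if total.isdigit() else None
```

If the first ranged response does carry a usable `Content-Range`, the total can be read from it and the remaining ranges computed up front instead of probing for the end.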