
I'm working on a GAE app in Python. The app is a crowd-sourced data collection system, and its data is submitted by users all over the country. I'm currently on the default (free) quotas, but I need to ensure at least 99% uptime for the app.

The challenge is that Google stops routing requests to your app once you exhaust your allocated quotas. During a recent testing spree, one person built an automated posting script that quickly exhausted the CPU quota; after that, the app served only an HTTP 403 Forbidden status code for each request instead of calling a request handler. I have since patched the system to disallow automated postings, but how can I guarantee that human users don't cause a similar "blackout" in production?

I know of the Quota API, but I think it can only give me profiling information for my app. I want a way of slowing down the rate of requests (e.g. per minute, for the per-minute quotas) without serving error pages or causing blackouts.

Any suggestions?

systempuntoout
JWL
  • What would you show to the users instead of an error message if you had gone over quota for that minute? Or is your plan to just make each request take longer and longer the closer you get to the quota? – Peter Recore Feb 08 '11 at 21:47
  • Sign up for billing. Set a daily billing limit that's sufficient for the amount of traffic that you expect to get. – Nick Johnson Feb 09 '11 at 03:11
  • @Nick The app is non-commercial (it's a community aid thing), so a billing deal is out for the moment. – JWL Feb 09 '11 at 06:42
  • @Peter : Actually, after implementing the use of the [TaskQueue](http://code.google.com/intl/it/appengine/docs/python/taskqueue/overview.html), the problem is now solved; someone posts, is immediately redirected to another page, but the actual posting (persisting to datastore) and processing takes place later in the task-queue – JWL Feb 09 '11 at 06:47
  • @mcn I'm happy it helped. Oh, and @Nick's recommendation is good: you can enable billing with a budget of 0 dollars per day. That way you get all the benefits of billing (quota limits are ridiculously relaxed) without paying a penny. – systempuntoout Feb 09 '11 at 13:22
  • @mcnemesis @systempuntoout Not to mention, you get a lot of quota for relatively little money. I have a couple of apps I pay modest amounts for in order to provide a community service. – Nick Johnson Feb 09 '11 at 13:45
  • @mcnemesis That's great. I misunderstood how the quota problem was affecting you. I see now that you have plenty of quota for just serving pages, but were running out of quota for the back-end work. – Peter Recore Feb 09 '11 at 15:28

1 Answer


One common solution to this problem is to delegate the work to a rate-limited task queue.

For example:

queue:
- name: mail-throttle
  rate: 2000/d
  bucket_size: 10
- name: background-processing-throttle
  rate: 5/s

In this way you can control the resource usage of every part of your application, forcing each one to stay within the available quotas.
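To make the effect of `rate` and `bucket_size` more concrete, here is a small pure-Python token bucket simulation. It is only a sketch of the general technique, not App Engine's actual scheduler; the names `TokenBucket` and `simulate`, and the 0.25-second tick, are assumptions made for illustration.

```python
class TokenBucket(object):
    """Simplified token bucket: `rate` refills it, `bucket_size` caps bursts."""

    def __init__(self, rate_per_sec, bucket_size):
        self.rate = float(rate_per_sec)
        self.capacity = float(bucket_size)
        self.tokens = float(bucket_size)  # the bucket starts full
        self.last = 0.0

    def try_run(self, now):
        """Return True if one task may start at time `now` (in seconds)."""
        elapsed = now - self.last
        # Refill according to elapsed time, never beyond the bucket size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def simulate(rate_per_sec, bucket_size, pending, seconds, tick=0.25):
    """Count how many of `pending` tasks get started within `seconds`."""
    bucket = TokenBucket(rate_per_sec, bucket_size)
    started = 0
    steps = int(round(seconds / tick))
    for i in range(steps + 1):
        now = i * tick
        # Start as many waiting tasks as the bucket allows at this instant.
        while started < pending and bucket.try_run(now):
            started += 1
    return started


if __name__ == "__main__":
    # 100 tasks waiting, 5 tasks/second, bursts of up to 5 (assumed values).
    print(simulate(5, 5, pending=100, seconds=2.0))
```

With a rate of 5/s and a bucket of 5, this simulation starts 15 of the 100 waiting tasks in the first two seconds: an initial burst of 5, then 10 more as the bucket refills. The remaining tasks simply wait in the queue rather than failing, which is what keeps the work inside the quota.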

A couple of caveats:
1. Queues deliver tasks in best-effort FIFO order, so strict ordering is not guaranteed.
2. Enqueuing and executing a task both count toward several quotas.

systempuntoout
  • Thanks, I didn't know about taskqueue. I'm only wondering whether the resources used by this service still count against my "bill" (even though I'm on the free quota arrangement). – JWL Feb 08 '11 at 10:33
  • @mcnemesis: Tasks in Task Queue still count against your quotas. They just (among other things) let you limit the rate at which you perform resource intensive tasks. – shang Feb 08 '11 at 10:53
  • @shang: Indeed, it's limiting the rate that sparked my initial concerns. As for quota usage, I'm still at the mercy of my users :-/ – JWL Feb 08 '11 at 11:46
  • @mcnemesis You could code another throttling layer that filters per IP and allows only a certain rate of requests, for example; you would just need the *HTTP headers*, *memcache* and the *time* module. – systempuntoout Feb 08 '11 at 12:01
  • @systempuntoout: This might have been good, but in my country (Uganda) most users work from behind a common IP, say that of their ISP or gateway. Millions of users could sit behind a single IP, so what do I do then, since I don't want to filter "innocent" requests? – JWL Feb 08 '11 at 12:13
  • @mcn there's nothing you can do, [NAT breaks the internet](http://www.unicom.com/blog/entry/155) – systempuntoout Feb 08 '11 at 13:22
  • @sys True, I hope the emptying of the IPv4 bucket will make it possible for IPv6 to save the day; things might change. – JWL Feb 08 '11 at 13:31
  • @mcnemesis Unfortunately, IPv6 will introduce the reverse problem: everyone will have a /64, and more will be easy to obtain. – Nick Johnson Feb 09 '11 at 13:45
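
The per-IP throttling layer suggested in the comments could look roughly like the fixed-window counter below. This is a minimal sketch under stated assumptions: a plain dict stands in for memcache so the example runs anywhere, and the function name `allow_request`, the key format, and the default limits are all hypothetical. As the comments point out, users behind a shared NAT address would share one counter, so the limit would have to be generous.

```python
import time


def allow_request(cache, ip, limit=30, window=60, now=None):
    """Fixed-window limiter: at most `limit` requests per `window` seconds per IP.

    `cache` is any dict-like store. On App Engine this role would be played
    by memcache (an assumption here; a plain dict is used so the sketch runs
    without the SDK and without key expiry).
    """
    if now is None:
        now = time.time()
    # All requests in the same `window`-second slice share one counter key.
    window_id = int(now // window)
    key = "rate:%s:%d" % (ip, window_id)
    count = cache.get(key, 0) + 1
    cache[key] = count
    return count <= limit


if __name__ == "__main__":
    cache = {}
    # Five rapid requests from one address with a limit of three per minute.
    for _ in range(5):
        print(allow_request(cache, "10.0.0.1", limit=3, window=60, now=1000.0))
```

A request that is refused here could be answered with a lightweight "please retry shortly" page, which costs far less quota than running the full handler.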