7

I'm having some trouble with the Google App Engine datastore. Ever since the new pricing model was introduced, the cost of running my app has increased massively.

The culprit appears to be "Datastore small operations", which come in at more than 20 million ops per day!

Has anyone else had this problem? I don't think I'm doing an excessive number of key lookups, and I only have 5000 users, with roughly 10 - 20 requests per minute.

Thanks in advance!

Edit

OK, I've got some stats; these are after about 3 hours. Here is what I am seeing in the billing section of my dashboard: [image: App Engine dashboard, billing]

And here are some of the stats:

[image: Stats]

Obviously there are quite a lot of calls to datastore.get, and I am starting to think that my design is causing the problem. Those gets correspond to accounts. Every user has an account, but an account can be one of two types, and I use composition to model this: each account entity has a link to its sub account entity. As a result, a search for nearby users involves fetching the accounts with a query and then doing a get on each account to fetch its sub account. The top request in the stats picture is a call that gets 100 accounts, and then has to do a get on each one. I would have thought that this was a very light query, but I guess not. And I am still confused by the number of datastore small ops being recorded in my dashboard.

Lucas Zamboulis
Theblacknight
  • Out of curiosity, what was your typical monthly bill before and after? – Dave Nov 11 '11 at 20:27
  • My daily quota was $2, and I never hit that. Now it is $5 and I am exceeding it every day. I think I would have to increase it to $9 a day. – Theblacknight Nov 11 '11 at 20:30
  • Sorry, I also should have asked this, but are you using memcache at all? – Dave Nov 11 '11 at 20:33
  • No, I haven't really looked into memcache. I would have thought the datastore could handle the current amount of data for a much more reasonable price. Having said that, it's not a site I'm running, it's the backend for an app, a game, so it is quite heavy on processing. – Theblacknight Nov 11 '11 at 21:11
  • "The top request in the stats picture is a call that gets 100 accounts, and then has to do a get on each one." You should be fetching all 100 keys in one batch rather than doing individual gets. See [here](http://blog.notdot.net/2010/01/ReferenceProperty-prefetching-in-App-Engine) for an explanation of the pattern. Also, you should definitely be keeping frequently accessed entities in memcache to reduce datastore lookups. – Drew Sears Nov 12 '11 at 18:18

4 Answers

11

Definitely use appstats as Drew suggests; regardless of what library you're using, it will tell you what operations your handlers are doing. The most likely culprits are keys-only queries and count operations.
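
If a count turns out to be the expensive call, one option is to compute it once and reuse the result for a while. The sketch below is only an illustration of that idea in plain Python: `CachedCount` and `count_users` are hypothetical names, the cache lives in process memory, and `count_users` stands in for whatever count query your handler actually runs. On App Engine you would more likely keep the value in memcache so that all instances share it.

```python
import time


class CachedCount:
    """Caches the result of an expensive count for ttl_seconds."""

    def __init__(self, count_fn, ttl_seconds=300, clock=time.monotonic):
        self._count_fn = count_fn
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        # Recompute only when nothing is cached yet or the entry has expired.
        if self._value is None or now >= self._expires_at:
            self._value = self._count_fn()
            self._expires_at = now + self._ttl
        return self._value


datastore_calls = []


def count_users():
    # Hypothetical stand-in for a datastore count query.
    datastore_calls.append(1)
    return 5000


user_count = CachedCount(count_users, ttl_seconds=300)
for _ in range(10):
    user_count.get()  # only the first call reaches count_users
```

The trade-off is staleness: the count can be up to `ttl_seconds` out of date, which is usually acceptable for something like a total user count shown during a sync.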

Nick Johnson
  • Spot on here. I have to do a regular sync operation, and I had been getting a total user count on each sync. I am caching that now, and I can see the difference. Cheers! – Theblacknight Nov 12 '11 at 23:10
9

My advice would be to use AppStats (Python / Java) to profile your traffic and figure out which handler is generating the most datastore ops. If you post the code here we can potentially suggest optimizations.

Drew Sears
  • I know where most of my traffic is going, and I am using 'Siena', a Java library that works with GAE. I will go through my code and try to pick out snippets that might be useful. – Theblacknight Nov 11 '11 at 20:22
  • AppStats is set up, will update my original post when I have more info. Thanks. – Theblacknight Nov 12 '11 at 14:38
1

Don't scan your datastore; use get(key), get_by_id(id) or get_by_key_name(keyname) as much as you can.

GAE-Web
1

Do you have lots of ReferenceProperty properties in your models? Accessing them will trigger a db.get for each property unless you prefetch them. The code below would trigger 101 datastore requests: one query plus 100 db.get calls.

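from google.appengine.ext import db

class Foo(db.Model):
    user = db.ReferenceProperty(User)  # User is another db.Model defined elsewhere

foos = Foo.all().fetch(100)  # one query returning up to 100 entities
for f in foos:
    # Dereferencing f.user does a separate db.get on the referenced key
    print(f.user.name)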
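
To make the difference concrete, here is a rough, self-contained sketch of the prefetch idea: collect all the referenced keys first and fetch them in one batch. It does not use the App Engine SDK. `FakeDatastore`, `get_multi` and the account dictionaries are hypothetical stand-ins so the example runs anywhere; in the Python db API the batch call would be db.get with a list of keys. It only counts round-trips and does not by itself say anything about how the operations are billed.

```python
class FakeDatastore:
    """Hypothetical in-memory stand-in that counts round-trips."""

    def __init__(self, entities):
        self._entities = dict(entities)
        self.round_trips = 0

    def get(self, key):
        # One round-trip per single-key lookup.
        self.round_trips += 1
        return self._entities.get(key)

    def get_multi(self, keys):
        # One round-trip regardless of how many keys are requested.
        self.round_trips += 1
        return {k: self._entities[k] for k in keys if k in self._entities}


def load_sub_accounts_one_by_one(store, accounts):
    # The pattern from the question: one get per account.
    return [store.get(a["sub_account_key"]) for a in accounts]


def load_sub_accounts_batched(store, accounts):
    # Gather the distinct keys, fetch them together, then map back in order.
    keys = {a["sub_account_key"] for a in accounts}
    found = store.get_multi(keys)
    return [found.get(a["sub_account_key"]) for a in accounts]


store = FakeDatastore(
    {"sub%d" % i: {"type": "premium" if i % 2 else "basic"} for i in range(100)}
)
accounts = [{"name": "user%d" % i, "sub_account_key": "sub%d" % i} for i in range(100)]

slow = load_sub_accounts_one_by_one(store, accounts)
slow_trips = store.round_trips  # 100 single gets

store.round_trips = 0
fast = load_sub_accounts_batched(store, accounts)
fast_trips = store.round_trips  # 1 batch get
```

Both functions return the same sub accounts in the same order; only the number of calls to the store differs.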
Teemu Ikonen
  • With the Java API I am using, I need to manually get each property that is referenced in another entity. I'm trying to fetch in batches now, to see if that gives me the boost I need. – Theblacknight Nov 12 '11 at 21:19
  • Check this blog entry I wrote, the prefetching part: http://bravenewmethod.wordpress.com/2011/03/23/developing-on-google-app-engine-for-production/ – Teemu Ikonen Nov 13 '11 at 09:05