Most convenient way to update multiple instances on Google App Engine Datastore with ndb

Question

I need to update a number of instances of class Foo on Google App Engine Datastore using ndb.

Here's what I have so far:

more, more_cursor = True, None  # start from the first page
while more:
    foo_instances, more_cursor, more = Foo.query().fetch_page(
        20, start_cursor=more_cursor)
    for foo in foo_instances:
        bar = foo.bar.get()  # foo.bar is a Key to a Bar instance.
        bar.updated = True

    ndb.put_multi(foo_instances)

and a tasklet-friendly version:

foo_iterator = Foo.query().iter()
while (yield foo_iterator.has_next_async()):
    foo = foo_iterator.next()
    bar = foo.bar.get()  # foo.bar is a Key to a Bar instance.
    bar.updated = True

    yield bar.put_async()

I'm planning to run this code in a Push Queue task, which I believe has a 10-minute deadline before it times out.

Which of these (if either) is the right way to run the task while avoiding timeout or memory issues? There are a few thousand instances of type Foo.

side note: in the 1st solution you probably want to track and `put_multi` the `bar` instances, not the `foo_instances`, right? — Dan Cornilescu, Feb 17 '17 at 23:04

score 0 · Answer 1 · edited May 23 '17 at 10:29

If you plan to use the push queue, why not split the work into smaller pieces (one page of your cursor-based query per piece, for example) and have each piece handled by a separate task? Each task then does only a small, bounded amount of work, so you shouldn't run into timeout or memory limits, and you're free to pick whichever of your two solutions you prefer.

Something along the lines of the solution discussed in Google appengine: Task queue performance (but replace the deferred library with the push queue).
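
A minimal sketch of that chaining pattern, with the Datastore and queue calls passed in as plain functions so it runs without App Engine. The names process_batch, fake_fetch_page, fake_update and run_all are made up for illustration. In a real handler, fetch_page would presumably wrap Foo.query().fetch_page(batch_size, start_cursor=cursor), update_batch would load the Bar entities with ndb.get_multi, set updated and write them back with ndb.put_multi, and enqueue_next would call taskqueue.add with the next cursor's urlsafe() string as a parameter.

```python
def process_batch(fetch_page, update_batch, enqueue_next, cursor=None, batch_size=20):
    """Handle one page of results, then hand the remainder to a follow-up task."""
    items, next_cursor, more = fetch_page(batch_size, cursor)
    update_batch(items)
    if more:
        # Assumption: with ndb this would be roughly
        # taskqueue.add(url=<your handler>, params={'cursor': next_cursor.urlsafe()})
        enqueue_next(next_cursor)
    return len(items)


# Hypothetical in-memory stand-ins so the sketch is runnable on its own.
DATA = [{"id": i, "updated": False} for i in range(45)]
PENDING = []  # plays the role of the push queue


def fake_fetch_page(batch_size, cursor):
    """Mimic fetch_page: return (results, next_cursor, more)."""
    start = cursor or 0
    end = start + batch_size
    return DATA[start:end], end, end < len(DATA)


def fake_update(items):
    """Stand-in for get_multi / set flag / put_multi on the Bar entities."""
    for item in items:
        item["updated"] = True


def run_all():
    """Drain the fake queue; each loop iteration stands for one task execution."""
    PENDING.append(None)  # the first task starts without a cursor
    tasks_run = 0
    while PENDING:
        cursor = PENDING.pop(0)
        process_batch(fake_fetch_page, fake_update, PENDING.append, cursor)
        tasks_run += 1
    return tasks_run
```

Each task only ever holds one page in memory and finishes quickly, so the task deadline applies per page rather than to the whole job.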
