How do we implement aggregation or composition with NDB on Google App Engine? What is the best approach for different use cases? Thanks!

I've tried using a repeated property. In this very simple example, a Project has a list of Tag keys (I chose this over StructuredProperty because many Project objects can share the same Tag objects).

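from google.appengine.ext import ndb

class Project(ndb.Model):
    name = ndb.StringProperty()
    # Tag is assumed to be an ndb.Model defined earlier in the module.
    tags = ndb.KeyProperty(kind=Tag, repeated=True)
    budget = ndb.FloatProperty()
    date_begin = ndb.DateProperty(auto_now_add=True)
    date_end = ndb.DateProperty(auto_now_add=True)

    @classmethod
    def all(cls):
        # Query over every Project entity.
        return cls.query()

    def addTags(self, from_str):
        # Instance method: assigning to cls.tags in a classmethod would
        # replace the property on the class instead of setting one entity's tags.
        # Tag.addTag is assumed to return the ndb.Key of the tag.
        tagname_list = from_str.split(',')
        tag_list = []
        for tag in tagname_list:
            tag_list.append(Tag.addTag(tag))
        self.tags = tag_list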

--

Edited (2): Thanks. In the end, I chose to create a new Model class, 'Relation', representing a relation between two entities. It is really more of an association; I admit my first design was not well suited to the problem.
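
A minimal sketch of what such a 'Relation' model could look like, assuming the App Engine Python runtime (it will not run outside it). The property names `left` and `right`, the helper functions, and the simplified Tag and Project models are assumptions for illustration, not the code actually used:

```python
from google.appengine.ext import ndb


class Tag(ndb.Model):
    name = ndb.StringProperty(required=True)


class Project(ndb.Model):
    name = ndb.StringProperty()


class Relation(ndb.Model):
    # Hypothetical property names: one entity per (tag, project) pair.
    left = ndb.KeyProperty(kind=Tag, required=True)
    right = ndb.KeyProperty(kind=Project, required=True)


def projects_for_tag(tag_key):
    # Filter on the indexed 'left' key, then batch-load the projects.
    relations = Relation.query(Relation.left == tag_key).fetch()
    return ndb.get_multi([r.right for r in relations])


def tags_for_project(project_key):
    # Same lookup in the other direction.
    relations = Relation.query(Relation.right == project_key).fetch()
    return ndb.get_multi([r.left for r in relations])
```

Both lookups use an indexed equality filter on a key, so neither direction needs to load entities that are not part of the relation.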

Hicham
  • I'd reference this question: http://stackoverflow.com/questions/13930573/ndb-modelling-one-to-many-merits-of-repeated-keyproperty-vs-foreign-key – deweyredman May 12 '14 at 22:45

2 Answers

An alternative would be to use BigQuery. At first we used NDB, with a RawModel that stores individual, non-aggregated records, and an AggregateModel that stores the aggregate values.

The AggregateModel was updated every time a RawModel was created, which caused some inconsistency issues. In hindsight, properly using parent/ancestor keys as Tim suggested would've worked, but in the end we found BigQuery much more pleasant and intuitive to work with.

We just have cron jobs that run every day: one pushes RawModel records to BigQuery, and another creates the AggregateModel records from data fetched from BigQuery.

(Of course, this is only effective if you have lots of data to aggregate)

john2x
  • Thanks. But I don't think I need BigQuery for the moment, as I don't have much data to aggregate. On the other hand, because the number of items to aggregate is unbounded, I have chosen to implement a Model class describing the relation between two entities in order to make searches faster. I think that querying Relation objects by their left member (Tag in this case) is faster than looking at all Projects and then checking every aggregated tag in each one. – Hicham May 14 '14 at 23:25

It really does depend on the use case. For small numbers of items StructuredProperty and repeated properties may well be the best fit.
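
As a rough sketch of the small-collection case, assuming the App Engine Python runtime and a simplified Tag model (both assumptions, as are the helper names), a repeated StructuredProperty embeds the tags inside each Project entity:

```python
from google.appengine.ext import ndb


class Tag(ndb.Model):
    name = ndb.StringProperty()


class Project(ndb.Model):
    name = ndb.StringProperty()
    # Tags are stored inside the Project entity, so they are copied
    # per project rather than shared between projects.
    tags = ndb.StructuredProperty(Tag, repeated=True)


def create_project(name, tag_names):
    # Hypothetical helper: build the embedded Tag values and store the entity.
    project = Project(name=name, tags=[Tag(name=n) for n in tag_names])
    project.put()
    return project


def projects_tagged(tag_name):
    # Matches projects where any embedded tag has this name.
    return Project.query(Project.tags.name == tag_name).fetch()
```

Because the tags live inside the Project, renaming a tag means rewriting every project that embeds it, which is one reason this fits small, rarely shared collections.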

For large numbers of entities, look at setting the parent/ancestor in the Key for composition, and at a KeyProperty pointing to the primary entity for a many-to-one aggregation.

However, the choice also depends heavily on the actual usage pattern, and that is where efficiency considerations kick in.

The best I can suggest is to consider carefully how you plan to use these relationships: how active are they (i.e. are members constantly being changed, added, or deleted), and do you need to see all members of the relation most of the time, or just subsets? These considerations may well require adjustments to the approach.
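
The two options can be sketched as follows, assuming the App Engine Python runtime. The Milestone and Comment kinds and the helper functions are hypothetical names chosen for illustration:

```python
from google.appengine.ext import ndb


class Project(ndb.Model):
    name = ndb.StringProperty()


class Milestone(ndb.Model):
    # Composition: each Milestone is created with a Project key as parent,
    # so it belongs to that project's entity group.
    title = ndb.StringProperty()


class Comment(ndb.Model):
    # Many-to-one aggregation: a plain reference to the primary entity.
    project = ndb.KeyProperty(kind=Project)
    text = ndb.StringProperty()


def add_milestone(project_key, title):
    # The parent is part of the new entity's key and cannot change later.
    return Milestone(parent=project_key, title=title).put()


def milestones_of(project_key):
    # Ancestor query restricted to one project's entity group.
    return Milestone.query(ancestor=project_key).fetch()


def comments_of(project_key):
    # Ordinary query on the indexed KeyProperty.
    return Comment.query(Comment.project == project_key).fetch()
```

An ancestor query is strongly consistent, whereas the KeyProperty query is an ordinary indexed query, so the parent key tends to suit children that are always read and written together with their owner.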

Tim Hoffman