
I'm using ndb.Model. The Search API has the following field classes:

    TextField : plain text
    HtmlField : HTML formatted text
    AtomField : a string which is treated as a single token
    NumberField : a numeric value (either float or integer)
    DateField : a date with no time component
    GeoField : a location based on latitude and longitude

Suppose I have a 'tags' field which is a list field:

    tags = ndb.StringProperty(repeated=True)

How am I supposed to treat this field with search.Document?

Right now I'm joining the tags list into a single string:

    t = '|'.join(tags)

And then:

    search.TextField(name=cls.TAGS, value=t)

Any suggestions?

Nijin Narayanan
  • A couple of questions. 1. Why use text search for tags? That suits datastore queries. 2. Why concatenate with '|' rather than a space character? – Tim Hoffman May 07 '13 at 00:12
  • Hey Tim. 1. I want the user to be able to enter one word into the form search field and use it to search through different fields. Suppose he enters 'Carpenter', the results will include 'Carpenter' as a tag (a job, for instance) and 'Carpenter' as last name. 2. I'm concatenating using the pipe because there might be two word tags like 'Professional Reader.' – Richard Haber May 07 '13 at 00:27
  • Datastore is not suitable if you want to perform unions or intersections of tags. – moraes May 09 '13 at 05:32

2 Answers


You should add one field per tag, all with the same field name:

    doc = search.Document(fields=[
        search.TextField(name='tag', value=t) for t in tags
    ])

As the docs put it:

A field can contain only one value, which must match the field's type. Field names do not have to be unique. A document can have multiple fields with the same name and same type, which is a way to represent a field with multiple values. (However, date and number fields with the same name can't be repeated.) A document can also contain multiple fields with the same name and different field types.
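
As a rough sketch of the repeated-field approach, the helper below builds one field per distinct tag. The names `tag_fields`, `field_cls` and `Field` are hypothetical and not part of the Search API; `Field` is only a stand-in so the sketch runs without the App Engine SDK.

```python
from collections import namedtuple


def tag_fields(tags, field_cls, name='tag'):
    """Return one field per distinct, non-empty tag, all sharing the same name."""
    seen = set()
    fields = []
    for tag in tags:
        tag = tag.strip()
        # Skip blanks and duplicates so each tag is indexed once.
        if tag and tag not in seen:
            seen.add(tag)
            fields.append(field_cls(name=name, value=tag))
    return fields


# Stand-in for search.TextField / search.AtomField (assumption for this demo).
Field = namedtuple('Field', ['name', 'value'])

fields = tag_fields(['Carpenter', 'Professional Reader', 'Carpenter'], Field)
print([f.value for f in fields])
```

On App Engine the same helper would presumably be called as `search.Document(fields=tag_fields(tags, search.TextField))`. Passing `search.AtomField` instead may suit two-word tags such as 'Professional Reader', since an atom field treats the whole string as a single token.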

  • This is indeed the recommended approach. NOTE: the admin console doesn't currently show multiple fields with the same name... so it'll appear that only the "last field" was added... when in fact they are all there – Nicholas Franceschina May 03 '14 at 19:56

Use a unique, single-token identifier for each tag, so the tags can be joined with spaces. Then you can create a document like:

    doc = search.Document(fields=[
        search.TextField(name='tags', value='tag1 tag2 tag3'),
    ])
    search.Index(name='tags').put(doc)

You can even use numbers (ids) as strings:

    doc = search.Document(fields=[
        search.TextField(name='tags', value='123 456 789'),
    ])

And query using operators as you wish:

    index = search.Index(name='tags')
    results = index.search('tags:(("tag1" AND "tag2") OR ("tag3" AND "tag4"))')
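
If the tag combinations come from user input, the query string can be assembled rather than hand-written. This is only a sketch: `build_tag_query` is a hypothetical helper, not part of the Search API, and it simply rejects tags containing a double quote instead of trying to escape them.

```python
def build_tag_query(field, groups):
    """Build 'field:((a AND b) OR (c AND d))' from a list of tag groups."""
    if not groups or any(not group for group in groups):
        raise ValueError('at least one non-empty group of tags is required')
    clauses = []
    for group in groups:
        for tag in group:
            if '"' in tag:
                raise ValueError('tag must not contain a double quote: %r' % tag)
        # Tags inside a group must all match (AND).
        clauses.append('(' + ' AND '.join('"%s"' % tag for tag in group) + ')')
    # Any one group matching is enough (OR).
    return '%s:(%s)' % (field, ' OR '.join(clauses))


query = build_tag_query('tags', [['tag1', 'tag2'], ['tag3', 'tag4']])
print(query)  # tags:(("tag1" AND "tag2") OR ("tag3" AND "tag4"))
```

The resulting string can then be passed to `index.search(query)` as in the snippet above.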
moraes