4

I have the following kind of json in my database:

{ 
    "_id" : "519817e508a16b447c00020e", "keyword" : "Just an example query", 
    "results" : 
    {
        "1" : {"base_domain" : "example1.com", "href" : "http://www.example1.com/"},
        "2" : { "base_domain" : "example2.com", "href" : "http://www.example2.com/"},
        "3" : { "base_domain" : "example3.com", "href" : "http://www.example3.com/"},
        "4" : { "base_domain" : "example4.com", "href" : "http://www.example4.com/"},
        "5" : { "base_domain" : "example5.com", "href" : "http://www.example5.com/"},
        "6" : { "base_domain" : "example6.com", "href" : "http://www.example6.com/"},
        "7" : { "base_domain" : "example7.com", "href" : "http://www.example7.com/"},
        "8" : { "base_domain" : "example8.com", "href" : "http://www.example8.com/"},
        "9" : { "base_domain" : "example9.com", "href" : "http://www.example9.com/"},
        "10" : { "base_domain" : "example10.com", "href" : "http://www.example10.com/"}
    } 
}

My goal is to have results for the following query:

> db.ranking.find({ $text: { $search: "http://www.example9.com"}})

It works when I create an index on all text fields

> db.ranking.ensureIndex({ "$**": "text" }))

But not when I create an index only over "results" field:

> db.ranking.ensureIndex( {"results" : "text"} )

Why?

Antoine Brunel
  • 1,065
  • 2
  • 14
  • 30

1 Answers1

4

The problem is that "results" is not a field, it's a sub-document. The syntax for creating an index on text fields for MongoDB requires either the notation for all fields, "$*", which you are using correctly, or a list of all text fields:

Create a text Index

You can create a text index on the field or fields whose value is a string or an array of string elements. When creating a text index on multiple fields, you can specify the individual fields or you can use wildcard specifier ($**).

http://docs.mongodb.org/manual/tutorial/create-text-index-on-multiple-fields/

In your case that would look like:

db.ranking.ensureIndex(
                           {
                             "keyword": "text",
                             "results.1.href": "text",
                             "results.1.href": "text",
                             "results.2.href": "text",
                             "results.3.href": "text",
                             "results.4.href": "text",
                             "results.5.href": "text",
                             "results.6.href": "text",
                             "results.7.href": "text",
                             "results.8.href": "text",
                             "results.9.href": "text",
                             "results.10.href": "text"
                           }
                       )
John Petrone
  • 26,943
  • 6
  • 63
  • 68
  • 1
    Presumably this could be made a lot more efficient if results were an array. – John Powell Aug 15 '14 at 15:45
  • Very interesting! Indexing the first href also worked for the rest of them: `db.ranking.ensureIndex({"results.1.href": "text"})` Now, is there a way to retrieve directly the key and the value without looping through all the dictionary? @JohnBarça Yes, I wish it were an array! – Antoine Brunel Aug 15 '14 at 18:58
  • 1
    @JohnBarça You mentioned using an array. Do you mean changing the db structure for this: `{ "_id" : "519817e508a16b447c00020e", "keyword" : "Just an example query", "results" : [{"base_domain" : "example1.com", "href" : "http://www.example1.com/"},{"base_domain" : "example2.com", "href" : "http://www.example2.com/"},...}]` The index would be hashed or text too? – Antoine Brunel Aug 16 '14 at 00:18
  • @JohnBarça Any additional hints? – Antoine Brunel Aug 16 '14 at 00:25
  • I just asked another question since I ran into an [Index key pattern too large error with code 67](http://stackoverflow.com/questions/25336130/mongodb-index-key-pattern-too-large-when-indexing-text-fields) because I have 100 results and not 10! – Antoine Brunel Aug 16 '14 at 01:26