I am experimenting with creating a text index in MongoDB for several fields in a sub-document. These fields are not the same from document to document. According to the documentation, I would create a normal text index like so:

db.collection.ensureIndex({
    subject: "text",
    content: "text"
});

In my case, I want the index to cover every field of the metadata subdocument (db.fs.files.metadata) in the fs.files collection. I've tried this:

db.fs.files.ensureIndex({'metadata.$**': 'text'});

I don't believe this has worked, as searching with db.fs.files.runCommand('text'... returns zero results, and db.fs.files.stats() shows the index as a very small size (and I have ~35k documents in this collection).

How can I create a text index on field values of a subdocument where the keys are not known ahead of time?

Brad
  • are you trying to full text search things in GridFS? I'm not sure that's going to work. – tom Oct 08 '13 at 08:56
  • From the documentation that you linked: 'This text index catalogs all string data in the subject field and the content field, where the field value is either a string or an array of string elements.' So if the value is not a string or an array of strings, the index will not collect anything. – attish Oct 08 '13 at 11:34
  • @tom I'm trying to do a full text search on the metadata of items in GridFS. I don't see how the `fs.files` collection would be any different than any other collection. (I'm not trying to search the actual data of files in GridFS.) – Brad Oct 08 '13 at 15:15
  • @attish The very next sentence in the documentation says: *"When creating a text index on multiple fields, you can specify the individual fields or you can wildcard specifier (`$**`)."* So if I'm understanding that correctly, under normal circumstances I could create a text index for all fields... it just isn't clear how to make that work for a subdocument. My data in each element of `db.fs.files.metadata` is in string format. – Brad Oct 08 '13 at 15:18
  • @Brad ah, I see now. This might sound stupid, but have you tried creating a {'$**': 'text'} index on the parent document and seeing if it will index the subdocument for you? The docs say it only affects text so it should skip the file data (but will include the name & contentType I guess). – tom Oct 08 '13 at 15:38
  • @tom That appears to have worked! Now, is there any way to filter a wildcard so it only covers the sub-document fields? Either way, please post that as an answer so I can accept it. – Brad Oct 09 '13 at 03:28
  • Sorry, I can't see a way of excluding the parent document from the index. Glad it's working though! – tom Oct 09 '13 at 09:31

1 Answer

If you create a {'$**': 'text'} index on the parent document it will index the subdocument fields too. The docs say it only affects text so it will skip the file data but will include the name & contentType.
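
As a rough sketch of the approach described above, the two steps could be wrapped like this for the mongo shell. The helper names `indexAllText` and `searchText` are hypothetical, as is the search term in the usage comment; `ensureIndex` and the `text` command are the 2.4-era calls already used in the question (later MongoDB releases offer `createIndex` and the `$text` query operator instead).

```javascript
// Minimal sketch, assuming `files` is a collection handle such as db.fs.files.
// Both helper names are made up for illustration.

// Create a wildcard text index on the whole document. Per the thread above,
// this also picks up string fields inside the metadata subdocument, whereas
// 'metadata.$**' did not work for the asker.
function indexAllText(files) {
  return files.ensureIndex({ "$**": "text" });
}

// Run a text search through the 2.4-era "text" command.
function searchText(files, term) {
  return files.runCommand("text", { search: term });
}

// Usage in the shell (the search term is a placeholder):
//   indexAllText(db.fs.files);
//   searchText(db.fs.files, "report");
```

Because the wildcard covers the parent document as well, matches on top-level string fields such as the file name and contentType will be returned alongside matches in metadata.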

tom