0

I'm new to MongoDB but here goes:

Is there a way to do a form of partial search in MongoDB?

For example, if the document is something like { Name: Michael A. Johnson}, is there a way to format search so that it will return that document for a query of Name: Michael Johnson and vice versa?

Additionally, is there a method to search 'Name: Johnson' and return the document { Name: Michael A. Johnson}?

Thanks!

Community
  • 1
  • 1
techalicious
  • 443
  • 1
  • 6
  • 14
  • Yea [`$regex`](http://docs.mongodb.org/manual/reference/operator/query/regex/) or [`$text`](http://docs.mongodb.org/manual/reference/operator/query/text/) – zamnuts Jun 18 '14 at 22:33
  • use regex, check this http://docs.mongodb.org/manual/reference/operator/query/regex/ – Saheed Hussain Jun 19 '14 at 03:18

1 Answers1

2

As of MongoDB 2.6, $text (in conjunction with $search and $meta) can provide the search term functionality you describe.

Consider the following:

db.collection.ensureIndex({Name:'text'});
db.collection.find({
        $text: { $search: 'Michael Johnson' }
    },{
        score: { $meta: 'textScore' }
}).sort({
        score: {  $meta: 'textScore' }
});

Note that you do not need to ensureIndex all the time, the index will update as necessary. Also, all relevant indices will be used, so if you have multiple text-type indices, these will also be considered.

Per the documentation for $text:

$text performs a text search on the content of the fields indexed with a text index. $search (string) A string of terms that MongoDB parses and uses to query the text index. MongoDB performs a logical OR search of the terms unless specified as a phrase.

If you'd like to sort the result based on relevance (which is what is happening in the example above) use the meta textScore property via $meta (don't forget to duplicate in sort):

The $text operator assigns a score to each document that contains the search term in the indexed fields. The score represents the relevance of a document to a given text search query. The score can be part of a sort() method specification as well as part of the projection expression. The { $meta: "textScore" } expression provides information on the processing of the $text operation.


$text will not work on multiple fields individually. In this case, use $regex:

{ field: { $regex: '\bWORD\b', $options: 'i' } }

How to write regular expressions is out of scope. Do some searching on SO.

To mimic the behavior of $text where all "words" in the subject string are space-delimited "terms," you can create an array of regular expression objects by splitting on ' ' and mapping each term to a $regex object. If this is user-input then it is important to also escape all meta characters that could be considered part of the regular expression. Finally, construct an $or expression featuring all the subjects you want to search for, or alternatively an $and, $not, etc...

Here is a full example implementation with $or (logical OR):

var nameMongoSearch = strToMongoRegexArray('Michael Johnson','Name');
var almaMaterMongoSearch = strToMongoRegexArray('KU','AlmaMater');

// OR matching for both Name and AlmaMater terms
db.collection.find({
    $or: [].concat(nameMongoSearch).concat(almaMaterMongoSearch)
});

/*
 * When str = "Michael Johnson" and key = "Name"
 * convert to something like
 * [
 *   { Name: { $regex: '\\bMichael\\b', $options: 'i' } },
 *   { Name: { $regex: '\\bJohnson\\b', $options: 'i' } }
 * ]
 */
function strToMongoRegexArray(str,key) {
    // 
    return str
    .split(' ') // translate String to Array, split into "terms"
    .filter(Boolean) // filter empty strings (in the case of adjecent spaces)
    .map(function(str){ // translate each term into a mongodb regex
        var o = {};
        o[key] = {
            $regex: '\\b'+escapeRegExp(str)+'\\b', // the '\\b' encapsulation is for word boundaries
            $options: 'i' // the 'i' flag is for case insensitive matching
        };
        return o;
    });
}

/*
 * from https://stackoverflow.com/a/6969486/1481489
 * this will escape regex metacharacters for javascript for user input sanitation
 */
function escapeRegExp(str) {
  return str.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&");
}

If you want to logical AND, this replacement snippet could work:

db.collection.find({
    $and: [
        { $or: nameMongoSearch},
        { $or: almaMaterMongoSearch}
    ]
});

Note: By convention, field names are usually camelcase and begin with a lowercase letter, i.e. the field is "almaMater" and not "Alma Mater" nor "AlmaMater". But to keep aligned with your original question, I am maintaining the cap on the first letter.

Community
  • 1
  • 1
zamnuts
  • 9,492
  • 3
  • 39
  • 46
  • Thanks so much for your quick response! Your answer is based on creating an index, in this case, for Name right? What if there is another field (for example, document with {Name: 'Michael A. Johnson', Alma Mater: KU})? Is there a method to create multiple indices so that you could in one moment search name, and in another moment search alma mater? – techalicious Jun 18 '14 at 23:09