
MongoDB recommends using indexes with a collation (a locale plus strength 1 or 2) for case-insensitive searches.

This seems to work fine when using an equals filter, but not when using a Contains(text) filter.

So, for example, if I have a collection of TDocument I can create an index like this:

// Reuse a single collation so the index and the queries always use the same one
public static class CollationOptions
{
    public static readonly Collation CaseInsensitiveCollation = new Collation("en", strength: CollationStrength.Primary);
}

var indexOptions =
    new CreateIndexOptions
    {
        Collation = CollationOptions.CaseInsensitiveCollation
    };

Expression<Func<TDocument, object>> myIndex = x => x.Name;

var combined =
    new List<IndexKeysDefinition<TDocument>>
    {
        // I could combine several keys or have just one
        Builders<TDocument>.IndexKeys.Ascending(myIndex)
    };

var indexDefinition = Builders<TDocument>.IndexKeys.Combine(combined);
var createIndexModel = new CreateIndexModel<TDocument>(indexDefinition, indexOptions);
var collection = _magicRetrieverService.GetCollection<TDocument>(); // it's a standard IMongoCollection<TDocument>
await collection.Indexes.CreateOneAsync(createIndexModel);

Then, when searching like this:

Expression<Func<TDocument, bool>> filter = x => x.Name == "foo";
var findOptions =
    new FindOptions<TDocument,TDocument>
    {
        AllowPartialResults = false,
        //Sort = sortDefinition, // irrelevant
        Limit = 100,
        Collation = CollationOptions.CaseInsensitiveCollation
    };
var queryResult = await collection.FindAsync(filter, findOptions, cancellationToken);
var results = await queryResult.ToListAsync(cancellationToken: cancellationToken);

Then the query finds all documents that have the name foo, Foo, FOO, or fOO.

However, if I change the filter to use Contains so that it also finds a document named "this is FOO", the case-insensitive search does not work as expected. It only matches documents whose name contains exactly the text foo.

Expression<Func<TDocument, bool>> filter = x => x.Name.Contains("foo");

Why doesn't the collation on the index and in the find options apply here, so that the query returns every document whose name contains foo regardless of case?

I'm using MongoDB.Driver 2.12.4 and mongo:4.0 running as a container:

docker run --name mongodb -e MONGO_INITDB_ROOT_USERNAME=root -e MONGO_INITDB_ROOT_PASSWORD=dummy -p 27017:27017 -d mongo:4.0

Of course, the following filter works fine, but I thought I could handle case-insensitive searches without having to explicitly convert anything.

Expression<Func<TDocument, bool>> filter = x => x.Name.ToLower().Contains("foo".ToLower());

Related questions but not the same:

Manually supplying arguments to a MongoDB query to support collation feature (for case insensitive index)

MongoDB and C#: Case insensitive search

diegosasw
  • Which MQL operator does Contains map to? – D. SM Jun 29 '21 at 17:08
    The LINQ `Contains` in the C# driver gets translated to the `$regex` operator. Unfortunately, the regex operator in MongoDB is not collation-aware, so you can't use `Contains` to achieve your requirement. – Dĵ ΝιΓΞΗΛψΚ Jun 30 '21 at 04:13
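
Following the comment above, one possible workaround is to build the `$regex` filter explicitly and ask for case-insensitive matching with the `i` option, instead of relying on the collation. This is only a minimal sketch: the `Person` class and the `FindByNameContainingAsync` helper are hypothetical names standing in for TDocument and the calling code, and `Regex.Escape` is assumed to be an adequate way to treat the search text literally. A case-insensitive, unanchored regex generally cannot use an index efficiently, so this is worth measuring on a real data set.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Threading;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

// Hypothetical document type standing in for TDocument.
public class Person
{
    public ObjectId Id { get; set; }
    public string Name { get; set; }
}

public static class CaseInsensitiveContains
{
    // Hypothetical helper: returns documents whose Name contains `text`, ignoring case.
    public static async Task<List<Person>> FindByNameContainingAsync(
        IMongoCollection<Person> collection,
        string text,
        CancellationToken cancellationToken = default)
    {
        // Escape the input so it is matched literally, then request the "i" (case-insensitive) option.
        var pattern = new BsonRegularExpression(Regex.Escape(text), "i");
        var filter = Builders<Person>.Filter.Regex(x => x.Name, pattern);

        // No collation is set here: $regex ignores it, so the "i" option does the work.
        var findOptions = new FindOptions<Person, Person> { Limit = 100 };

        var cursor = await collection.FindAsync(filter, findOptions, cancellationToken);
        return await cursor.ToListAsync(cancellationToken);
    }
}
```

Calling `FindByNameContainingAsync(collection, "foo")` would be expected to match names such as "this is FOO" as well as "foo", since the case handling happens inside the regex rather than through the collation.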
