Performing a diacritics-insensitive $regex search in MongoDB

Question

As the title says, I am trying to figure out how to perform a diacritics-insensitive $regex search in MongoDB, although at this point I am not sure whether that is even possible.

Basically, imagine we have a Teams collection with documents like these:

{ id: 1, name: "FC Bayern München" },
{ id: 2, name: "Atlético Madrid" }

For this collection, I have created a text index for the name field:

db.getCollection('teams').createIndex({name: 'text'});

This allows me to perform a diacritics- and case-insensitive search:

db.getCollection('teams').find({ $text: { $search: "bayern" }});
db.getCollection('teams').find({ $text: { $search: "munchen" }});
// ✅ { id: 1, name: "FC Bayern München" }

However, if the search term is not a full word (for example "bayer" instead of "Bayern", or "munc" instead of "Munchen"), the query produces no results:

db.getCollection('teams').find({ $text: { $search: "bayer" }});
db.getCollection('teams').find({ $text: { $search: "munc" }});
// ❌ (no results)

To make this work as intended, I need to use a $regex search instead. However, I can't seem to find a way to make it ignore diacritics:

db.getCollection('teams').find({ name: { $regex: "baye", $options: 'i' }});
// ✅ { id: 1, name: "FC Bayern München" }

db.getCollection('teams').find({ name: { $regex: "munchen", $options: 'i' }});
// ❌ (no results)

So my question is: is there any way to achieve a universal search that is diacritics-insensitive and does not require matching whole words, via regular expressions or other means?

@turivishal I know, that's the entire point. I want to be able to perform a search that would match both `u` and `ü` universally (diacritics). — decho, Feb 13 '22 at 08:44
@AlexTotolici I haven't really looked much into it since, but I have been loosely following MongoDB updates and I haven't heard or read anything about it. — decho, Dec 08 '22 at 08:15
There is a workaround in this question: https://stackoverflow.com/questions/36647244/mongodb-how-to-find-documents-ignoring-case-sensitive-accents-and-percent-like. It is needed because MongoDB's $regex doesn't support the u option: https://www.mongodb.com/docs/manual/reference/operator/query/regex/ — boly38, Jan 11 '23 at 12:30
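
For illustration, one common shape of such a workaround is to expand each letter of the search term into a character class listing its accented variants, and pass the result to $regex. The sketch below is only that, a sketch: the helper names (`escapeRegex`, `diacriticInsensitivePattern`) and the `DIACRITIC_VARIANTS` table are hypothetical, the table is deliberately incomplete, and it assumes an environment where `String.prototype.normalize` is available (for example mongosh or the Node.js driver). It is not taken verbatim from the linked answer.

```javascript
// Hypothetical lookup table: base letter -> lowercase variants that should also match.
// Illustrative and incomplete; extend it for the languages you need.
const DIACRITIC_VARIANTS = {
  a: "aàáâãäå",
  c: "cç",
  e: "eèéêë",
  i: "iìíîï",
  n: "nñ",
  o: "oòóôõöø",
  u: "uùúûü",
  y: "yýÿ",
};

// Escape characters that have a special meaning in a regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a regex source string that matches `term` regardless of diacritics.
function diacriticInsensitivePattern(term) {
  // Strip combining marks from the input so "München" and "munchen"
  // produce the same pattern.
  const base = term
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .toLowerCase();

  return Array.from(base)
    .map((ch) => {
      const variants = DIACRITIC_VARIANTS[ch];
      // List both cases explicitly so the match does not depend on how the
      // regex engine case-folds non-ASCII characters.
      return variants
        ? `[${variants}${variants.toUpperCase()}]`
        : escapeRegex(ch);
    })
    .join("");
}

// Possible usage against the teams collection from the question:
// db.getCollection('teams').find({
//   name: { $regex: diacriticInsensitivePattern("munc"), $options: 'i' }
// });
```

Note that an unanchored, case-insensitive $regex like this generally cannot use an index efficiently, so it may be slow on large collections.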

0 Answers