
I have some user documents that contain first names and last names. I'd like to offer the possibility to filter them when a given input "looks like" the full name.

user: {
   firstName: "foo",
   lastName: "name",
   ...
}

Before, I was doing this filtering on the JavaScript side by measuring a "score" of likeness between the input and the full name, but since I implemented a pagination system I cannot do this anymore.

Now I'd like to find a way to do this on the aggregation side.

I've found this question and this one, which seem to do the same thing, but since they do not use aggregation directly it's difficult for me to relate them to my case.

One approach I have been thinking of is to use the $where operator in the aggregation, then $addFields to attach the "likeness score" to each result of the query, and finally filter the results on their score using the aggregatePaginate options.

But it seems I cannot use $where in an aggregation. Any ideas?

Many thanks ! Kev :)

Kevin Heirich

1 Answer


For aggregation, you can use $function to execute custom code. However, that somewhat defeats the purpose of pagination, since Mongo has to execute this code for every document. Additionally, running JavaScript inside Mongo is generally discouraged because it tends to perform worse than the native aggregation operators.
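As a rough sketch of what that could look like (not a definitive implementation): the helper below builds a pipeline that computes a score with $function, then filters, sorts and pages the results. The names `buildLikenessPipeline`, `likenessScore`, `minScore`, `page` and `pageSize` are hypothetical, the scoring body is only a placeholder character-count, and plain $skip/$limit stages stand in for whatever pagination plugin is actually used. It assumes MongoDB 4.4 or later, where $function is available.

```javascript
// Placeholder scoring logic, executed by MongoDB for every document.
// Written with ES5 syntax and passed as a string so it survives driver serialization.
const scoreBody = function (firstName, lastName, input) {
  var fullName = ((firstName || "") + " " + (lastName || "")).toLowerCase();
  var needle = (input || "").toLowerCase();
  var score = 0;
  for (var i = 0; i < needle.length; i++) {
    if (fullName.indexOf(needle[i]) !== -1) score++;
  }
  return score;
}.toString();

// Hypothetical helper: returns an aggregation pipeline for the given search input.
function buildLikenessPipeline(input, { minScore = 1, page = 1, pageSize = 20 } = {}) {
  return [
    {
      $addFields: {
        likenessScore: {
          $function: {
            body: scoreBody,
            // $literal keeps an input starting with "$" from being read as a field path.
            args: ["$firstName", "$lastName", { $literal: input }],
            lang: "js",
          },
        },
      },
    },
    { $match: { likenessScore: { $gte: minScore } } },
    // Secondary sort on _id keeps page boundaries stable between requests.
    { $sort: { likenessScore: -1, _id: 1 } },
    { $skip: (page - 1) * pageSize },
    { $limit: pageSize },
  ];
}
```

The resulting array can be passed to `collection.aggregate(...)`. Note that the $skip and $limit stages only reduce what is returned: the scoring stage still runs over every document that reaches it.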

It seems like you're just using the wrong database for the job. If this is a crucial feature, I recommend trying a text search engine like Atlas Search or Elasticsearch.

If changing the database is not an option and this feature is too important to remove or alter, it would be easier to give a full answer if you provided more detail on how the scoring works.

Tom Slabbaert
  • Hey, yes, in our case MongoDB is the wrong database, but sadly I've been asking to change this technology for months and management won't allow it for now, so I have to make do with it. The scoring is actually very simple at the moment: we just increment a counter each time a character from the input is found in the full name. This is not a fixed approach, though; we may change it if there is a better solution. Maybe the `$regex` operator could help, but I'm not sure how, because as I understand it, it needs an exact pattern ("ohn" will find "john" but not "jonn") – Kevin Heirich Dec 07 '21 at 12:03
  • Maybe this approach would be okay if it doesn't cancel out the performance gain from the pagination system, and I think it wouldn't – Kevin Heirich Dec 07 '21 at 12:04
  • So basically `Ella` and `Allen` will have a very close score because they share 4 letters? order does not matter? – Tom Slabbaert Dec 07 '21 at 12:06
  • Yes, exactly. It's very simple and not very well made, but for now it does the job ^^ – Kevin Heirich Dec 07 '21 at 12:10
  • But again, I would be open to other approaches that allow us to keep the pagination performance gain – Kevin Heirich Dec 07 '21 at 12:11
  • I guess under all these constraints executing JS is the best way to go. However, without changing the database, this means you need to execute it on the entire collection every single time. If the size and scale of your collections can handle it, that's fine. – Tom Slabbaert Dec 07 '21 at 12:20
  • Well then, I'll get back to the team with this approach to see if we really want to implement that feature :) Many thanks for your help! – Kevin Heirich Dec 07 '21 at 12:29
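
For reference, the character-count scoring discussed in the comments can be sketched as follows. This is one reading of "increment a counter each time a character from the input is found in the full name"; the function name `likenessScore` and the case-insensitive comparison are assumptions, not the poster's actual code.

```javascript
// Counts how many characters of `input` occur anywhere in `fullName`.
// Order is ignored, and every input character is checked independently.
function likenessScore(input, fullName) {
  const haystack = fullName.toLowerCase();
  let score = 0;
  for (const ch of input.toLowerCase()) {
    if (haystack.includes(ch)) score++;
  }
  return score;
}

console.log(likenessScore("Ella", "Ella"));  // 4
console.log(likenessScore("Ella", "Allen")); // 4, same score because order is ignored
console.log(likenessScore("jonn", "john"));  // 4, a typo still matches
console.log(likenessScore("ohn", "jonn"));   // 2, "h" is missing
```

Because the score ignores order, unrelated names that share letters rank as high as exact matches, which is the weakness pointed out above with `Ella` and `Allen`.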