
Can anyone tell me how to add a $match stage to an aggregation pipeline that keeps entries where a field MATCHES a query (and may have other data in it too), rather than limiting results to entries where the field EQUALS the query?

The query specification...

var query = {hello:"world"};

...can be used to retrieve the following documents using the find() operation of MongoDB's native Node.js driver, where the query 'map' is interpreted as a match...

{hello:"world"}
{hello:"world", extra:"data"}

...like...

collection.find(query);

The same query map can also be interpreted as a match when used with $elemMatch to retrieve documents with matching entries contained in arrays like these documents...

{
  greetings:[
    {hello:"world"},
  ]
}

{
  greetings:[
    {hello:"world", extra:"data"},
  ]
}

{
  greetings:[
    {hello:"world"},
    {aloha:"mars"},
  ]
}

...using an invocation like [PIPELINE1] ...

collection.aggregate([
  {$match:{greetings:{$elemMatch:query}}},
]).toArray()

However, trying to get a list of the matching greetings with unwind [PIPELINE2] ...

collection.aggregate([
  {$match:{greetings:{$elemMatch:query}}},
  {$unwind:"$greetings"},
]).toArray()

...produces all the array entries inside the documents with any matching entries, including the entries which don't match (simplified result)...

[
  {greetings:{hello:"world"}},
  {greetings:{hello:"world", extra:"data"}},
  {greetings:{hello:"world"}},
  {greetings:{aloha:"mars"}},
]

I have been trying to add a second match stage, but I was surprised to find that it limited results only to those where the greetings field EQUALS the query, rather than where it MATCHES the query [PIPELINE3].

collection.aggregate([
  {$match:{greetings:{$elemMatch:query}}},
  {$unwind:"$greetings"},
  {$match:{greetings:query}},
]).toArray()

Unfortunately PIPELINE3 produces only the following entries, excluding the matching hello world entry with the extra:"data", since that entry is not strictly 'equal' to the query (simplified result)...

[
  {greetings:{hello:"world"}},
  {greetings:{hello:"world"}},
]

...where what I need as the result is rather...

[
  {greetings:{hello:"world"}},
  {greetings:{hello:"world"}},
  {greetings:{hello:"world", extra:"data"}},
]

How can I add a second $match stage to PIPELINE2 that keeps entries where the greetings field MATCHES the query (and may have other data in it too), rather than limiting results to entries where the greetings field EQUALS the query?

Neil Lunn
cefn

1 Answer


The results you're seeing are correct: in PIPELINE3 the final $match compares the whole greetings subdocument for equality with the query. To get the results you're expecting, match on the field path with dot notation instead:

collection.aggregate([
  {$match:{greetings:{$elemMatch:query}}},
  {$unwind:"$greetings"},
  {$match:{"greetings.hello":"world"}},
]).toArray()

With this, you should get the following output:

[
  {greetings:{hello:"world"}},
  {greetings:{hello:"world"}},
  {greetings:{hello:"world", extra:"data"}},
]

When you build an aggregation pipeline in MongoDB, it helps to start with just the first stage and check its output, then add the remaining stages one at a time, checking the output after each.
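One way to follow that advice is to run every leading slice of the pipeline and look at each result in turn. The sketch below is only an illustration: pipelinePrefixes is a hypothetical helper name, not part of the MongoDB driver, and the commented-out loop assumes a connected collection like the one in the question.

```javascript
// Hypothetical helper: returns every leading slice of a pipeline,
// so each slice can be run and inspected on its own.
function pipelinePrefixes(stages) {
  return stages.map((_, i) => stages.slice(0, i + 1));
}

const query = {hello: "world"};
const stages = [
  {$match: {greetings: {$elemMatch: query}}},
  {$unwind: "$greetings"},
  {$match: {"greetings.hello": "world"}},
];

// prefixes[0] holds only the first stage, prefixes[1] the first two,
// and prefixes[2] the whole pipeline.
const prefixes = pipelinePrefixes(stages);

// With a connected driver collection (assumed, not created here),
// each slice could be inspected like this:
// for (const p of prefixes) {
//   console.log(await collection.aggregate(p).toArray());
// }
```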

The output of your $unwind stage would be:

[{
  greetings:{hello:"world"}
},
{
  greetings:{hello:"world", extra:"data"}
},
{
  greetings:{hello:"world"}
},
{
  greetings:{aloha:"mars"}
}]

If we now add the third stage that you used, it matches only documents whose greetings value is exactly {hello:"world"}. Only two documents in the pipeline have that exact value, so you would get only:

{ "greetings" : { "hello" : "world" } }
{ "greetings" : { "hello" : "world" } }
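The difference between the two forms of $match can be imitated in plain JavaScript on the unwound documents. This is only a rough sketch of the two behaviours, not MongoDB's actual matcher; equalsQuery and matchesField are names made up for the illustration.

```javascript
// Unwound documents from the example above.
const unwound = [
  {greetings: {hello: "world"}},
  {greetings: {hello: "world", extra: "data"}},
  {greetings: {hello: "world"}},
  {greetings: {aloha: "mars"}},
];

// Imitates {$match: {greetings: {hello: "world"}}}: the whole
// subdocument must be identical to the query value. Comparing JSON
// strings is enough for these flat, string-valued samples.
const equalsQuery = (doc) =>
  JSON.stringify(doc.greetings) === JSON.stringify({hello: "world"});

// Imitates {$match: {"greetings.hello": "world"}}: only the named
// field is compared, so any extra fields are ignored.
const matchesField = (doc) => doc.greetings.hello === "world";

const byEquality = unwound.filter(equalsQuery); // drops the extra:"data" entry
const byField = unwound.filter(matchesField);   // keeps it
```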
SiddAjmera
  • You've confirmed what I feared. As stated, I was aware that the final stage was being interpreted as a test of equality. I know I can manually unpack all the fields into complete paths and values as equality tests, but that's exactly what I didn't want to do, as my 'query' can be anything. The problem is that although the last matching stage is identical in filtering logic to the first stage, MongoDB demands two different query syntaxes. A workaround: create the routine to unpack a match query into path equalities, which MongoDB should have provided in the first place. – cefn Apr 11 '16 at 09:56
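The workaround described in that comment might look roughly like the sketch below. prefixQuery is a hypothetical helper, not something the MongoDB driver provides. It assumes the array elements are subdocuments and that the query uses only field names plus $and, $or, and $nor at the top level; anything else is rejected rather than guessed at.

```javascript
// Hypothetical helper (not part of the MongoDB driver): rewrites a
// query meant for array elements so it targets a subdocument at `path`.
function prefixQuery(path, query) {
  const out = {};
  for (const [key, value] of Object.entries(query)) {
    if (key === "$and" || key === "$or" || key === "$nor") {
      // Logical operators hold a list of sub-queries; rewrite each one.
      out[key] = value.map((sub) => prefixQuery(path, sub));
    } else if (key.startsWith("$")) {
      // Other top-level operators ($expr, $where, ...) are not handled.
      throw new Error("Unsupported top-level operator: " + key);
    } else {
      // Plain field name: prefix it and leave the condition untouched.
      out[path + "." + key] = value;
    }
  }
  return out;
}

const query = {hello: "world"};
const pipeline = [
  {$match: {greetings: {$elemMatch: query}}},
  {$unwind: "$greetings"},
  // Becomes {$match: {"greetings.hello": "world"}}
  {$match: prefixQuery("greetings", query)},
];
```

The same query object then drives both $match stages, so the second stage does not have to be written by hand for each new query.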