5

I have a MongoDb collection that holds data of thousands of tenants which are separated by a field holding the tenant ID.
I want to perform a query that includes full-text search & faceting on a specific tenant (see below).
Since the $searchBeta must be the first phase in the aggregation pipeline, it means it will return a huge list of documents from all tenants just to filter them out on $match phase instead of using an index.
Is there a more efficient way to perform such a query?
Running it with $searchBeta takes 14 seconds while removing it reduces query time to 6ms (while faceting works on more documents since it includes all tenant's data).
* Using Doug's answer bellow reduce it to ~70-120ms.
* Time measurement was done on Atlas free tier so it doesn't mean too much

db.tenantData.aggregate( [
{
    $searchBeta: {
        "search": {
        "query": "test",
        "path": ["name","comment"],
      }
    }
},
{
    $match: {tenant:"tenant1"},
},
{
    $facet: {
        "AssetTypeFacet": [
          { $sortByCount: "$assetType" }
        ],
        "manufacturerFacet": [
          { $sortByCount: "$manufacturer" }
        ]
    }
}
])    
Avner Levy
  • 6,601
  • 9
  • 53
  • 92

1 Answers1

2

You could combine the query into a compound like so:

db.tenantData.aggregate([
  {
    $searchBeta: {
      compound: {
        should: {
          search: { query: "test", path: ["name", "comment"] },
        },
        filter: {
          search: { query: "tenant1", path: "tenant" },
        },
      },
    },
  },
  {
    $facet: {
      AssetTypeFacet: [{ $sortByCount: "$assetType" }],
      manufacturerFacet: [{ $sortByCount: "$manufacturer" }],
    },
  },
]);

This will eliminate the IO associated with $match. There is currently no way to do faceting directly in $searchBeta other than the way you've done it.

Radosław Miernik
  • 4,004
  • 8
  • 33
  • 36
Doug
  • 14,387
  • 17
  • 74
  • 104