1

I have performed the following query:

db.Indiv2.aggregate(
{$unwind: '$conso'},
{$group: {_id: {conso:'$conso.nom_commercial', region:"$region"}, sum: 
{$sum: 1}}},
{$sort : {sum : -1}},
{$group: {
_id: "$_id.region",
"conso": {
    $first: "$_id.conso"
},
"sum": {
    $first: "$sum"
},
}},
{$sort : {_id : 1}}
);

which returns the highest consumed food by region in the following format:

    {
        "_id" : {
            "conso" : "x",
            "region" : 1
        },
        "sum" : 73226.0
    },
    {
        "_id" : {
            "conso" : "x",
            "region" : 8
        },
        "sum" : 25683.0
    },
    {
        "_id" : {
            "conso" : "grandlait demi �cr�m� uht",
            "region" : 1
        },
        "sum" : 251.0
    }

However, a lot of foods do not have names. Such items are named "x" (example above). I would like to filter my query to exclude such items. I am looking for an equivalent of a filter which can either filter strings which length are below 2 characters or filter the string "x".

micoco
  • 279
  • 1
  • 4
  • 16
  • Can you use the [edit] link to show some sample documents from `Indiv2` collection and your expected output? – chridam Nov 13 '17 at 17:47

2 Answers2

2

There is a $redact operator which might be helpful here:

Try:

db.collection.aggregate([
    //previous aggregations
    {
        "$redact": {
            "$cond": [
                { "$gt": [ { "$strLenCP": "$_id.conso" }, 2] },
                "$$KEEP",
                "$$PRUNE"
            ]
        }
    }
])
mickl
  • 48,568
  • 9
  • 60
  • 89
0

You could just capture the conso length and filter. Given the or condition described above, we end up with this:

// previous pipeline here, then:
{$addFields: {slen: {$strLenBytes: "$_id.conso"}}}
,{$match: {$or: [ {"_id.conso": {$ne: "x"}}, {slen: {$gt: 1}} ] }}
Buzz Moschetti
  • 7,057
  • 3
  • 23
  • 33