1

First of all I'm sorry if my english is not very good, I hope what I'm going to write will be understandable.

I have a document with this schema:

{
    "fields":[
         {"field": field1, "value": 1},
         {"field": field2, "value": 2},
         {"field": field3, "value": 3}
     ],
     "time": datetimeObj
},
{
    "fields":[
        {"field": field1, "value": 4},
        {"field": field4, "value": 5}
    ],
    "time": datetimeObj
}

What I'm trying to do is to produce a query that returns the list of values of one specific field.

For example if I want the values of the field "field1" I expect something like:

[1, 4]

For field "field2":

[2, 0] // 0 because it doesn't exist in the second document

Now I'm trying to obtain this result using aggregate operator like this:

db.collection.aggregate([
{
  $project: {
     value: {
        $filter: {
           input: "$fields",
           as: "fields",
           cond: { $eq: [ "$$fields.filed", "filed1" ] }
        }
     },
     _id : 0
  }
}

But what I get is much verbose than I'd like to:

{ 
"value" : [
    {
        "value" : NumberInt(1), 
        "filed" : "field1"
    }
]
}
{ 
 "value" : [
    {
        "value" : NumberInt(4), 
        "word" : "field1"
    }
  ]
}

Is there a way to get the result just as an array of values? Also, is it possible to sort these values by the time field?

I'm using the python library, so it would be very usefull to have a python example of the solution. Thank you

s7vr
  • 73,656
  • 11
  • 106
  • 127
F. Aragona
  • 79
  • 7

2 Answers2

0

According to above mentioned description as a solution to it please try executing following aggregate query into MongoDB shell.

db.collection.aggregate(

    // Pipeline
    [
        // Stage 1
        {
            $unwind: {
                path: "$fields"
            }
        },

        // Stage 2
        {
            $group: {
                _id: {
                    fields: '$fields.field',
                    time: '$time'
                },
                value: {
                    $addToSet: '$fields.value'
                },

            }
        },

        // Stage 3
        {
            $match: {
                '_id.fields': 'field1'
            }
        },

        // Stage 4
        {
            $project: {
                "fields": '$_id.fields',
                value: '$value',
                time: '$_id.time',
                _id: 0
            }
        },

        // Stage 5
        {
            $sort: {
                time: 1
            }
        },

    ]



);
Rubin Porwal
  • 3,736
  • 1
  • 23
  • 26
  • It works well! But what if I want to sort the result by the 'time' attribute? Should I add another stage? – F. Aragona Sep 26 '17 at 13:43
  • @F.Aragona I have added sort stage into above aggregate query to sort resultset according to time attribute – Rubin Porwal Sep 26 '17 at 13:51
  • It works but the output is no longer an array of the values, but an array of objects with the projected attributes. Maybe the sorting stage should be performed earlier? – F. Aragona Sep 26 '17 at 14:21
0

You can try the below aggregation query.

$addFields with $cond check if fields array contains ($in) field1 document, if found keep fields array or else create a fields array with single {"field": "field1", "value": 0} document followed by $unwind and $match to keep only field1 document.

$sort by time and $group and $slice to collect values.

 db.collection.aggregate([
  {
    "$addFields": {
      "fields": {
        "$cond": [
          {
            "$in": [
              "field1",
              "$fields.field"
            ]
          },
          "$fields",
          [
            {
              "field": "field1",
              "value": 0
            }
          ]
        ]
      }
    }
  },
  {
    "$unwind": "$fields"
  },
  {
    "$match": {
      "fields.field": "field1"
    }
  },
  {
    "$sort": {
      "time": 1
    }
  },
  {
    "$group": {
      "_id": "null",
      "values": {
        "$push": "$fields.value"
      }
    }
  },
  {
    "$project": {
      "values": {
        "$slice": [
          "$values",
          12
        ]
      }
    }
  }
])
s7vr
  • 73,656
  • 11
  • 106
  • 127
  • I like this solution, but there is an issue to solve. I need to get last 'n' values of a certain field, 'field1' in the example, this is why I need to sort by time desc (just adjusting "time" : -1 ). The problem is that 'field1' could not exist in one of the 'n' documents I'm analysing, like 'field2' in my question, and I need to know that. Maybe I'd to write better my question. Is it possibile to adjust this solution to retrieve e.g. last 12 values (sort by time desc, limit 12) having 0 if the value does not exist? – F. Aragona Sep 26 '17 at 15:04
  • I have adjusted the implementation to create a field 1 and value 0 document when it doesn't exist in `fields` array and added slice at the end to pick 12. Please verify. – s7vr Sep 26 '17 at 15:31
  • It works great! But I've a doubt. Is it performing the execution of the slice only in the end of the pipe? If I have a collection with 1000 documents I would prefer to sort -> slice -> perform other operations just on a subset of my 1000 documents (12 in this case). – F. Aragona Sep 26 '17 at 16:26
  • That is genuine concern but unfortunately this is not possible. See [here](https://jira.mongodb.org/plugins/servlet/mobile#issue/SERVER-9377) and [here](https://stackoverflow.com/questions/24463689/mongodb-aggregation-group-restrict-length-of-array) – s7vr Sep 26 '17 at 16:33