1

Here is my attempt at performing the aggregation (day-wise) based on timestamp if all the elements are hardcoded inside the query.

pipe = [ 
{
     "$match": { 
        "cid": ObjectId("57fe39972b8dbc1387b20913")
        }
    },
{
    "$project":
    {

        "animal_dog": "$animal.dog",
        "animal_dog_tail": "$animal.dog.tail",
        "animal_cat": "$animal.cat",
        "tree": "$fruits",
        "day": {"$substr": ["$timestamp",  0, 10]} 
        }},
{ 
"$group":
    {
    "_id" : "$day",
    "animal_dog" : {"$sum": "$animal_dog"},
    "animal_dog_tail": {"$sum": "$animal_dog_tail"}, 
    "animal_cat": {"$sum": "$animal_cat"}, 
    "tree": {"$sum": "$tree"}, 
    "fruits": {"$sum": "$fruits"},

}} ]

output = dailycollection.aggregate(pipeline = pipe)

Assuming that I have a mongo-collection having the exact same nested structure, how do I pass a python_list with the respective elements for aggregating based on timestamp? Let's say my Python list has elements like this:

key_list = animal.dog, animal.dog.tail, animal.cat, tree, fruits, timestamp.

I would like to pass this list into the query I just wrote above without hardcoding each of the elements. I would like to perform projection, $sum, $group for the elements without hardcoding them as I did in the aforementioned query. I would like to simply iterate through the python list during $project and $group stage.

Is that possible?

And also how do I ensure that the output query also preserves the same nested-format without losing depth?

elevenmac
  • 23
  • 4

1 Answers1

0

You could try the following:

key_list =  ["animal.dog", "animal.dog.tail", "animal.cat", "tree", "fruits", "timestamp"]
match = { "$match": { "cid": ObjectId("57fe39972b8dbc1387b20913") } }
project = { "$project": {} }
group = { "$group": {} }

for item in key_list:
    if item == "timestamp":
        project["$project"]["day"] = { "$substr": ["$"+item,  0, 10] }
        group["$group"]["_id"] = "$day"
        break
    sum = {"$sum": ""}
    sum["$sum"] = "$"+item.replace(".", "_")
    project["$project"][item.replace(".", "_")] = "$"+item
    group["$group"][item.replace(".", "_")] = sum

pipeline = [match, project, group]
chridam
  • 100,957
  • 23
  • 236
  • 235
  • Thanks @chridam. I ran this code and I'm getting this error `pymongo.errors.OperationFailure: exception: invalid operator '$sum'` - I'm not in a position to upgrade Mongo-server, could you please help? – elevenmac Jan 09 '18 at 14:38
  • My mongo-shell version is 3.0.12. – elevenmac Jan 09 '18 at 14:44
  • When you `print(pipeline)` what do you get? – chridam Jan 09 '18 at 15:14
  • Thanks mate, it was my bad, had an extra _id field that caused the error. – elevenmac Jan 09 '18 at 17:58
  • Also regarding the output of the query, how can I ensure that the output is in nested format? Since we have dot notation, how can we automatically add nested curly braces for the resultant output query? – elevenmac Jan 09 '18 at 17:59
  • Using the same list, how do I filter the same query by applying a date-range filter? Say, I'd like to aggregate only those documents having timestamps between X date and Y date? @chridam – elevenmac Jan 09 '18 at 18:46
  • @elevenmac https://stackoverflow.com/questions/2943222/find-objects-between-two-dates-mongodb – chridam Jan 09 '18 at 20:47
  • Thanks, and can you please help me with the nested query output? I would like them to be nested based on the dot notation! @chridam – elevenmac Jan 10 '18 at 05:35
  • Can you please post this in a new question instead? https://meta.stackexchange.com/questions/43478/ – chridam Jan 10 '18 at 13:06