2

i have a mongo sharded cluster where i save data from a virtual machines monitoring system (zabbix ecc). Now I want to get some information from the db, for example the avg memfree in the last 2 days of one vm. I read the tutorials about aggregation and also the comparison with sql, but i don't understand how to query time series documents (written like mongo suggests on webinars).

Example: i have a collection with many of these docs (one doc represents 1 hour):

"_id" : ObjectId("5558ab960e8956234285de14"),
    "timestamp_hour" : ISODate("2014-10-13T23:00:00.000+02:00"),
    "name" : "memfree",
    "unity" : "B",
    "values" : {
        "0" : {
            "0" : 2041004032.0000000000000000,
            "1" : 2041004032.0000000000000000,
            "2" : 2041004032.0000000000000000,
            "3" : 2041004032.0000000000000000,
            "4" : 2041004032.0000000000000000,
            "5" : 2041004032.0000000000000000,
            "6" : 2041004032.0000000000000000,
            "7" : 2041004032.0000000000000000,
            "8" : 2041004032.0000000000000000,
            "9" : 2041004032.0000000000000000,
            "10" : 2041004032.0000000000000000,
            "11" : 2041004032.0000000000000000,
            "12" : 2041004032.0000000000000000,
            "13" : 2041004032.0000000000000000,
            "14" : 2041004032.0000000000000000,
            "15" : 2041004032.0000000000000000,
            "16" : 2041004032.0000000000000000,
            "17" : 2041004032.0000000000000000,
            "18" : 2041004032.0000000000000000,
            "19" : 2041004032.0000000000000000,
            "20" : 2041004032.0000000000000000,
            "21" : 2041004032.0000000000000000,
            "22" : 2041004032.0000000000000000,
            "23" : 2041004032.0000000000000000,
            "24" : 2041004032.0000000000000000,
            "25" : 2041004032.0000000000000000,
            "26" : 2041004032.0000000000000000,
            "27" : 2041004032.0000000000000000,
            "28" : 2041004032.0000000000000000,
            "29" : 2041004032.0000000000000000,
            "30" : 2041004032.0000000000000000,
            "31" : 2041004032.0000000000000000,
            "32" : 2041004032.0000000000000000,
            "33" : 2041004032.0000000000000000,
            "34" : 2041004032.0000000000000000,
            "35" : 2041004032.0000000000000000,
            "36" : 2041004032.0000000000000000,
            "37" : 2041004032.0000000000000000,
            "38" : 2041004032.0000000000000000,
            "39" : 2041004032.0000000000000000,
            "40" : 2041004032.0000000000000000,
            "41" : 2041004032.0000000000000000,
            "42" : 2041004032.0000000000000000,
            "43" : 2041004032.0000000000000000,
            "44" : 2041004032.0000000000000000,
            "45" : 2041004032.0000000000000000,
            "46" : 2041004032.0000000000000000,
            "47" : 2041004032.0000000000000000,
            "48" : 2041004032.0000000000000000,
            "49" : 2041004032.0000000000000000,
            "50" : 2041004032.0000000000000000,
            "51" : 2041004032.0000000000000000,
            "52" : 2041004032.0000000000000000,
            "53" : 2041004032.0000000000000000,
            "54" : 2041004032.0000000000000000,
            "55" : 2041004032.0000000000000000,
            "56" : 2041004032.0000000000000000,
            "57" : 2041004032.0000000000000000,
            "58" : 2041004032.0000000000000000,
            "59" : 2041004032.0000000000000000
        },
        "1" : {
            "0" : 2041004032.0000000000000000,
            "1" : 2041004032.0000000000000000,
            "2" : 2041004032.0000000000000000,
            "3" : 2041004032.0000000000000000,
            "4" : 2041004032.0000000000000000,
            "5" : 2041004032.0000000000000000,
            "6" : 2041004032.0000000000000000,
            "7" : 2041004032.0000000000000000,
            "8" : 2041004032.0000000000000000,
            "9" : 2041004032.0000000000000000,
            "10" : 2041004032.0000000000000000,
            "11" : 2041004032.0000000000000000,
            "12" : 2041004032.0000000000000000,
            "13" : 2041004032.0000000000000000,
            "14" : 2041004032.0000000000000000,
            "15" : 2041004032.0000000000000000,
            "16" : 2041004032.0000000000000000,
            "17" : 2041004032.0000000000000000,
            "18" : 2041004032.0000000000000000,
            "19" : 2041004032.0000000000000000,
            "20" : 2041004032.0000000000000000,
            "21" : 2041004032.0000000000000000,
            "22" : 2041004032.0000000000000000,
            "23" : 2041004032.0000000000000000,
            "24" : 2041004032.0000000000000000,
            "25" : 2041004032.0000000000000000,
            "26" : 2041004032.0000000000000000,
            "27" : 2041004032.0000000000000000,
            "28" : 2041004032.0000000000000000,
            "29" : 2041004032.0000000000000000,
            "30" : 2041004032.0000000000000000,
            "31" : 2041004032.0000000000000000,
            "32" : 2041004032.0000000000000000,
            "33" : 2041004032.0000000000000000,
            "34" : 2041004032.0000000000000000,
            "35" : 2041004032.0000000000000000,
            "36" : 2041004032.0000000000000000,
            "37" : 2041004032.0000000000000000,
            "38" : 2041004032.0000000000000000,
            "39" : 2041004032.0000000000000000,
            "40" : 2041004032.0000000000000000,
            "41" : 2041004032.0000000000000000,
            "42" : 2041004032.0000000000000000,
            "43" : 2041004032.0000000000000000,
            "44" : 2041004032.0000000000000000,
            "45" : 2041004032.0000000000000000,
            "46" : 2041004032.0000000000000000,
            "47" : 2041004032.0000000000000000,
            "48" : 2041004032.0000000000000000,
            "49" : 2041004032.0000000000000000,
            "50" : 2041004032.0000000000000000,
            "51" : 2041004032.0000000000000000,
            "52" : 2041004032.0000000000000000,
            "53" : 2041004032.0000000000000000,
            "54" : 2041004032.0000000000000000,
            "55" : 2041004032.0000000000000000,
            "56" : 2041004032.0000000000000000,
            "57" : 2041004032.0000000000000000,
            "58" : 2041004032.0000000000000000,
            "59" : 2041004032.0000000000000000
....

i want to know the avg memfree from '2014-10-13T23:00:00.000' to '2014-10-15T23:00:00.000'. so i need to sum all of values from 13 to 15 (3600*24*2 values). I think that the query will be something like this, but i don't know how to explain the avg command.....

db.metrics.aggregate( [
   { $match: { name: 'memfree' ,timestamp_hour:{$gte: ISODate("2014-10-13T23:00:00.000+02:00")},timestamp_hour:{$lte: ISODate("2014-10-15T23:00:00.000+02:00")} }  },
   {
     $group: {
        _id: "$name",
        avg: { $avg: "how can get all the values??" }
     }
   }
] )

any advice?

thanks

EDIT: correct answer (works for one to many metrics) is:

map = function() {
  for (var min in this.values)
    for (sec in this.values[min]){
      data = {value: {}, count: {}}
      data.value[this.name] = this.values[min][sec]
      data.count[this.name] = 1
      emit(this.name, data);
    }
}

reduce = function(key, values) {
  var sum = values.reduce(function(a, b) {
    out = {value: {}, count: {},avg:0}
    for (k in b.value){
        incount = a.count[k] || 0
        invalue = a.value[k] || 0
        out.value[k] = b.value[k]+invalue
        out.count[k] = b.count[k]+incount
    }
    out.avg = out.value[k]/out.count[k]
    return out
  });
  return sum;
}


printjson(db.node0208_26608.mapReduce(map, reduce,
               {
                 query: { name: {$in:['ioutil','memtotal','memfree']} ,
                          timestamp_hour:{$gte: ISODate("2014-09-22T00:00:00.000+02:00")},
                          timestamp_hour:{$lte: ISODate("2014-09-28T23:00:00.000+02:00")} 
                         },
                 //to write directly on a collection
                //out:{merge: "map_reduce_out"},
             out: {inline:1},
           verbose:true
               })
)

produces this result:

{
        "results" : [
                {
                        "_id" : "ioutil",
                        "value" : {
                                "value" : {
                                        "ioutil" : 2495762.106280909
                                },
                                "count" : {
                                        "ioutil" : 601200
                                },
                                "avg" : 4.15130090865088
                        }
                },
                {
                        "_id" : "memfree",
                        "value" : {
                                "value" : {
                                        "memfree" : 28500447903744
                                },
                                "count" : {
                                        "memfree" : 601200
                                },
                                "avg" : 47405934.636966065
                        }
                },
                {
                        "_id" : "memtotal",
                        "value" : {
                                "value" : {
                                        "memtotal" : 635834327040000
                                },
                                "count" : {
                                        "memtotal" : 594000
                                },
                                "avg" : 1070428160
                        }
                }
        ],
        "counts" : {
                "input" : NumberLong(499),
                "emit" : NumberLong(1796400),
                "reduce" : NumberLong(11),
                "output" : NumberLong(3)
        },
        "timeMillis" : 37956,
        "timing" : {
                "shardProcessing" : 37948,
                "postProcessing" : 8
        },
        "shardCounts" : {
                "192.168.0.19:27017" : {
                        "input" : 165,
                        "emit" : 594000,
                        "reduce" : 4,
                        "output" : 1
                },
                "192.168.0.20:27017" : {
                        "input" : 334,
                        "emit" : 1202400,
                        "reduce" : 7,
                        "output" : 2
                }
        },
        "postProcessCounts" : {
                "192.168.0.21:27017" : {
                        "input" : NumberLong(3),
                        "reduce" : NumberLong(0),
                        "output" : NumberLong(3)
                }
        },
        "ok" : 1
}
SUPERALEX
  • 51
  • 9

1 Answers1

1

This will be hard to achieve using the aggregation framework. But it "works" well with MapReduce. Something along the lines of that (untested):

// collect *individual* values
map = function() {
  for (var min in this.values)
    for (sec in this.values[min])
      data = {value: {}, count: {}}
      data.value[this.name] = this.values[min][sec]
      data.count[this.name] = 1
      emit(null, data);
}

// sum values and count
reduce = function(key, values) {
  var sum = values.reduce(function(a, b) {
    out = {value: {}, count: {}}
    for (k in b.value)
        incount = a.count[k] || 0
        invalue = a.value[k] || 0
        out.value[k] = b.value[k]+invalue
        out.count[k] = b.count[k]+incount

    return out
  });
  return sum;
}
Sylvain Leroux
  • 50,096
  • 7
  • 103
  • 125
  • ok...if I want the avg for 3 metrics (memfree,cputil,diskfree) I have to repeat the same routine 3 times with different query or it is possible to put OR statements in query and group the results at the end?maybe with emit(this.name, this.values[min][sec]); ? – SUPERALEX May 22 '15 at 09:52
  • @SUPERALEX I don't have time to write the details, but I've edited my answer accordingly – Sylvain Leroux May 22 '15 at 10:32
  • thanks first solution works; i slighty corrected the second. Only thing i do'nt understand is that if i run the two solutions with same metric it gives slighty different avg, but only if metric has not integer values...i think that there is some rounding problem but i don't care so much – SUPERALEX Jun 02 '15 at 13:51
  • i did some checks to find out why avgs was different, and it turns out that the first solution do the wrong avg...in reduce function i tried to write only sum and only values.length. sum is right (same of second solution), while values.length is much less than the total of map emits (which is the total count)...i don't know why, maybe the values array is splitted and reduce function is executed more times...in fact mongo db doc says: "Because it is possible to invoke the reduce function more than once for the same key....." – SUPERALEX Jun 02 '15 at 15:35
  • @SUPERALEX The first solution was simply wrong as it did not satisfy [the requirements for the reduce function](http://docs.mongodb.org/manual/reference/command/mapReduce/#mapreduce-reduce-cmd). Specifically, it wasn't associative. The second one (now the only one) is correct (AFAICT). – Sylvain Leroux Jun 02 '15 at 19:44
  • good to know...the second works if you emit also the name as the key in the mapper..with null value the a.count and a.value are resetted every time the key changes since the reduce is called only once – SUPERALEX Jun 02 '15 at 20:42