4

I'm using MiniMongo through Meteor, and I'm trying to create a frequency table based off of a dynamic set of queries.

I have two main fields, localHour and localDay. I expect many overlaps, and I'd like to determine where the most overlaps occur. My current method of doing this is so.

if(TempStats.findOne({
            localHour: hours,
            localDay: day
          })){//checks if there is already some entry on the same day/hour

            TempStats.update({//if so, we just increment frequency
              localHour: hours,
              localDay: day
            },{
              $inc: {freq: 1}
            })

          } else {//if nothing exists yet, we put in a new entry

            TempStats.insert({
              localHour: hours,
              localDay: day,
              freq: 1
            });

          }

Essentially, this code runs every time I have new data I want to insert. It works fine at the moment, in that, after all data is inserted, I can sort by frequency to find what set of hours & days occurs the most often (TempStats.find({}, {sort: {freq: -1}}).fetch()).

However, I'm looking more for a way to search by frequency for any key. For instance, searching for the day which everything occurs on the most often as opposed to both the date and hour. With my current way of doing this, I would need to have multiple databases and different methods of inserting for each, which is a bit ridiculous. Is there a Mongo (specifically MiniMongo) solution to do frequency maps based on keys?

Thanks!

mjkaufer
  • 4,047
  • 5
  • 27
  • 55

3 Answers3

3

It looks like miniMongo does not in fact support aggregation, which makes this kind of operation difficult. One way to go about it would be aggregating yourself at the end of each day and inserting that aggregate record into your db (without the hour field or with it set to something like -1). Conversely as wastefully you could also update that record at the time of each insert. This would allow you to use the same collection for both and is fairly common in other dbs.

Also you should consider @nickmilon's first suggestion since the use of an upsert statement with the $inc operator would reduce your example to a single operation per data point.

Shawn
  • 56
  • 4
2
  1. a small note on your code: the part that comes as an else statement is not really required your update will do the complete job if you combine it with the option upsert=true it will insert a new document and $inc will set the freq field to 1 as desired see: here and here
  2. for alternative ways to count your frequencies: assuming you store the date as a datetime object I would suggest to use an aggregation (I am not sure if they added support for aggregation yet in minimongo) but there are solutions then with aggregation you can use datetime operators as $hour, $week, etc for filtering and $count to count the frequencies without you having to keep counts in the database.
nickmilon
  • 1,332
  • 1
  • 10
  • 9
1

This is basically a simple map-reduce problem.

First, don't separate the derived data into 2 fields. This violates DB best practices. If the data comes to you this way, use it to create a Date object. I assume you have a bunch of collections that are being subscribed to and then you aggregate all those into this temporary local collection. This is the mapping of the map-reduce pattern. At this point, since your query in unknown, it's a waste of CPU (even though it's your client) to aggregate. Map first, reduce second. What you should have is a collection full of datetimes. call it TempMapCollection if you wish. Now, use a forEach() and pass in your reduce function (by day, by hour, etc).

You can reduce into another local collection, or into a javascript object. I like using collections, but if the objects are complex, you'll get EJSON errors all up in there. Since your objects are nothing more than a datetime, let's use collections.

so you've got something like:

TempMapCollection.find().forEach(function(doc) {
    var date = doc.dateTime.getDate();
    TempReduceCollection.upsert({timequery: hours}, {$inc: {freq: 1}});
})

Now query your reduce collection. This has the added benefit that you won't have to re-map if you want to do 2 unique queries.

Matt K
  • 4,813
  • 4
  • 22
  • 35