1

How can I group by ID and Month in MongoDB?

My data looks like this:

{ 
    "_id" : ObjectId("597225c62e7cbfc9a0b099f8"), 
    "LogId" : NumberInt(17351963), 
    "EntryId" : NumberInt(22), 
    "Date" : "2013-08-11 00:00:00", 
    "LogTypeId" : NumberInt(6), 
    "Count" : NumberInt(1), 
    "EntryType" : NumberInt(1)
}
{ 
    "_id" : ObjectId("597225c62e7cbfc9a0b099f9"), 
    "LogId" : NumberInt(17352356), 
    "EntryId" : NumberInt(23), 
    "Date" : "2013-08-11 00:00:00", 
    "LogTypeId" : NumberInt(6), 
    "Count" : NumberInt(2), 
    "EntryType" : NumberInt(1)
}
{ 
    "_id" : ObjectId("597225c62e7cbfc9a0b099fa"), 
    "LogId" : NumberInt(17360483), 
    "EntryId" : NumberInt(28), 
    "Date" : "2013-08-11 00:00:00", 
    "LogTypeId" : NumberInt(6), 
    "Count" : NumberInt(1), 
    "EntryType" : NumberInt(1)
}

My simplified aggregation query runs without errors, but it doesn't group:

db.log.aggregate([
    {"$group":{"_id":"$EntryId", "Count":{"$sum":"$Count"}}},
    {"$sort": {"EntryId": 1}}
])

Ultimately, I want to group by EntryID and the month of the date column.

Neil Lunn
  • 148,042
  • 36
  • 346
  • 317
NickG
  • 9,315
  • 16
  • 75
  • 115
  • Well you could, but since your dates are actually "strings" that may become a little difficult. Not impossible but difficult. You should cast the strings as BSON `Date`, since it takes a lot less space to store and is far more useful that way. – Neil Lunn Jul 24 '17 at 09:01
  • Only started MongoDB today - so I hadn't realised that. I imported from CSV so perhaps I did something wrong. – NickG Jul 24 '17 at 09:02
  • If you imported a CSV from say MySQL, then you got strings. Best to convert them. – Neil Lunn Jul 24 '17 at 09:03
  • However that doesn't change the fact that it doesn't even sort by EntryId at the mo. – NickG Jul 24 '17 at 09:03
  • 1
    Well that's because it's a "pipeline" and just like "Unix pipe" `|` you only can put in to the next command what came "out" of the previous. So the field is not `EntryId` anymore, but it's now `_id`. Get it now? – Neil Lunn Jul 24 '17 at 09:04
  • Thanks - that's working now :) Just need to workout how to convert my dates. – NickG Jul 24 '17 at 09:06

1 Answers1

1

First of all "convert your strings to date" with a very simple operation:

let ops = [];

db.log.find().forEach(doc => {
  ops.push({ "updateOne": { 
    "filter": { "_id": doc._id },
    "update": { "$set": { "Date": new Date(doc.Date.replace(" ","T")) } }
  }});

  if ( ops.length >= 500 ) {
    db.log.bulkWrite(ops);
    ops = [];
  }        
});

if ( ops.length > 0 ) {
  db.log.bulkWrite(ops);
  ops = [];
};

Then run the new aggregate:

db.log.aggregate([
  { "$group": {
    "_id": { 
      "EntryId": "$EntryId",
      "year": { "$year": "$Date" },
      "month": { "$month": "$Date" }
    },
    "Count": { "$sum": 1 }
  }},
  { "$sort": { "_id": 1 } }
])

Also noting that even a "compound _id" like this one will sort correctly to it't numeric values.

Neil Lunn
  • 148,042
  • 36
  • 346
  • 317
  • (I have 15 million records) – NickG Jul 24 '17 at 13:43
  • Oh - I guess that's because it's already converted "some" of the records :/ – NickG Jul 24 '17 at 14:19
  • @NickG You could always run a query looking for `{ "Date": { "$type": 2 } }` which are strings if you ran "part" of the operation and then aborted. Overall though you need to convert to `Date` somehow, and the only other viable option is to go to JSON input and mark as a `$date` type with `mongoimport`. See ["extended json"](https://docs.mongodb.com/manual/reference/mongodb-extended-json/#date). But there also needs to be some realism here as 15 million is a pretty big number. You should [select a date range](https://stackoverflow.com/questions/2943222/find-objects-between-two-dates-mongodb) – Neil Lunn Jul 25 '17 at 08:21
  • 1
    @NickG Beyond that You simply [*'need to get a bigger boat"*](http://www.urbandictionary.com/define.php?term=We%27re%20Gonna%20Need%20a%20Bigger%20Boat) if you believe things are taking too long, as it becomes a "hardware" restriction. But correcting "types" and adding an "index" as well has "making a reasonable selection" of which all three you have been pointed to, should reasonably get you on your way. And with a fair bit more understanding than when you asked the question. Some would call that "helpful". Also please watch how many comments you leave. Some of us do sleep occasionally. – Neil Lunn Jul 25 '17 at 08:24