How to filter by both date and time independently in Mongo?

I want to be able to save both date and time in the database in a way that lets me filter by date and by time independently.

Should I have only one field that stores a date object? If yes, how could I filter by time? For example, how could I get statistics about records that have a date from 01/05/2017 to 15/03/2018 AND a time that lies between 3 pm and 5 pm (so a record with a date of 12/03/2018 and a time of 5:05 pm should not be included)?

Yousof Sharief
  • Could you show sample data and expected output ? – mickl May 01 '18 at 14:41
  • The answer really depends on what performance you need. If you need to be able to get the results of this query using indexes then you should add fields for the times, but if a scan of the documents matching the date query portion is fine you can use just a date field. – Ian Mercer May 01 '18 at 15:59
  • I think a date field is enough but how could I actually do this filtering using mongo? – Yousof Sharief May 01 '18 at 21:04
  • I think this answered my question https://stackoverflow.com/questions/39868872/how-to-filter-data-between-two-times-from-hhmm-to-hhmm-in-mongodb – Yousof Sharief May 02 '18 at 11:09

2 Answers

The important thing to consider here is that you need to keep using the "regular query operators", or you are going to lose performance drastically. The idea is to always write a query that can actually "use an index", and then to make sure the index is present.

Therefore the selection of "days" is a standard range query, and the "computed" filtering on the hour is applied only to the documents already selected by the standard query conditions, which are specified first:

MongoDB 3.6 - $expr

db.collection.find({
  "date": { "$gte": new Date("2017-05-01"), "$lt": new Date("2018-03-16") },
  "$expr": {
    "$and": [
      { "$gte": [{ "$hour": "$date" }, 15 ] },
      { "$lt": [{ "$hour": "$date" }, 17 ] }
    ]
  }
})

This uses the $expr query operator to evaluate aggregation framework logical and date operators, alongside regular comparison operators for the date range.
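One caveat worth sketching: `$hour` evaluates in UTC unless told otherwise. From MongoDB 3.6 the date operators also accept a `{ date, timezone }` document, so the hour test can be made in local time. The helper name `hourWindowQuery` and the `Europe/London` zone below are illustrative assumptions, not part of the answer above.

```javascript
// Hypothetical helper: builds the filter for find(), with the hour test
// evaluated in a named timezone (accepted by $hour from MongoDB 3.6).
function hourWindowQuery(from, to, startHour, endHour, timezone) {
  const hour = { $hour: { date: "$date", timezone: timezone } };
  return {
    // Plain range condition: this is the part that can use an index.
    date: { $gte: from, $lt: to },
    // Computed condition: evaluated for documents inside the range.
    $expr: {
      $and: [
        { $gte: [hour, startHour] },
        { $lt: [hour, endHour] }
      ]
    }
  };
}

const query = hourWindowQuery(
  new Date("2017-05-01T00:00:00Z"),
  new Date("2018-03-16T00:00:00Z"),
  15, 17, "Europe/London" // assumed zone, for illustration only
);
// In the shell: db.collection.find(query)
```

The range bounds here are still UTC instants; if the day boundaries should follow the same zone, they would need to be shifted accordingly.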

Lower versions - Aggregate and $redact

db.collection.aggregate([
  { "$match": { 
    "date": { "$gte": new Date("2017-05-01"), "$lt": new Date("2018-03-16") }
  }},
  { "$redact": {
    "$cond": {
      "if": {
        "$and": [
          { "$gte": [{ "$hour": "$date" }, 15 ] },
          { "$lt": [{ "$hour": "$date" }, 17 ] }
        ]
      },
      "then": "$$KEEP",
      "else": "$$PRUNE"
    }
  }}
])

The same aggregation expressions, but applied using the $redact pipeline stage. The same regular comparison operators handle the date range within $match.

All versions - $where

var startHour = 15,
    endHour = 17;

db.collection.find({
  "date": { "$gte": new Date("2017-05-01"), "$lt": new Date("2018-03-16") },
  "$where": `this.date.getUTCHours() >= ${startHour}
    && this.date.getUTCHours() < ${endHour}`
})

This uses JavaScript evaluation with $where, which is available in all versions unless server-side scripting has been explicitly disabled. Note that the same regular comparison operators are used for the main selection of the date range.

In ALL cases it is imperative to express the "standard query operator" condition "first". Without that MongoDB cannot use an index where present on the data and would need to scan every document within the collection in order to calculate the conditions and see whether to return the document or not.

Adding the "standard operator" condition with the $gte and $lt range makes sure that index can be used, and the remaining "calculated" logical expression is only actually applied to those documents which already met the "first" condition.
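A minimal sketch of the index this relies on, assuming the field is called `date` as in the examples above. `ensureDateIndex` is a hypothetical wrapper; `createIndex` itself is available on collections in both the shell and the Node.js driver.

```javascript
// Hypothetical wrapper: creates an ascending index on "date" so the
// $gte/$lt range condition can be answered from the index.
function ensureDateIndex(collection) {
  return collection.createIndex({ date: 1 });
}

// In the mongo shell: ensureDateIndex(db.collection)
```

Running the query with `.explain("executionStats")` in the shell should then report an index scan (IXSCAN) rather than a collection scan (COLLSCAN).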

For "bonus" you can even put your time constraint on the "days" themselves, so you don't even consider times before 3PM on the starting day or after 5PM on the ending day:

"date": { "$gte": new Date("2017-05-01T15:00:00Z"), "$lt": new Date("2018-03-15T17:00:00Z") }

So where possible always "use an index", and make sure that your query expressions are actually constructed to use one, as opposed to other forms that cannot.


NOTE The general logic here is that the "performance" of the presented solutions should scale in the order of presentation, with $expr best and $where worst. At the time of writing, however, there appears to be a regression in the MongoDB 3.6 release where $where in fact typically performs better than its counterparts that use native operators.

This should not be the "ongoing" case however and should get resolved, so it is generally recommended you use the native operator compositions as opposed to JavaScript logic with $where.

Neil Lunn
  • This means that the date would be saved as a normal javascript date object in the database and not ISOstring? – Yousof Sharief May 01 '18 at 21:52
  • 1
    @YoussefSherif MongoDB stores BSON Dates. They are not strings, and you should not have strings. We simply use the "native" data operation for each individual language and the driver "converts" them. For JavaScript that's `Date()`. See [Find objects between two dates MongoDB](https://stackoverflow.com/q/2943222/2313887) the the general explanation that's been around for years. But you want that base query, and only calculate the "hours" to keep "separately" – Neil Lunn May 01 '18 at 21:56

Given documents like these:

db.col.save({ date: new Date("2018-03-02T16:00:00Z") }); // matches
db.col.save({ date: new Date("2018-03-16T16:00:00Z") }); // doesn't match
db.col.save({ date: new Date("2017-05-02T19:00:00Z") }); // doesn't match
db.col.save({ date: new Date("2017-06-15T15:00:00Z") }); // matches

You can use $redact to specify the condition as an expression. To extract the hour from a Date you can use the $hour operator:

db.col.aggregate([
    {
        $redact: {
            $cond: {
                if: { $and: [
                        { $gte: [ "$date", new Date("2017-05-01T00:00:00Z") ] },
                        { $lt: [ "$date", new Date("2018-03-16T00:00:00Z") ] },
                        { $gte: [ { $hour: "$date" }, 15 ] },
                        { $lt: [ { $hour: "$date" }, 17 ] }
                    ] 
                },
                then: "$$KEEP",
                else: "$$PRUNE"            
            }
        } 
    }
])
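
As a quick sanity check of the expected results, the same conditions can be mirrored in plain JavaScript against the sample documents. This only illustrates the logic (the function name `inWindow` is made up here) and is not a replacement for the query; `$hour` works in UTC by default, so `getUTCHours()` is the matching call.

```javascript
// The four sample documents from above.
const docs = [
  { date: new Date("2018-03-02T16:00:00Z") },
  { date: new Date("2018-03-16T16:00:00Z") },
  { date: new Date("2017-05-02T19:00:00Z") },
  { date: new Date("2017-06-15T15:00:00Z") }
];

const from = new Date("2017-05-01T00:00:00Z");
const to = new Date("2018-03-16T00:00:00Z");

// Mirrors the $and of the four comparisons in the $redact stage.
function inWindow(doc) {
  const hour = doc.date.getUTCHours();
  return doc.date >= from && doc.date < to && hour >= 15 && hour < 17;
}

const matched = docs.filter(inWindow).map(d => d.date.toISOString());
console.log(matched);
// → [ '2018-03-02T16:00:00.000Z', '2017-06-15T15:00:00.000Z' ]
```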
mickl
  • `$lt ...17` on the hour portion otherwise this goes up to almost 6PM: 17:59 (assuming it's OK to exclude 5PM itself) – Ian Mercer May 01 '18 at 16:01
  • 1
    @IanMercer thank you, modified my answer. Now it excludes 5PM, to include it there should be much more complex condition with `$minutes` and `$or`. I guess that's simplest solution – mickl May 01 '18 at 16:33
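
One way to include 5 PM exactly, sketched under the assumption that minute precision is enough: compare minutes since midnight instead of the hour alone, which avoids the `$or`. `$hour`, `$minute`, `$multiply` and `$add` are standard aggregation operators; `minuteWindowExpr` is a hypothetical helper name.

```javascript
// Hypothetical helper: expression that is true when the UTC time of day of
// "$date" lies between startMinutes and endMinutes inclusive, counted in
// whole minutes since midnight (seconds are ignored).
function minuteWindowExpr(startMinutes, endMinutes) {
  const minutesOfDay = {
    $add: [{ $multiply: [{ $hour: "$date" }, 60] }, { $minute: "$date" }]
  };
  return {
    $and: [
      { $gte: [minutesOfDay, startMinutes] },
      { $lte: [minutesOfDay, endMinutes] }
    ]
  };
}

// 15:00 through 17:00 inclusive
const expr = minuteWindowExpr(15 * 60, 17 * 60);
```

The result can stand in for the `if` condition of the `$redact` stage, or for the `$expr` value on 3.6 and later. Because seconds are ignored, a time such as 17:00:59 still passes.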