I'm trying to find a better way to ensure certain documents are removed from a mongo collection at a specific time, which is unique to each document. I also need to run some methods when the items are removed. I've looked into TTL indexes, but it seems they don't allow any kind of callback, and from what I read the process that removes the documents only runs once per minute, which isn't specific enough for what I need. The following is what I came up with:

var check_frequency = 30000;
Meteor.setInterval((function() {
    // figure out what elements will expire within the next check period
    var next_check = moment().add(check_frequency, 'milliseconds');
    // use moment's public API rather than the private _d field
    var next_string = next_check.toISOString();

    var ending_items = items.find({'time_to_end': {$lt : next_string}});

    ending_items.forEach(function(db_object) {
        var time_from_now = moment(db_object.time_to_end) - moment();
        Meteor.setTimeout(function() {
            removeAndReport(db_object._id);
        }, time_from_now);

    });
}), check_frequency);

My concern is I'm not sure how Meteor.setTimeout() works with threading, so if I have hundreds or thousands of these calls I'm wondering if it will cause problems. Can anyone recommend a better way to accomplish this?

Thanks in advance.

edit: Running background jobs with Meteor or cron isn't my only concern. I realize I could accomplish the same thing with a cron job, but I'd rather not query my databases once per second to only find 3 expiring items vs. querying the database once every 30 seconds, and figure out which elements will expire in the next time period.

EindacorDS
  • If you have something that looks like a regular background task in meteor, you should always try [synced-cron](https://github.com/percolatestudio/meteor-synced-cron) first. It's probably what you want. – David Weldon Nov 11 '15 at 18:40
  • I specified my question with an edit. Whether I run the process with Meteor or a cron task, is the above the best way to make sure things are happening at specific times if there are potentially thousands of these instances? – EindacorDS Nov 11 '15 at 18:47
  • An "instance" is a document that needs removal, or a running instance of your app? Also, can you explain the constraints in more depth? E.g. does it matter if a particular document is removed at precisely the end of its TTL vs, say 5 minutes later? – David Weldon Nov 11 '15 at 18:55
  • I also added a key point to the question: The time is unique to each document on the database. So it can't be a scheduled task that is universal for all documents, as the specific time of removal varies. Please reopen. – EindacorDS Nov 11 '15 at 18:57
  • An "instance" would be a scheduled method that removes and processes a document, sorry for the bad wording. – EindacorDS Nov 11 '15 at 18:58
  • "does it matter if a particular document is removed at precisely the end of its TTL vs, say 5 minutes later?" - I specified this in the original question. – EindacorDS Nov 11 '15 at 19:02
  • Sure, but you didn't specify the level of precision actually required. Like is a 10 ms delay acceptable? How about 1000 ms? – David Weldon Nov 11 '15 at 19:28

1 Answer

It seems like an easier solution is to store the removal date in each document, rather than the TTL. Imagine you had a collection called Messages, and each document had a removeAt field. Then you can do something like the following:

var timeout = 500;
Meteor.setInterval((function() {
  // remove any messages that should have been removed in the past
  Messages.remove({removeAt: {$lte: new Date()}});
}), timeout);

Notes:

  1. Make sure to index removeAt so the remove doesn't need to scan your whole collection.
  2. Technically this won't break if it runs on multiple server instances, but ideally it would only run on one. Maybe it could be run in its own process.
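
Since the question also needs a method to run for each removed item, one possible variation is to fetch the expired documents first, remove them one by one, and call a callback only for those this process actually removed. This is a minimal sketch, not a definitive implementation: `sweepExpired` and `onRemoved` are hypothetical names, and the code assumes the synchronous server-side `Mongo.Collection` API (`find(selector).fetch()` and `remove(selector)` returning the number of removed documents), as in Meteor versions before 3.0.

```javascript
// Hypothetical helper: removes every document whose removeAt has passed and
// calls onRemoved once per document that this call actually removed.
// `collection` is assumed to expose find(selector).fetch() and
// remove(selector), like a server-side Mongo.Collection in Meteor < 3.0.
function sweepExpired(collection, now, onRemoved) {
  var expired = collection.find({ removeAt: { $lte: now } }).fetch();
  var removedCount = 0;

  expired.forEach(function (doc) {
    // remove() is assumed to return the number of removed documents.
    // A result of 0 means something else (for example another server
    // instance) removed it first, so the callback is skipped.
    if (collection.remove({ _id: doc._id }) === 1) {
      onRemoved(doc);
      removedCount++;
    }
  });

  return removedCount;
}

// Possible wiring on a Meteor server (names taken from the answer and the
// question; shown as a comment because it needs a running Meteor app):
//
// Messages.rawCollection().createIndex({ removeAt: 1 }); // note 1 above
// Meteor.setInterval(function () {
//   sweepExpired(Messages, new Date(), function (doc) {
//     // reporting logic for the removed document goes here
//   });
// }, 500);
```

Checking the result of `remove` keeps the callback from firing twice for the same document if the sweep happens to run on more than one server instance.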
David Weldon
  • I need the documents to be removed down to a specific second though, and I don't necessarily want to run a query every second. To be VERY specific, I'm trying to make an auction house, and I'm storing the auction items in a database. I need the auctions to expire after a specific time, and I'd like the method of clearing auctions to be set by an interval instead of setting a timeout based on the auction duration. – EindacorDS Nov 11 '15 at 20:11
  • I suppose I could just prevent bids after the expiration time, and remove the auctions at a later point like you suggested, but there are a few other similar cases in my program that I'm trying to wrap my head around. – EindacorDS Nov 11 '15 at 20:12
  • I'd definitely recommend preventing bids after the expiration time - that should be easy to implement and gives a strong guarantee of consistency. As for the precision, the above example removes them every 0.5 seconds - you could certainly go lower if necessary. A fast running query a few times a second shouldn't cause any noticeable DB load unless you have thousands of auctions ending every second. – David Weldon Nov 11 '15 at 20:16
  • Does the size of the db matter though? If I have a db with a few thousand auctions, won't running a query every second affect performance? I don't have much experience with load testing like that. – EindacorDS Nov 11 '15 at 20:20
  • Not if the documents are indexed by `removeAt`. Regardless of the number of documents, the query will immediately find those that need removal without scanning the whole collection. If it were not indexed, you'd have a performance problem. – David Weldon Nov 11 '15 at 20:22
  • Fair enough, that's a lot easier than I was making it. I'll refactor and go from there. Cheers. – EindacorDS Nov 11 '15 at 20:31
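
The suggestion in the comments to prevent bids after the expiration time could be sketched roughly as follows. The names `canBid` and `placeBid` and the `highestBid` field are hypothetical; only `removeAt` comes from the answer. The idea is that the bid check enforces the deadline exactly, so the periodic removal can lag slightly without affecting correctness.

```javascript
// Hypothetical guard: an auction accepts bids only strictly before removeAt.
// A bid at exactly removeAt is rejected, matching the $lte used for removal.
function canBid(auction, now) {
  return now.getTime() < auction.removeAt.getTime();
}

// Hypothetical bid handler operating on a plain auction object.
// It throws if the auction has expired or the bid is not high enough.
function placeBid(auction, amount, now) {
  if (!canBid(auction, now)) {
    throw new Error('auction has expired');
  }
  if (amount <= auction.highestBid) {
    throw new Error('bid too low');
  }
  auction.highestBid = amount;
  return auction;
}
```

In a Meteor app a check like this would presumably live in a server-side method, so that the server clock, not the client's, decides whether a bid is still in time.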