-1

Millions of mails are inserted into a MySQL table "EmailQueue" with following fields,

  1. id - BigInt(20)
  2. email_address - Varchar(300)
  3. message - text
  4. status - enum('Pending','Prioritized','Processing','Processed')
  5. created_datetime - datetime
  6. sent_datetime - datetime

Generally the rows will be inserted with 'Pending' status but some high priority mails such as Forget/Reset password will be inserted with 'Prioritized' in status column.

The cron job will run every hour and send the mails as a loop with the batch of 20000 mails in every loop until it finish sending all the emails. Now I want to send the prioritized mails first which can be add to the email queue even when the cron job is running.

What is the best approach to achieve this? I'm not sure if stackoverflow is the place to ask this question, but not sure about a better place. Thanks for any help in advance.

Stranger
  • 10,332
  • 18
  • 78
  • 115
  • 1
    if you want it to see them even when the job is already running then simply the cron job would have to query again before each email it sends to check if anything has moved to the top of the queue. Sounds like it'd not be all that efficient. If they're so urgent, why not run the job every 5 minutes or so, and make the prioritised ones top of the list using a SQL "ORDER BY", and just send anything that hasn't been sent before? If you absolutely can't send more than 20000 an hour for some reason, you can use timestamps on the DB fields to check how many sent so far in all runs during the hour. – ADyson Sep 12 '17 at 14:39
  • @ADyson valid point – Stranger Sep 12 '17 at 14:45

2 Answers2

1

If we ignore the "add while the cron is running" for a sec, this will select 'Prioritized' first:

ORDER BY FIELD(status, "Prioritized"), id ASC

This'll sort all rows where status=Prioritized first then it will order by id. More info/examples here.


Adding them while the cron is running is more difficult, this becomes a logic challenge. If you do SELECT * FROM emails ORDER BY FIELD(status, "Prioritized"), id ASC you select the data in the data at the time of the selecting. If items are added after you've run the query, it wont be in the returned set of data.

To get what you want, you'll need to break your code into smaller selections:

$continueProcesss = false;
$current = 0;
$itemsPerBatch = 25;
while( $continueProcesss ){
    $query = "SELECT * FROM emails ORDER BY FIELD(status, 'Prioritized'), id ASC` 
              LIMIT $current,$itemsPerBatch";
    $result = yourQueryMethod($query);
    if( $result->num_rows===0 ){
        $continueProcesss = false;
        break;
    } else{
       $current += $itemsPerBatch; // next round, we skip another $itemsPerBatch rows
    }
}
Martijn
  • 15,791
  • 4
  • 36
  • 68
0

It's a matter of using the right query, for example, if you know the job will run every email until the queue is exhausted:

SELECT * 
FROM EmailQueue 
WHERE status IN ('Pending','Prioritized') 
ORDER BY status DESC, created_datetime ASC
LIMIT 0,100;

The result is that every time you run the query, you get the prioritized emails first, then the pending emails, both ordered by the oldest first.

You can run this batch any number of times you need until the queue is depleted or a max of 200 times every hour to match the 20,000 hourly limit.

I'm assuming that when you start processing you change the status to Processing and when finished you change the status to Processed. That's why you always start with 0 and not an incremental number.

Abraham Romero
  • 1,047
  • 11
  • 22