INSERT INTO required_data (
keyword_id, search_engine_id, date, " . $periodType . "
) VALUES (
:keyword_id, :search_engine_id, :date, TRUE
) ON DUPLICATE KEY UPDATE " . $periodType . " = TRUE, received = FALSE
This query currently inserts around 2.6–2.8 million rows on a typical day, and it handles that just fine; an hourly task checks for new data and runs the insert. But we also have a weekly and a monthly set of inserts. Each of those is another very large chunk of data, and when they run we have issues because of how long they take to finish.
When they take too long, they lock up other tasks. I plan to break the jobs into more manageable chunks to spread the load a little better and keep everything running when it needs to throughout the day/week/month, but that's hardly ideal. If anyone knows how I could make this more efficient, that would be great. This runs on a large EC2 instance.
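To show the kind of chunking I mean, here's a minimal sketch in Python. It uses SQLite's `ON CONFLICT ... DO UPDATE` as a stand-in for MySQL's `ON DUPLICATE KEY UPDATE` (the table name and columns mirror the query above; the chunk size, the `weekly` column, and the sample data are just assumptions for illustration). The idea is to commit per chunk so no single statement holds locks for the whole run:

```python
import sqlite3

# In-memory stand-in for the real MySQL table. A UNIQUE constraint on
# (keyword_id, search_engine_id, date) is what makes the upsert fire.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE required_data (
        keyword_id INTEGER,
        search_engine_id INTEGER,
        date TEXT,
        weekly INTEGER DEFAULT 0,
        received INTEGER DEFAULT 1,
        UNIQUE (keyword_id, search_engine_id, date)
    )
    """
)

# Hypothetical sample data standing in for the real keyword/engine pairs.
rows = [(k, s, "2024-01-01") for k in range(1000) for s in (1, 2)]

CHUNK = 500  # rows per batch; tune to whatever the server handles comfortably
for i in range(0, len(rows), CHUNK):
    conn.executemany(
        """
        INSERT INTO required_data (keyword_id, search_engine_id, date, weekly)
        VALUES (?, ?, ?, 1)
        ON CONFLICT (keyword_id, search_engine_id, date)
        DO UPDATE SET weekly = 1, received = 0
        """,
        rows[i : i + CHUNK],
    )
    conn.commit()  # commit per chunk so locks are held only briefly

count = conn.execute("SELECT COUNT(*) FROM required_data").fetchone()[0]
print(count)  # 2000 distinct (keyword_id, search_engine_id, date) rows
```

In the real PHP job the same shape would apply: build a multi-row `INSERT ... VALUES (...), (...), ... ON DUPLICATE KEY UPDATE` statement per chunk and commit between chunks, rather than one enormous transaction for the whole weekly or monthly run.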