It's important to realize that there are only so many options you have with your code - this issue is mostly related to utilising the hardware you have optimally.
The problem: insert 800 000 records across 24 hours with no record loss.
Potential issue: you haven't mentioned what the peak insert rate would be. All 800k records could arrive within the same hour, or they could be spread evenly at roughly 9-10 inserts per second (800 000 / 86 400 ≈ 9.3 - quite low, really).
The real problem behind this is utilising the permanent storage device - the hard drive. You said you need no record loss; that's only possible with a transactional engine, which rules NoSQL out, so please don't consider it at all. At the moment you have InnoDB and TokuDB as transactional engines, with InnoDB being the default.
If you have a mechanical hard drive, you need to measure how many IOPS (Input/Output Operations Per Second) it's capable of. This number varies from drive to drive, and it tells you how many times per second the drive can read or write.
For an SSD this number is much higher; some drives are capable of 100 000 IOPS.
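As a rough, hedged illustration of measuring this yourself: timing small write-plus-fsync cycles approximates what one transactional commit costs the disk. A dedicated benchmark tool will give you more accurate numbers; this is just a crude sketch.

```python
import os
import tempfile
import time

def estimate_sync_iops(iterations=200):
    """Crude IOPS estimate: time repeated small write + fsync cycles,
    which is roughly the forced I/O a transactional COMMIT performs."""
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            os.write(fd, b"x" * 512)  # small write, like a log record
            os.fsync(fd)              # force the OS buffer down to the drive
        elapsed = time.perf_counter() - start
        return iterations / elapsed
    finally:
        os.close(fd)
        os.unlink(path)

print(f"~{estimate_sync_iops():.0f} synced writes/second")
```

On a mechanical drive expect this in the low hundreds at best; on an SSD it will be far higher.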
This unit of measurement is important because it tells you how many times per second a transactional engine can force the disk to store the data. Non-transactional engines don't force the disk to store the data, so the OS is free to schedule the writes - this is why NoSQL stores are famous for losing data: they tell you it's written, but it's sitting in a buffer that can be lost before the drive commits the data.
Now, you have to determine whether the IOPS you have available are enough. One query = 1 I/O, so this tells you how many inserts per second you're capable of. Queueing inserts and then sending them to the DB in a single I/O is an optimization technique; people use it to trade the disk's bandwidth for I/O operations. Basically, you group several inserts (100 or so) and send them inside a BEGIN TRANSACTION / COMMIT block, which lets you write down a lot of data in a single I/O. It's a bit tricky to implement this queueing mechanism, because you have to code a small long-running program that's capable of queueing the requests and querying the DB.
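To make the grouping idea concrete, here's a minimal sketch using Python's sqlite3 as a stand-in for MySQL (the table name, batch size of 100, and the pre-built queue are all assumptions for illustration - in reality the queue would be fed by your long-running worker):

```python
import sqlite3

BATCH_SIZE = 100  # assumed batch size: ~100 inserts per transaction

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")

# Pretend these 1000 records arrived over time and were queued up.
queue = [("record %d" % i,) for i in range(1000)]

# Drain the queue in batches: one BEGIN/COMMIT (one forced I/O) per batch
# instead of one forced I/O per row.
while queue:
    batch, queue = queue[:BATCH_SIZE], queue[BATCH_SIZE:]
    with conn:  # opens a transaction; COMMITs on success, rolls back on error
        conn.executemany("INSERT INTO records (payload) VALUES (?)", batch)

count = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
print(count)  # 1000
```

With 100-row batches, 1000 rows cost 10 commits instead of 1000 - that's the bandwidth-for-I/O trade described above.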
After you've determined what you have available in terms of processing power, you can start doing things right, as the nice people in the comments mentioned - do it properly.
Use PDO and use prepared statements. This step is crucial. Not only is it easier for the database to process, it's easier for you and it yields the performance you desire.
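PDO is PHP-specific, but parameter binding works the same way across database APIs. As an analogous sketch in Python (again with sqlite3 standing in for MySQL, and a made-up table), the point is that the values travel separately from the SQL text, so the statement is parsed once and user input can never alter the query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (name TEXT, value INTEGER)")

# Prepared/parameterized statement: ? placeholders instead of
# interpolating values into the SQL string.
sql = "INSERT INTO events (name, value) VALUES (?, ?)"
rows = [
    ("login", 1),
    ("logout", 0),
    ("login'; DROP TABLE events;--", 2),  # harmless as a bound value
]
with conn:
    for row in rows:
        conn.execute(sql, row)  # never build the SQL with string concatenation

print(conn.execute("SELECT COUNT(*) FROM events").fetchone()[0])  # 3
```

The injection-looking string is stored as plain data; the table survives. The same pattern in PDO is `prepare()` once, then `execute()` with the bound values.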
Measure the potential performance of the server you are using.
Optimize the hardware if possible - you can only do so much with the code. Even if you produce the best possible code in the universe, your constraining factor will be the hardware; no matter what software you run on it, once that limit is hit, that's it. At that point you're looking at either increasing the hardware's processing power or splitting the data into multiple shards.
These are only a few tips, as you can see - this topic is pretty huge. However, I believe I gave you a few things to google about. Whatever you decide to do - good luck!