I have developed JSON API in the slim framework of PHP. Using this API I send contact numbers from my mobile device to the server. The server saves these contacts in the MySQL database.
I hosted the API using digital ocean hosting with 4 core CPU, 8 GB ram. In order to stress test, I sent 1000 concurrent POST
requests to the API with 500 contact numbers in the body of each request and suddenly server got too much busy processing those requests and it stopped taking any further requests. I opened APACHE error.log
and I saw this message there:
[Fri May 15 13:34:14.112205 2020] [mpm_prefork:error] [pid 42748] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
After seeing these logs I looked at the server control panel stats which are offered by the digital ocean and I saw CPU was used 64%, Memory 16% but Disk I/O was on its peak. After seeing this I came to know that there is something in my PHP code that is taking CPU time and from Disk I/O usage the first thing that came in my mind it was the number of queries that I am running in for loop. I searched about it and came to now that running a query in a loop is a bad practice.
It's been two days and I am unable to change database design to avoid those queries in the for loop. As I have provided the problem background now let me explain how I am saving those contacts in the database.
First here is my database ERD.
From Request body, I extract JSON array of contacts and loop on that contacts array. Then for each contact, I check if it already exists in my database using this query.
Select id from contact where number = 'mobile number' and full_name = 'name' and country_id = 12;
I have created index on number
full_name
country_id
in database. if the contact exists this query return me the id
of that contact otherwise it returns FALSE
. If I get the id
I create a record in contact_seeder
table if it does not exist. to check if it already exists I query contact_seeder
table like this.
Select id from contact_seeder where contact_id = 12 and seeder_id = 2;
A contact can have only 1 'contact_seeder' record against 1 seeder_id
. That's why I have checked if 'contact_seeder' exist against some contact_id
and seeder_id
.
As you can see I have to check if the record already exists and if yes then gets it's id otherwise create a new record. So in order to insert ONE contact, I am running 4 queries in worst-case scenario and in best case scenario it would 3, and yes these queries in a loop. So for 100 contacts, it would be. 4*100 = 400
queries. That's what is slowing down my server.
I hope i have provided enough details for you to understand my problem please suggest a solution to how I can avoid those queries in the loop and sill achieve my target goal. Before mentioning bulk insertion please check I need id
of each record that is created for example contact
. In bulk insertion, it is another problem to get the ids of created records.