I receive about 10,000 new XML files of data every day, and I always have to run a query to check whether those XML files contain any new data; if a record does not already exist in our database, I insert it into our table. Here is the check:
if (! Dictionary::where('word_code', '=', $dic->word_code)->exists()) {
    // Then insert into the database (actual column list omitted here).
    Dictionary::create(['word_code' => $dic->word_code /* , ...other columns */]);
}
Here $dic->word_code comes from those thousands of XML files. The script opens each XML file in turn, checks every record against the database, inserts the record if it does not exist, then moves on to the next file and repeats the same procedure across all ~10,000 files.
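To make that concrete, here is a minimal sketch of the loop. The directory path, the <word> element name, and the SimpleXML parsing are assumptions for illustration; my real parsing code may differ.

// Sketch of the current per-record procedure.
foreach (glob('/path/to/xml/*.xml') as $file) {
    $xml = simplexml_load_file($file);

    foreach ($xml->word as $dic) {
        // One SELECT against the whole table for every single record.
        if (! Dictionary::where('word_code', '=', (string) $dic->word_code)->exists()) {
            Dictionary::create(['word_code' => (string) $dic->word_code]);
        }
    }
}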
Each XML file is about 40 to 80 MB and contains a lot of data. I already have 2,981,293 rows so far, and checking every record from the XML files against those 2,981,293 rows before inserting seems to be a really time-consuming and resource-hungry task.
word_code is already indexed.
The current method takes about 8 hours to finish the whole procedure. I should also mention that after running this huge 8-hour procedure, it only yields about 1,500 to 2,000 new rows of data per day.
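For reference, a batched version of the same check would look roughly like the sketch below: collect all word codes from one parsed file, find the ones that already exist in a single whereIn query, and bulk-insert only the missing rows. This is just to illustrate the shape of the alternative; the variable names and the single-column row layout are assumptions, not code I am running.

// Sketch of a batched check for one parsed file ($xml as in the loop above).
$codes = [];
foreach ($xml->word as $dic) {
    $codes[] = (string) $dic->word_code;
}

// One query per file to find which of its codes already exist.
$existing = Dictionary::whereIn('word_code', array_unique($codes))
    ->pluck('word_code')
    ->all();

// Bulk-insert only the missing codes, chunked to keep each query small.
$missing = array_diff(array_unique($codes), $existing);
$rows = array_map(function ($code) {
    return ['word_code' => $code];
}, $missing);

foreach (array_chunk($rows, 1000) as $chunk) {
    Dictionary::insert($chunk);
}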