I have a .csv file that is about 5mb (~45,000 rows). What I need to do is run through each row of the file and check if the ID in each line is already in a table in my database. If it is, I can delete that row from the file.
I did a good amount of research on the most memory efficient way to do this, so I've been using a method of writing lines that don't need to get deleted to a temporary file and then renaming that file as the original. Code below:
$file= fopen($filename, 'r');
$temp = fopen($tempFilename, 'w');
while(($row = fgetcsv($file)) != FALSE){
// id is the 7th value in the row
$id = $row[6];
// check table to see if id exists
$sql = "SELECT id FROM table WHERE id = $id";
$result = mysqli_query($conn, $sql);
// if id is in the database, skip to next row
if(mysqli_num_rows($result) > 0){
continue;
}
// else write line to temp file
fputcsv($temp, $row);
}
fclose($file);
fclose($temp);
// overwrite original file
rename($tempFilename, $filename);
Problem is, I'm running into a timeout while executing this bit of code. Anything I can do to make the code more efficient?