Let's see my code:
function checkForDuplicates() {
    $data = $this->input->post();
    $project_id = $data['project_id'];

    // Load every paper that belongs to the project.
    $this->db->where('project_id', $project_id);
    $paper = $this->db->get('paper')->result();
    $paper2 = $paper; // copy of the papers array
    $duplicatesCount = 0;

    foreach ($paper as $p) {
        $similarity = null;
        foreach ($paper2 as $p2) {
            // Skip papers already flagged with status 4 (the status assigned to
            // duplicates below). Cast to int because the database driver may
            // return column values as strings, in which case a strict
            // comparison against the integer 4 never matches.
            if ((int) $p->status_selection_id !== 4 && (int) $p2->status_selection_id !== 4) {
                if ($p->paper_id !== $p2->paper_id) {
                    // similar_text() stores the similarity percentage in $similarity.
                    similar_text($p->title, $p2->title, $similarity);
                    if ($similarity > 90) {
                        $p->status_selection_id = 4;
                        $this->db->where('paper_id', $p->paper_id);
                        $this->db->update('paper', $p);
                        $duplicatesCount++;
                    }
                }
            }
        }
    }

    $data = array(
        'duplicatesCount' => $duplicatesCount,
        'message' => 'Duplicates were found!'
    );
    echo json_encode($data);
}
- similar_text takes 180 seconds to check 1500 records.
- levenshtein takes 101 seconds to check 1500 records.
- if($pp1 === $pp2) takes 45 seconds to check 1500 records.
What would be the quickest way to check for duplicate records and change their status?
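One direction that might be worth measuring, as a rough sketch rather than a tested solution: the loop above visits every pair of papers in both orders and runs one UPDATE per match. The hypothetical helper below (`findDuplicateIds` is a name made up for this illustration, and it takes plain arrays instead of the CodeIgniter row objects) compares each pair only once and just collects the ids, so the status change can be done afterwards in a single query. Note that it keeps the first paper of each group of similar titles, whereas the code above keeps the last one.

```php
<?php
// Hypothetical helper, not part of the original code.
// $papers: list of arrays with 'paper_id', 'title' and, optionally,
// 'status_selection_id'. Returns the ids of papers whose title is more than
// $threshold percent similar to an earlier paper that was not flagged.
function findDuplicateIds(array $papers, float $threshold = 90.0): array
{
    // Ignore papers that are already flagged as duplicates (status 4).
    $candidates = array_values(array_filter($papers, function ($paper) {
        return (int) ($paper['status_selection_id'] ?? 0) !== 4;
    }));

    $flagged = [];       // index => true for papers flagged during this run
    $duplicateIds = [];
    $count = count($candidates);

    for ($i = 0; $i < $count; $i++) {
        if (isset($flagged[$i])) {
            continue;
        }
        // Start at $i + 1 so each unordered pair is compared only once.
        for ($j = $i + 1; $j < $count; $j++) {
            if (isset($flagged[$j])) {
                continue;
            }
            similar_text($candidates[$i]['title'], $candidates[$j]['title'], $similarity);
            if ($similarity > $threshold) {
                $flagged[$j] = true;
                $duplicateIds[] = $candidates[$j]['paper_id'];
            }
        }
    }

    return $duplicateIds;
}

// Assumed follow-up inside the controller (CodeIgniter Query Builder),
// guarded so that where_in() never receives an empty list:
// if (!empty($ids)) {
//     $this->db->where_in('paper_id', $ids)
//              ->update('paper', array('status_selection_id' => 4));
// }
```

This does not change the cost of each `similar_text` call; it only cuts the number of comparisons to n(n-1)/2 and replaces the per-row updates with one query, so the actual gain on 1500 records would still need to be timed.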