I'm trying to clean an email database that was imported from an Excel spreadsheet. It contains plenty of bad characters, and some cells even hold two or three emails. I don't want to use a direct database solution like the one posted here (T-SQL: checking for email format), because I want to double-check by eye before actually excluding anything.
1) First I selected all the emails that are not well formed and converted the result to an array. Note that I'm in the Laravel ecosystem.
$email_database_as_object = DB::select("select * from emailstable where outro_email NOT REGEXP '^[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]@[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]\.[a-zA-Z]{2,4}$'");
$email_array = json_decode(json_encode($email_database_as_object), true);
2) From those, I removed every record that has no @ symbol in it (empty, null, random phrases), excluding them from the original array:
$corretor = preg_grep("/@/", array_column($email_array, "outro_email"), PREG_GREP_INVERT);
foreach ($corretor as $key => $value) {
    $email_array = array_except($email_array, array($key));
}
But the biggest problem comes when I try to remove the bad characters: the array returned by preg_grep ends up with new keys instead of keeping the original ones.
As an example: original array keys that were filtered: 1, 4, 10, 24, 34, 65, 78 (7 keys). Assigned keys: 0, 1, 2, 3, 4, 5, 6.
In my code, I'm trying to extract the multiple emails that were inserted in one cell, split by separators such as ";", "," and " ", using preg_grep:
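To narrow down where the keys change, a minimal sketch like the following could be run. The sample rows are made up for illustration, and the comparison relies on the documented behavior that array_column() returns a freshly indexed list while preg_grep() keeps the keys of whatever array it receives:

```php
<?php
// Hypothetical sample rows; keys 1, 4 and 10 mimic an array with earlier rows removed.
$email_array = [
    1  => ['outro_email' => 'a@example.com;b@example.com'],
    4  => ['outro_email' => 'c@example.com'],
    10 => ['outro_email' => 'd@example.com;e@example.com'],
];

// array_column() discards the outer keys and returns a list indexed 0, 1, 2.
$column = array_column($email_array, 'outro_email');

// preg_grep() keeps the keys of its input, so here it sees the renumbered list.
$from_column = preg_grep('/;/', $column);

// array_map() over a single array preserves that array's keys.
$keyed = array_map(function ($row) {
    return $row['outro_email'];
}, $email_array);
$from_keyed = preg_grep('/;/', $keyed);

print_r(array_keys($from_column)); // keys relative to the renumbered column
print_r(array_keys($from_keyed));  // keys of the original $email_array
```

If the two printed key lists differ, the renumbering happens in the array_column() step rather than in preg_grep() itself.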
$email_corrected = array(); // array to collect all of the corrected emails
$corretor = preg_grep("/;/i", array_column($email_array, "outro_email"));
foreach ($corretor as $key => $value) {
    $provisorio = explode(';', $value);
    $provisorio = array_where($provisorio, function($chave, $valor)
    {
        return strlen($valor) > 0;
    }); // Laravel helper used to drop the empty results of explode
    $provisorio = array_map('trim', $provisorio);
    $email_corrected[$key] = $provisorio; // adds the result to the corrected emails
    $email_array = array_except($email_array, array($key)); // removes the entry from the original array
}
$corretor = preg_grep("/,/i", array_column($email_array, "outro_email"));
foreach ($corretor as $key => $value) {
    $provisorio = explode(',', $value);
    $provisorio = array_where($provisorio, function($chave, $valor)
    {
        return strlen($valor) > 0;
    });
    $provisorio = array_map('trim', $provisorio);
    $email_corrected[$key] = $provisorio;
    $email_array = array_except($email_array, array($key));
}
$corretor = preg_grep("/\//", array_column($email_array, "outro_email"));
foreach ($corretor as $key => $value) {
    $provisorio = explode('/', $value);
    $provisorio = array_where($provisorio, function($chave, $valor)
    {
        return strlen($valor) > 0;
    });
    $provisorio = array_map('trim', $provisorio);
    $email_corrected[$key] = $provisorio;
    $email_array = array_except($email_array, array($key));
}
But after each preg_grep, the keys from $corretor do not match the original keys from $email_array. As a result, I'm not removing the correct entries from the original array when doing $email_array = array_except($email_array, array($key));
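For comparison, here is a minimal sketch of one way to split the cells while iterating over the original array directly, so that $key always refers to a key of $email_array. It is only an illustration under assumptions: the sample rows are hypothetical, the separators are taken to be ";", "," and "/", and plain PHP functions (preg_split, unset) stand in for the Laravel helpers used above.

```php
<?php
// Hypothetical sample rows standing in for the query result.
$email_array = [
    1  => ['outro_email' => 'a@example.com; b@example.com'],
    4  => ['outro_email' => 'c@example.com'],
    10 => ['outro_email' => 'd@example.com,e@example.com/f@example.com'],
];

$email_corrected = []; // collects the split emails under their original keys

foreach ($email_array as $key => $row) {
    // Split on ";", "," or "/" with optional surrounding whitespace,
    // dropping empty pieces instead of filtering them afterwards.
    $parts = preg_split(
        '~\s*[;,/]\s*~',
        trim($row['outro_email']),
        -1,
        PREG_SPLIT_NO_EMPTY
    );

    if (count($parts) > 1) {
        $email_corrected[$key] = $parts; // same key as in $email_array
        unset($email_array[$key]);       // removes exactly that row
    }
}

print_r($email_corrected);
print_r(array_keys($email_array));
```

Because no intermediate array_column() call is involved, the keys in $email_corrected and the rows removed from $email_array stay in sync, and the remaining rows can still be reviewed by eye.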
Thanks in advance.