I'm trying to clean an email database that was imported from an Excel spreadsheet. It contains plenty of bad characters, and some cells even hold two or three emails. I don't want to use a direct database solution like the one posted here (T-SQL: checking for email format), because I want to double-check the records by eye before actually excluding anything.

1) First I selected all the emails that are malformed, then converted the result to an array. Note that I'm in the Laravel ecosystem.

$contato = DB::select("select * from emailstable where outro_email NOT REGEXP '^[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]@[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]\.[a-zA-Z]{2,4}$'");

$email_array = json_decode(json_encode($contato), true); // convert the result objects into nested arrays

2) From those, I removed every record that does not contain the @ symbol (empty, null, random phrases), excluding them from the original array:

$corretor = preg_grep("/@/i", array_column($email_array, "email"), PREG_GREP_INVERT);

foreach ($corretor as $key => $value) {
    $email_array = array_except($email_array, array($key));
}

But the biggest problem is that when I try to remove the bad characters, the array returned by preg_grep gets new keys instead of keeping the original ones.

As an example:
Original array keys that were filtered: 1, 4, 10, 24, 34, 65, 78 (7 keys)
Assigned keys: 0, 1, 2, 3, 4, 5, 6
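To narrow down where the renumbering comes from, it may help to look at each call in isolation. The following is a minimal standalone sketch in plain PHP, using made-up sample rows (the addresses and keys are assumptions, only the `outro_email` column name comes from the code in this question). It compares the keys produced by `array_column` with the keys that `preg_grep` returns:

```php
<?php
// Hypothetical sample rows, keyed like an array that has already been filtered.
$rows = [
    1  => ['outro_email' => 'a@example.com;b@example.com'],
    4  => ['outro_email' => 'c@example.com'],
    10 => ['outro_email' => 'd@example.com;e@example.com'],
];

// array_column() without an index key returns a freshly indexed list.
$column = array_column($rows, 'outro_email');

// preg_grep() keeps the keys of whatever array it is given.
$fromColumn = preg_grep('/;/', $column);

// Copying the column by hand keeps the original row keys.
$keyed = [];
foreach ($rows as $key => $row) {
    $keyed[$key] = $row['outro_email'];
}
$fromKeyed = preg_grep('/;/', $keyed);

print_r(array_keys($column));     // keys 0, 1, 2
print_r(array_keys($fromColumn)); // keys 0, 2
print_r(array_keys($fromKeyed));  // keys 1, 10
```

In this sketch the original keys are already gone before `preg_grep` runs, which would suggest checking the `array_column` call first.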

In my code, I'm trying to extract the multiple emails that were entered in one cell, separated by ";", "," or " ", using preg_grep:

$email_corrected = array(); // array to collect all of the corrected emails

$corretor = preg_grep("/;/i", array_column($email_array, "outro_email"));
foreach ($corretor as $key => $value) {
    $provisorio = explode(';', $value);
    $provisorio = array_where($provisorio, function ($chave, $valor) {
        return strlen($valor) > 0;
    }); // Laravel helper, used to drop the empty results of explode
    $provisorio = array_map('trim', $provisorio);
    $email_corrected[$key] = $provisorio; // adds the result to the corrected emails
    $email_array = array_except($email_array, array($key)); // removes the entry from the original array
}


$corretor = preg_grep("/,/i", array_column($email_array, "outro_email"));
foreach ($corretor as $key => $value) {
    $provisorio = explode(',', $value);
    $provisorio = array_where($provisorio, function ($chave, $valor) {
        return strlen($valor) > 0;
    });
    $provisorio = array_map('trim', $provisorio);
    $email_corrected[$key] = $provisorio;
    $email_array = array_except($email_array, array($key));
}

$corretor = preg_grep("/\//", array_column($email_array, "outro_email"));
foreach ($corretor as $key => $value) {
    $provisorio = explode('/', $value); // split on the same separator that was matched above
    $provisorio = array_where($provisorio, function ($chave, $valor) {
        return strlen($valor) > 0;
    });
    $provisorio = array_map('trim', $provisorio);
    $email_corrected[$key] = $provisorio;
    $email_array = array_except($email_array, array($key));
}
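The three loops above differ only in the separator, so one possible simplification is to split on all separators in a single pass. This is a rough sketch in plain PHP rather than a drop-in replacement: the sample cells are invented, and the separator set (";", ",", "/" and whitespace) is an assumption based on the text and code above.

```php
<?php
// Hypothetical cell values; keys stand in for the original row keys.
$cells = [
    3 => 'a@example.com; b@example.com',
    7 => 'c@example.com,d@example.com,',
    9 => 'e@example.com / f@example.com',
];

$split = [];
foreach ($cells as $key => $cell) {
    // One character class covers every separator; PREG_SPLIT_NO_EMPTY drops
    // the empty pieces left by trailing or doubled separators, so no extra
    // trim or filter step is needed.
    $split[$key] = preg_split('/[;,\/\s]+/', $cell, -1, PREG_SPLIT_NO_EMPTY);
}

print_r($split);
```

Because whitespace is part of the separator class, the pieces come out already trimmed.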

But after each preg_grep, the keys in $corretor do not match the original keys of $email_array. As a result, I'm not removing the correct entries from the original array when doing $email_array = array_except($email_array, array($key));
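One way to keep the two sets of keys aligned might be to build the column with `array_map`, which preserves keys when it is given a single array. The sketch below is plain PHP with hypothetical rows; it swaps the Laravel helpers `array_where` and `array_except` for `array_filter` and `unset`, which is a substitution made for this illustration and not part of the code above.

```php
<?php
// Hypothetical rows; only the 'outro_email' column name comes from the question.
$email_array = [
    1  => ['outro_email' => 'a@example.com;b@example.com'],
    4  => ['outro_email' => 'c@example.com'],
    10 => ['outro_email' => 'd@example.com;;e@example.com'],
];
$email_corrected = [];

// array_map() with a single input array preserves that array's keys.
$column = array_map(function ($row) {
    return $row['outro_email'];
}, $email_array);

foreach (preg_grep('/;/', $column) as $key => $value) {
    $parts = array_map('trim', explode(';', $value));
    // array_filter() with 'strlen' drops empty strings; array_values() renumbers the pieces.
    $email_corrected[$key] = array_values(array_filter($parts, 'strlen'));
    // $key refers to the same row in $column and in $email_array.
    unset($email_array[$key]);
}

print_r(array_keys($email_array));     // key 4
print_r(array_keys($email_corrected)); // keys 1, 10
```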

Thanks in advance.
