
Firstly, I realise this may appear to be a duplicate, as I have read a number of questions on a similar topic (1, 2), but I'm struggling to see how to re-architect my code base to fit my scenario.

I am attempting to take an existing multi-dimensional array and remove any nodes that have a duplicate in a specific field. Here is the dataset I am working with:

array(3) {
  [0]=>
  array(3) {
    ["company"]=>
    string(9) "Company A"
    ["region"]=>
    string(4) "EMEA"
    ["ctype"]=>
    string(8) "Customer"
  }
  [1]=>
  array(3) {
    ["company"]=>
    string(9) "Company A"
    ["region"]=>
    string(4) "EMEA"
    ["ctype"]=>
    string(8) "Customer"
  }
  [2]=>
  array(3) {
    ["company"]=>
    string(9) "Company C"
    ["region"]=>
    string(4) "EMEA"
    ["ctype"]=>
    string(8) "Customer"
  }
}

If this weren't a multi-dimensional array I would use in_array() to see whether the ['company'] value already existed. If not, I'd add it to my $unique array, something like this:

$unique = array();

foreach ($dataset as $company) {
  $company_name = $company['company'];

  if ( !in_array($company_name, $unique) ) {
    array_push($unique, $company_name);
  }
}
var_dump($unique);

But I'm unsure how to traverse the multi-dimensional array to get at the ['company'] data and see if it already exists (as it is the only field I need to check for duplicates).

I am looking to output exactly the same data structure as the initial dataset, just with the duplicates removed. Can you please point me in the right direction?

Sheixt

4 Answers


Store already-checked companies in a side array:

$unique = array();
$companies = array();

foreach ($dataset as $company) {
    $company_name = $company['company'];

    if ( !in_array($company_name, $companies) ) {
        array_push($unique, $company);
        array_push($companies, $company_name);
    }
}

var_dump($unique);
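A variation on the same idea (a sketch, not part of the original answer): using the company name as the array key turns the duplicate check into an O(1) isset() instead of an in_array() scan over the side array:

```php
<?php
// Same dataset as in the question.
$dataset = array(
    array("company" => "Company A", "region" => "EMEA", "ctype" => "Customer"),
    array("company" => "Company A", "region" => "EMEA", "ctype" => "Customer"),
    array("company" => "Company C", "region" => "EMEA", "ctype" => "Customer"),
);

$unique = array();
foreach ($dataset as $company) {
    // Key the result by company name; the first occurrence wins.
    if (!isset($unique[$company['company']])) {
        $unique[$company['company']] = $company;
    }
}

// Re-index numerically so the output matches the input structure.
$unique = array_values($unique);
var_dump($unique);
```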
u_mulder

Use array_filter() with a closure that captures a by-reference array via the use keyword.

>>> $data
=> [
       [
           "company" => "Company A",
           "region"  => "EMEA",
           "ctype"   => "Customer"
       ],
       [
           "company" => "Company A",
           "region"  => "EMEA",
           "ctype"   => "Customer"
       ],
       [
           "company" => "Company C",
           "region"  => "EMEA",
           "ctype"   => "Customer"
       ]
   ]
$whitelist = [];

array_filter($data, function ($item) use (&$whitelist) {
  if (!in_array($item['company'], $whitelist)) {
    $whitelist[] = $item['company'];
    return true;
  }
  return false;
});

=> [
       0 => [
           "company" => "Company A",
           "region"  => "EMEA",
           "ctype"   => "Customer"
       ],
       2 => [
           "company" => "Company C",
           "region"  => "EMEA",
           "ctype"   => "Customer"
       ]
   ]
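One note outside the REPL (where the `=>` echo above is the expression's return value): array_filter() does not modify $data in place, so in a script you must capture its return value. It also preserves keys (0 and 2 here), so re-index with array_values() if you need a contiguous list:

```php
<?php
$data = array(
    array("company" => "Company A", "region" => "EMEA", "ctype" => "Customer"),
    array("company" => "Company A", "region" => "EMEA", "ctype" => "Customer"),
    array("company" => "Company C", "region" => "EMEA", "ctype" => "Customer"),
);

$whitelist = array();

// Capture the return value: array_filter() leaves $data untouched.
$filtered = array_filter($data, function ($item) use (&$whitelist) {
    if (!in_array($item['company'], $whitelist)) {
        $whitelist[] = $item['company'];
        return true;
    }
    return false;
});

// Keys 0 and 2 survive the filter; re-index them to 0 and 1.
$filtered = array_values($filtered);
```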
markcial
  • Whilst I follow the concept of this, I'm not getting the right output. The returned data is: array(2) { [0]=> string(9) "Company A" [1]=> string(9) "Company C" } – Sheixt Jan 13 '15 at 12:57
  • Due to the PHP version running on the server I had to amend `$whitelist = [];` to `$whitelist = array();` I assume this has no impact? – Sheixt Jan 13 '15 at 13:01
  • Not at all, it's the same array, just a different style of declaration. – markcial Jan 14 '15 at 10:26

To rebuild an array without duplicates:

$result = array();
foreach($datas as $data){
  foreach($data as $key => $value){
    $result[$key][$value] = $value;
  }
}

print_r($result);

OUTPUT:

Array
(
    [company] => Array
        (
            [Company A] => Company A
            [Company C] => Company C
        )

    [region] => Array
        (
            [EMEA] => EMEA
        )

    [ctype] => Array
        (
            [Customer] => Customer
        )

)

Keeping the same architecture:

$datas = array(
  array(
    "company"=>"Company A",
    "region"=>"EMEA",
    "ctype"=>"Customer"
  ),
  array(
    "company"=>"Company A",
    "region"=>"EMEA",
    "ctype"=>"Customer"
  ),
  array(
    "company"=>"Company C",
    "region"=>"EMEA",
    "ctype"=>"Customer"
  )
);

function removeDuplicateOnField($datas, $field){
  $seen = array();

  foreach($datas as $key => $data){
      if(isset($data[$field]) && !isset($seen[$data[$field]])){
        $seen[$data[$field]] = true;
      }
      else {
        unset($datas[$key]);
      }
  }
  return $datas;
}

$result = removeDuplicateOnField($datas, "company");

print_r($result);
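As a side note (a sketch assuming PHP 5.5+ for array_column()), the same field-based de-duplication fits in one expression: array_column() pulls the "company" values keyed by row index, array_unique() keeps the first index per company, and array_intersect_key() selects those rows from the original array:

```php
<?php
$datas = array(
    array("company" => "Company A", "region" => "EMEA", "ctype" => "Customer"),
    array("company" => "Company A", "region" => "EMEA", "ctype" => "Customer"),
    array("company" => "Company C", "region" => "EMEA", "ctype" => "Customer"),
);

// Keep the first row for each distinct "company" value,
// then re-index the surviving rows numerically.
$result = array_values(
    array_intersect_key($datas, array_unique(array_column($datas, 'company')))
);
```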
Spoke44
  • Ah, I'm looking to keep the same data structure for the output as was inserted (i.e. as the same as in the first code snippet in the question). This is because I will need to manipulate the data with the associations that exist. – Sheixt Jan 13 '15 at 13:12

What you seem to be describing is something PHP can already cater for. Have you heard of the array_unique function before? It doesn't work recursively, but while browsing through the PHP docs I found a user-contributed function that does.

recursive array unique for multiarrays

function super_unique($array)
{
  $result = array_map("unserialize", array_unique(array_map("serialize", $array)));

  foreach ($result as $key => $value)
  {
    if ( is_array($value) )
    {
      $result[$key] = super_unique($value);
    }
  }

  return $result;
}
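For reference, the core of that function is the serialize/array_unique trick on its own. Note that it removes a row only when every field matches, since each whole sub-array is serialize()d to a string, array_unique() drops repeated strings, and the survivors are unserialize()d back. That makes it stricter than a company-only check (two rows with the same company but different regions would both survive):

```php
<?php
$dataset = array(
    array("company" => "Company A", "region" => "EMEA", "ctype" => "Customer"),
    array("company" => "Company A", "region" => "EMEA", "ctype" => "Customer"),
    array("company" => "Company C", "region" => "EMEA", "ctype" => "Customer"),
);

// Rows 0 and 1 serialize to identical strings, so array_unique()
// drops row 1; keys 0 and 2 remain.
$rows = array_map("unserialize", array_unique(array_map("serialize", $dataset)));
```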

Let me know if this works, as I am currently out of the office at the moment.

GNewton
  • This isn't far away. It seems to de-dupe, however there is still an erroneous (empty) array at the end of the data: ` array(3) { ["company"]=> string(9) "Company A" ["region"]=> string(4) "EMEA" ["ctype"]=> string(8) "Customer" } array(3) { ["company"]=> string(9) "Company C" ["region"]=> string(4) "EMEA" ["ctype"]=> string(8) "Customer" } array(2) { [0]=> NULL [2]=> NULL }` – Sheixt Jan 13 '15 at 13:09
  • It was a 'shot in the dark' sort of thing while i was browsing through on my laptop, as I wasn't at my normal development machine. I see in the comments above, you have found your answer. Happy coding! – GNewton Jan 13 '15 at 19:21