24

I have an array

Array(
[0] => Array
    (
        [0] => 33
        [user_id] => 33
        [1] => 3
        [frame_id] => 3
    )

[1] => Array
    (
        [0] => 33
        [user_id] => 33
        [1] => 3
        [frame_id] => 3
    )

[2] => Array
    (
        [0] => 33
        [user_id] => 33
        [1] => 8
        [frame_id] => 8
    )

[3] => Array
    (
        [0] => 33
        [user_id] => 33
        [1] => 3
        [frame_id] => 3
    )

[4] => Array
    (
        [0] => 33
        [user_id] => 33
        [1] => 3
        [frame_id] => 3
    )

)

As you can see, elements 0, 1, 3 and 4 are identical, and only element 2 differs from them.

When I run the array_unique function on this array, the only element left is:

Array (
[0] => Array
    (
        [0] => 33
        [user_id] => 33
        [1] => 3
        [frame_id] => 3
    )
)

Any ideas why array_unique isn't working as expected?

halfer
dotty

7 Answers

85

It's because array_unique compares items using a string comparison. From the docs:

Note: Two elements are considered equal if and only if (string) $elem1 === (string) $elem2. In words: when the string representation is the same. The first element will be used.

The string representation of an array is simply the word Array, no matter what its contents are.
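
A minimal sketch of that comparison, assuming PHP 7 or later. The variable names are illustrative, and the `@` operator is there only to silence the "Array to string conversion" diagnostic that PHP raises when an array is cast to a string:

```php
<?php
// Two rows with different contents (sample data).
$rows = [
    ['user_id' => 33, 'frame_id' => 3],
    ['user_id' => 33, 'frame_id' => 8],
];

// Casting any array to a string yields the literal text "Array".
$asStrings = @array_map(function ($row) {
    return (string) $row;
}, $rows);
var_dump($asStrings); // both entries are the string "Array"

// With the default string comparison every row therefore looks identical,
// so array_unique() keeps only the first one.
$collapsed = @array_unique($rows);
var_dump(count($collapsed)); // int(1)
```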

You can do what you want to do by using the following:

$arr = array(
    array('user_id' => 33, 'frame_id' => 3),
    array('user_id' => 33, 'frame_id' => 3),
    array('user_id' => 33, 'frame_id' => 8)
);

$arr = array_intersect_key($arr, array_unique(array_map('serialize', $arr)));

//result:
array
  0 => 
    array
      'user_id' => int 33
      'frame_id' => int 3
  2 => 
    array
      'user_id' => int 33
      'frame_id' => int 8

Here's how it works:

  1. Each array item is serialized. This will be unique based on the array's contents.

  2. The results of this are run through array_unique, so only arrays with unique signatures are left.

  3. array_intersect_key will take the keys of the unique items from the map/unique function (since the source array's keys are preserved) and pull them out of your original source array.
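
The three steps above can be sketched one at a time so the intermediate values are visible. The variable names here are illustrative, and the final `array_values()` call is an optional extra for when you want consecutive keys:

```php
<?php
$arr = [
    ['user_id' => 33, 'frame_id' => 3],
    ['user_id' => 33, 'frame_id' => 3],
    ['user_id' => 33, 'frame_id' => 8],
];

// Step 1: one string signature per row; keys 0, 1, 2 are preserved.
$signatures = array_map('serialize', $arr);

// Step 2: drop repeated signatures; the first occurrence of each is kept,
// so the surviving keys are 0 and 2.
$uniqueSignatures = array_unique($signatures);

// Step 3: keep only the rows of $arr whose keys survived step 2.
$deduped = array_intersect_key($arr, $uniqueSignatures);

// Optional: re-index so the result has keys 0 and 1 instead of 0 and 2.
$reindexed = array_values($deduped);

print_r($reindexed);
```

Note that serialize() is sensitive to key order and value types, so `['a' => 1, 'b' => 2]` and `['b' => 2, 'a' => 1]`, or `33` and `'33'`, produce different signatures and are treated as distinct.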

ryeguy
  • Wish I could +1 this twice. Beautiful. – Ben Sep 19 '11 at 08:20
  • 1
    Could you quickly elaborate on how complex this approach is? For example, if we have n items in the array "arr" and each item has m attributes. I would just like to know if it scales for my application, where I have about 10 to 50 items with about 5 to 15 properties each. – Pascal Klein Mar 15 '13 at 10:02
  • @ryeguy I think this is going to make all of my hopes and dreams come true! – Jen Born Feb 19 '14 at 18:06
  • @PascalKlein - that's a good question and I'm disappointed it isn't answered. This solution works but serializing every member of the array is going to scale poorly for large arrays / arrays with large sub-arrays. Depending on your specific situation, though, this may be the only viable solution. In my case I was able to simplify by making some assumptions (if $a['id'] === $b['id'] then assume $a === $b) but otherwise mimicking this logic. That is, I replaced `'serialize'` with my own callback that just returns `$arg['id']`, which is much faster than `serialize` would've been. – Mark May 09 '16 at 20:51
  • Note that sometimes `json_encode` might be faster, so check this too. – PeterM Dec 08 '16 at 13:02
  • Really like this one, but found out that the removed entries still exist as NULL if you loop over the result. Solved this by wrapping the code in array_values(). – Dxg125 May 05 '21 at 08:54
7

Here's an improved version of @ryeguy's answer:

<?php

$arr = array(
    array('user_id' => 33, 'tmp_id' => 3),
    array('user_id' => 33, 'tmp_id' => 4),
    array('user_id' => 33, 'tmp_id' => 5)
);


# $arr = array_intersect_key($arr, array_unique(array_map('serialize', $arr)));
$arr = array_intersect_key($arr, array_unique(array_map(function ($el) {
    return $el['user_id'];
}, $arr)));

//result:
array
  0 => 
    array
      'user_id' => int 33
      'tmp_id' => int 3

First, it avoids unnecessary serialization. Second, sometimes other attributes differ even though the id is the same.

The trick here is that array_unique() preserves the keys:

$ php -r 'var_dump(array_unique([1, 2, 2, 3]));'
array(3) {
  [0]=>
  int(1)
  [1]=>
  int(2)
  [3]=>
  int(3)
}

This lets array_intersect_key() keep the desired elements.

I ran into this with the Google Places API. I was combining the results of several requests for different types of objects (think tags), but I got duplicates, since an object may belong to several categories (types). The serialize method didn't work because some attributes differed, namely photo_reference and reference. These are probably something like temporary ids.
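
The same idea can be wrapped in a small reusable function. This is only a sketch: `unique_by` is a hypothetical helper name, not a PHP built-in, and the `place_id` field and sample rows are made up for illustration:

```php
<?php
/**
 * Hypothetical helper: keep the first row for each distinct value
 * returned by $keyOf. Original keys are preserved.
 */
function unique_by(array $rows, callable $keyOf): array
{
    // array_map() with a single array keeps that array's keys,
    // so $keys lines up with $rows.
    $keys = array_map($keyOf, $rows);

    return array_intersect_key($rows, array_unique($keys));
}

// Made-up sample data: rows 0 and 2 share an id but differ elsewhere.
$places = [
    ['place_id' => 'a1', 'reference' => 'r-001'],
    ['place_id' => 'b2', 'reference' => 'r-002'],
    ['place_id' => 'a1', 'reference' => 'r-003'],
];

$distinct = unique_by($places, function (array $row) {
    return $row['place_id'];
});

print_r($distinct); // rows with keys 0 and 1 remain
```

Because array_unique() compares the callback's return values as strings by default, the callback should return something with a stable string form, such as an id.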

x-yuri
  • Didn't think you could improve that but you DID! – self.name Aug 18 '21 at 19:01
  • @x-yuri: can you explain why array_intersect_key works this way? intersecting a simple array Array ( [0] => 33 ) in combination with array of array, i.e. $arr = array( array('user_id' => 33, 'tmp_id' => 3), ... );? somehow it will match those 33 even when there is a type mismatch (and other level) – Mat90 Dec 21 '21 at 19:40
  • @Mat90 `var_dump(array_intersect([33], [['user_id' => 33, 'tmp_id' => 3]]));`? It gives me an empty array. But this way, `var_dump(array_intersect(['Array'], [[]]));`, there is a match, because... because it's php :) Well, I didn't mean it seriously. Joking aside, because of the way it [compares the elements](https://www.php.net/array_intersect#refsect1-function.array-intersect-notes). If you give an example where there is a match, I would probably be able to tell the reason. You might want to specify your version. Also, try raising the error reporting level, if not at max. That might help. – x-yuri Dec 21 '21 at 20:06
  • @x-yuri, it is your example in steps: array_map(...) yields (using print_r): Array ( [0] => 33 [1] => 33 [2] => 33 ), array_unique(..): Array ( [0] => 33 ) and print_r($arr): Array ( [0] => Array ( [user_id] => 33 [tmp_id] => 3 ) [1] => Array ( [user_id] => 33 [tmp_id] => 4 ) [2] => Array ( [user_id] => 33 [tmp_id] => 5 ) ) then the final will be Array ( [0] => Array ( [user_id] => 33 [tmp_id] => 3 ) ). What kind of magic is happening here? It is somehow deciding to go a level deeper to compare (I think string compare would go awry as the second starts with Array [vs 33] as first element)? – Mat90 Dec 21 '21 at 20:34
  • @Mat90 I think I now see what confuses you. I've updated the answer, check it out. Do note that the keys of the array returned by `array_unique()` are not `0, 1, 2`. They are `0, 1, 3`. This lets `array_intersect_key()` return the desired elements. – x-yuri Dec 22 '21 at 07:04
3

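array_unique() can only compare nested arrays meaningfully in PHP 5.2.9 and higher, where the optional flags argument (for example SORT_REGULAR) is available.

Instead, you can create a hash of each sub-array and use it as the array key, which guarantees uniqueness.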

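$hashes = array();

// $array is the input array of rows
foreach ($array as $val) {
    // identical rows produce the same key, so a later duplicate overwrites the earlier one
    $hashes[md5(serialize($val))] = $val;
}

// the keys are already unique; re-index to get a plain list
$unique = array_values($hashes);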
PeterM
Amy B
  • the serialized string should be enough for comparing the strings. when using md5 with long serialized strings, you risk collision. Also, I'd use array_map with un/serialize before and after unique. – Gordon Apr 01 '10 at 15:02
  • @Gordon collisions are so improbable that it isn't even worth worrying about. – ryeguy Apr 01 '10 at 15:06
  • 1
    @ryeguy that depends on your application. and also doesn't change that it's unnecessary in the first place. – Gordon Apr 01 '10 at 15:07
  • 1
    There is no need to call `array_unique` using this concept, as when building `$hashes` this will be already unique, better use `array_values` to get integer based array. – PeterM Dec 08 '16 at 13:07
2

array_unique doesn't work recursively, so it just thinks "these are all Arrays, let's kill all but one... here we go!"

oezi
1

Quick Answer (TL;DR)

  • Distinct values may be extracted from PHP Array of AssociativeArrays using foreach
  • This is a simplistic approach

Detailed Answer

Context

  • PHP 5.3
  • PHP Array of AssociativeArrays (tabular composite data variable)
  • Alternate name for this composite variable is ArrayOfDictionary (AOD)

Problem

  • Scenario: DeveloperMarsher has a PHP tabular composite variable
    • DeveloperMarsher wishes to extract distinct values on a specific name-value pair
    • In the example below, DeveloperMarsher wishes to get rows for each distinct fname name-value pair

Solution

  • example01 ;; DeveloperMarsher starts with a tabular data variable that looks like this

    $aodtable = json_decode('[
    {
      "fname": "homer"
      ,"lname": "simpson"
    },
    {
      "fname": "homer"
      ,"lname": "jackson"
    },
    {
      "fname": "homer"
      ,"lname": "johnson"
    },
    {
      "fname": "bart"
      ,"lname": "johnson"
    },
    {
      "fname": "bart"
      ,"lname": "jackson"
    },
    {
      "fname": "bart"
      ,"lname": "simpson"
    },
    {
      "fname": "fred"
      ,"lname": "flintstone"
    }
    ]',true);
    
  • example01 ;; DeveloperMarsher can extract distinct values with a foreach loop that tracks seen values

    $sgfield  =   'fname';
    $bgnocase =   true;
    
    //
    $targfield  =   $sgfield;
    $ddseen     =   Array();
    $vout       =   Array();
    foreach ($aodtable as $datarow) {
      if( (boolean) $bgnocase == true ){ @$datarow[$targfield] = @strtolower($datarow[$targfield]); }
      if( (string) @$ddseen[ $datarow[$targfield] ] == '' ){
        $rowout   = array_intersect_key($datarow, array_flip(array_keys($datarow)));
        $ddseen[ $datarow[$targfield] ] = $datarow[$targfield];
        $vout[] = Array( $rowout );
      }
    }
    //;;
    
    print var_export( $vout, true );
    

Output result

array (
  0 =>
  array (
    0 =>
    array (
      'fname' => 'homer',
      'lname' => 'simpson',
    ),
  ),
  1 =>
  array (
    0 =>
    array (
      'fname' => 'bart',
      'lname' => 'johnson',
    ),
  ),
  2 =>
  array (
    0 =>
    array (
      'fname' => 'fred',
      'lname' => 'flintstone',
    ),
  ),
)
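
The loop above can also be sketched more compactly without error suppression. This is an assumption-laden variant rather than a drop-in replacement: `distinct_by_field` is a hypothetical helper name, rows that lack the field are skipped, rows are returned unmodified (not lower-cased), and each row is not wrapped in an extra array as in the output above:

```php
<?php
/**
 * Hypothetical helper: return the first row seen for each distinct
 * value of $field, optionally ignoring letter case.
 */
function distinct_by_field(array $rows, string $field, bool $ignoreCase = true): array
{
    $seen = []; // values already encountered, used as a lookup set
    $out  = [];

    foreach ($rows as $row) {
        if (!array_key_exists($field, $row)) {
            continue; // assumption: rows without the field are skipped
        }

        $value = (string) $row[$field];
        $key   = $ignoreCase ? strtolower($value) : $value;

        if (!isset($seen[$key])) {
            $seen[$key] = true;
            $out[] = $row;
        }
    }

    return $out;
}

// Made-up sample rows in the same shape as $aodtable.
$table = [
    ['fname' => 'homer', 'lname' => 'simpson'],
    ['fname' => 'Homer', 'lname' => 'jackson'],
    ['fname' => 'bart',  'lname' => 'johnson'],
];

print_r(distinct_by_field($table, 'fname'));        // 2 rows: homer simpson, bart johnson
print_r(distinct_by_field($table, 'fname', false)); // 3 rows: "homer" and "Homer" are distinct
```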

Pitfalls

  • This solution does not aggregate on fields that are not part of the DISTINCT operation
  • Arbitrary name-value pairs are returned from arbitrarily chosen distinct rows
  • Arbitrary sort order of output
  • Arbitrary handling of letter-case (is capital A distinct from lower-case a ?)

See also

  • php array_intersect_key
  • php array_flip
dreftymac
1
function array_unique_recursive($array)
{
    $array = array_unique($array, SORT_REGULAR);

    foreach ($array as $key => $elem) {
        if (is_array($elem)) {
            $array[$key] = array_unique_recursive($elem);
        }
    }

    return $array;
}

Doesn't that do the trick?
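
One behavior worth checking before relying on it: because the function recurses, it also removes repeated values inside each row, not only duplicate rows. A small sketch with made-up data (the function is repeated so the example is self-contained):

```php
<?php
function array_unique_recursive($array)
{
    $array = array_unique($array, SORT_REGULAR);

    foreach ($array as $key => $elem) {
        if (is_array($elem)) {
            $array[$key] = array_unique_recursive($elem);
        }
    }

    return $array;
}

// Rows 0 and 1 are duplicates; within them, 'a' and 'b' hold the same value.
$rows = [
    ['a' => 1, 'b' => 1],
    ['a' => 1, 'b' => 1],
    ['a' => 2, 'b' => 3],
];

$result = array_unique_recursive($rows);
print_r($result);
// Row 1 is removed as a duplicate of row 0, but row 0 also loses 'b',
// because its value repeats the value of 'a':
// [0 => ['a' => 1], 2 => ['a' => 2, 'b' => 3]]
```

If only whole-row duplicates should be removed, calling `array_unique($rows, SORT_REGULAR)` once, without the recursion, avoids this side effect.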

jav974
0
$arr = array(
    array('user_id' => 33, 'tmp_id' => 3),
    array('user_id' => 33, 'tmp_id' => 4),
    array('user_id' => 33, 'tmp_id' => 3),
    array('user_id' => 33, 'tmp_id' => 4),
);
$arr1 = array_unique($arr, SORT_REGULAR);
echo "<pre>";
print_r($arr1);
echo "</pre>";

//result:
Array
(
    [0] => Array
        (
            [user_id] => 33
            [tmp_id] => 3
        )

    [1] => Array
        (
            [user_id] => 33
            [tmp_id] => 4
        )

)