89

I'm sure this is an extremely obvious question, and that there's a function that does exactly this, but I can't seem to find it. In PHP, I'd like to know if my array has duplicates in it, as efficiently as possible. I don't want to remove them like array_unique does, and I don't particularly want to run array_unique and compare it to the original array to see if they're the same, as this seems very inefficient. As far as performance is concerned, the "expected condition" is that the array has no duplicates.

I'd just like to be able to do something like

if (no_dupes($array))
    // this deals with arrays without duplicates
else
    // this deals with arrays with duplicates

Is there any obvious function I'm not thinking of?
The question "How to detect duplicate values in PHP array?" has the right title and is very similar; however, if you actually read it, the asker is looking for array_count_values.

dreftymac
Mala
  • Do you just want to know if there are any duplicates or the quantity and value of said duplicates etc? –  Jun 29 '10 at 23:56
  • 1
    I only need to know if there are any duplicates. Returning a boolean is perfect. – Mala Jun 29 '10 at 23:57
  • 23
    Honestly I think `if(count($array) == count(array_unique($array)))` is the best you can get. You have to traverse the array this way or another and I think the built-in are optimized for that. `array_flip` could be considered too. – Felix Kling Jun 30 '10 at 00:02
  • @Felix, you can do better than that. That does three loops, one to create the unique array, one to count it, and one to count the original. – Mike Sherov Jun 30 '10 at 00:17
  • @Mike Sherov: Are you sure? I couldn't find anything about it, but I had hoped that PHP arrays have some internal property that keeps track of the length. Do you have any information about this? I would be very interested. – Felix Kling Jun 30 '10 at 07:46
  • @Felix, I was always taught that count was an expensive operation in PHP, and that it required looping through. Maybe that's wrong. – Mike Sherov Jun 30 '10 at 11:03
  • @Felix, have a look at these: http://maettig.com/code/php/php-performance-benchmarks.php http://josephscott.org/archives/2010/01/php-count-performance/ http://mikegerwitz.com/2010/03/28/php-performance-array-iteration/ I'm not really sure where that leaves us. Yes, doing count() multiple times is slow, but it may well be faster than my answer. – Mike Sherov Jun 30 '10 at 12:29

17 Answers

255

I know you are not after array_unique(). However, you will not find a magical obvious function nor will writing one be faster than making use of the native functions.

I propose:

function array_has_dupes($array) {
   // streamline per @Felix
   return count($array) !== count(array_unique($array));
}

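Adjust the second parameter of array_unique() to meet your comparison needs. It takes a sort flag that controls how values are compared: SORT_STRING (the default), SORT_REGULAR, SORT_NUMERIC, or SORT_LOCALE_STRING.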
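
To illustrate how the second parameter changes the outcome, here is a minimal sketch. The wrapper name array_has_dupes_with_flags and the sample data are hypothetical, made up for this example; only array_unique() and its sort flags come from PHP itself.

```php
<?php
// Hypothetical wrapper: $flags is passed straight through to array_unique().
function array_has_dupes_with_flags(array $array, int $flags = SORT_STRING): bool
{
    return count($array) !== count(array_unique($array, $flags));
}

$codes = ['1', '01', '2'];

// SORT_STRING (the default) compares values as strings: '1' and '01' differ.
var_dump(array_has_dupes_with_flags($codes));               // bool(false)

// SORT_NUMERIC compares values as numbers: '1' and '01' are both 1.
var_dump(array_has_dupes_with_flags($codes, SORT_NUMERIC)); // bool(true)
```

Which flag is appropriate depends on what "duplicate" should mean for your data, so treat the choice above as an assumption rather than a recommendation.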

Jason McCreary
  • 3
    thanks for the suggestion. My thought at finding a better algorithm is simply that, technically speaking, once you've finished running whatever the builtin `array_unique` does, you should be able to know if there were any dupes. Thus, anything that does at least as much work as `array_unique` does more work than necessary. Although yeah, if such a function doesn't exist, I don't particularly feel like writing it. – Mala Jun 30 '10 at 00:10
  • 1
    If you only care about if it has dupes, then that's what I would do. If you care about more than just if it has dupes, then you're right, the above may do more work than it needs. Anything you write is going to be O(n^2). Even if you bail out early. As you said, it's not common that you have dupes. So is it worth your time to make something magical? – Jason McCreary Jun 30 '10 at 00:13
  • Magical? Sure, it's a micro-optimization, but it's not "magic" to write your own function, and I'm not sure that a better solution is that much harder to write than this. – Mike Sherov Jun 30 '10 at 00:32
  • @Mike, never said anything of the sort. Just questioning if it was worth the developer's time in this particular instance based on the specification of the original question. No worries. – Jason McCreary Jun 30 '10 at 00:38
  • @Jason, I wasn't trying to come off harsh. I was just trying to convey that it doesn't always hurt to seek your own solution when you already know the easy way. – Mike Sherov Jun 30 '10 at 00:49
  • @Mike. Agreed. And ultimately it's how we all get better. I was just devil's advocate, in this case, saying that there's nothing wrong with the *easy way*. – Jason McCreary Jun 30 '10 at 01:22
  • @Jason, yup. I too have been bitten before by premature optimization. – Mike Sherov Jun 30 '10 at 02:19
  • @JasonMcCreary I think you meant to say O(n log n) as the simple solution is just sort the array which is O(n log n) and then run through it once to find duplicates. – Alan Turing Oct 24 '14 at 23:23
  • 1
    I came here only to find exactly this answer:) – Gino Pane Feb 17 '15 at 13:29
  • You should give us some clue what the second parameter does. – aksu Feb 27 '15 at 15:55
  • 5
    Elegant, but `array_unique` is somewhat slow. If you know the array to only contain integers and strings, you can replace it with `array_flip` for much faster results. – Tgr Oct 16 '15 at 00:46
104

Performance-Optimized Solution

If you care about performance and micro-optimizations, check this one-liner:

function no_dupes(array $input_array) {
    return count($input_array) === count(array_flip($input_array));
}

Description:
The function compares the number of elements in $input_array with the number of elements after array_flip(). Flipping turns values into keys, and keys must be unique in an associative array, so duplicate values collapse into a single entry and the flipped array ends up with fewer elements than the original.

Warning:
As noted in the manual, array keys can only be of type int or string, so the values of the original array must be ints or strings. Other types are not handled correctly: array_flip() skips them with a warning, and numeric strings such as "1" are cast to integer keys, so 1 and "1" count as duplicates. See https://3v4l.org/7bRXI for an example of this fringe-case failure mode.
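
If the input may contain other types, one possible way around the limitation is to check the values first. The sketch below is only an illustration: the name no_dupes_guarded is hypothetical, and falling back to array_unique() with SORT_REGULAR is an assumption about which comparison you want.

```php
<?php
// Hypothetical guarded variant of no_dupes(): only flips when every value
// is a valid array key type; otherwise falls back to array_unique().
function no_dupes_guarded(array $input_array): bool
{
    foreach ($input_array as $value) {
        if (!is_int($value) && !is_string($value)) {
            // Assumption: a loose (SORT_REGULAR) comparison is acceptable here.
            return count($input_array) === count(array_unique($input_array, SORT_REGULAR));
        }
    }
    return count($input_array) === count(array_flip($input_array));
}

var_dump(no_dupes_guarded([1, 2, 3]));       // bool(true)
var_dump(no_dupes_guarded([1.5, 2.5, 1.5])); // bool(false)
var_dump(no_dupes_guarded([1.5, 2.5]));      // bool(true)
```

The extra pass over the array costs some of the speed that makes array_flip() attractive, so this only makes sense when you cannot guarantee the value types up front.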

Proof for an array with 10 million records:

Test case:

<?php

$elements = array_merge(range(1,10000000),[1]);

$time = microtime(true);
accepted_solution($elements);
echo 'Accepted solution: ', (microtime(true) - $time), 's', PHP_EOL;

$time = microtime(true);
most_voted_solution($elements);
echo 'Most voted solution: ', (microtime(true) - $time), 's', PHP_EOL;

$time = microtime(true);
this_answer_solution($elements);
echo 'This answer solution: ', (microtime(true) - $time), 's', PHP_EOL;

function accepted_solution($array){
 $dupe_array = array();
 foreach($array as $val){
  // sorry, but I had to add below line to remove millions of notices
  if(!isset($dupe_array[$val])){$dupe_array[$val]=0;}
  if(++$dupe_array[$val] > 1){
   return true;
  }
 }
 return false;
}

function most_voted_solution($array) {
   return count($array) !== count(array_unique($array));
}

function this_answer_solution(array $input_array) {
    return count($input_array) === count(array_flip($input_array));
}

Note that the accepted solution may be faster in some cases, namely when the duplicate values occur near the beginning of a huge array, because it returns as soon as it finds the first duplicate.

Cody Gray - on strike
s3m3n
  • Can you explain why is this faster? Also this returns the contrary. So to have a fair comparison you should test with : `function most_voted_solution($array) { return count($array) === count(array_unique($array)); }` – Erdal G. Nov 07 '20 at 15:38
  • 2
    @ErdalG. this is faster because `array_flip` is [native PHP function written in C](https://github.com/php/php-src/blob/9426c6e/ext/standard/array.c#L4340) and flip is pretty simple operation. After flipping not unique values are removed as they could create array key conflict. – s3m3n Nov 08 '20 at 16:15
41

You can do:

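function has_dupes($array) {
    $dupe_array = array();
    foreach ($array as $val) {
        // Initialize the counter first to avoid an undefined index notice.
        // Values are used as array keys, so they must be ints or strings.
        if (!isset($dupe_array[$val])) {
            $dupe_array[$val] = 0;
        }
        if (++$dupe_array[$val] > 1) {
            return true;
        }
    }
    return false;
}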
JustCarty
Mike Sherov
  • 7
    I like it! Just keep in mind that even with an early `return` this is an O(n) function. In addition to the overhead of `foreach` and tracking `$dupe_array`, I'd love to see some benchmarking. I'd guess that for arrays with no duplicates, utilizing native functions is faster. Definitely better than O(n^2) though. Nice. – Jason McCreary Jun 30 '10 at 01:20
  • 2
    Has a little problem: only works correctly if the values are strings or numbers. – Artefacto Jun 30 '10 at 04:29
  • 10
    This code gave me an `undefined offset` error in PHP. Instead, I did: `foreach ( $a as $v ) { if ( array_key_exists($v, $dupe) ) { return true; } else { $dupe[$v] = true; } }` – EleventyOne Dec 28 '13 at 02:04
  • 3
    How does this even work? Since `$dupe_array` has not been defined with any values, `$dupe_array[$val]` should return an undefined index! – Nikunj Madhogaria Sep 13 '15 at 06:42
  • What does `++$dupe_array[$val]` mean? And isn't it supposed to be `++$dupe_array[$key]`, since `$val` is not `$key`? – Salam.MSaif Nov 28 '17 at 02:26
  • The idea behind `++$dupe_array[$val]` is to count how often each array value appears. But since the values are not initialized, this does not work without notices (in PHP 7.1). However, if you want to keep the counting behaviour, you can fix the code by adding `if (!isset($dupe_array[$val])) { $dupe_array[$val] = 0; }` before the current `if`-block. – Milania Feb 28 '18 at 10:22
23
$hasDuplicates = count($array) > count(array_unique($array)); 

Will be true if duplicates, or false if no duplicates.

Andrew
  • This is pretty much a retread of @JasonMcCreary's answer. https://stackoverflow.com/a/3145647/2943403 – mickmackusa Jan 10 '21 at 01:21
  • But It is throwing duplicate values error even when array has empty values. I have posted answer https://stackoverflow.com/questions/3145607/php-check-if-an-array-has-duplicates/67122587#67122587 below – Prasad Patel Apr 16 '21 at 09:21
5

Here's my take on this… after some benchmarking, I found this to be the fastest method for this.

function has_duplicates( $array ) {
    return count( array_keys( array_flip( $array ) ) ) !== count( $array );
}

…or depending on circumstances this could be marginally faster.

function has_duplicates( $array ) {
    $array = array_count_values( $array );
    rsort( $array );
    // Guard against an empty input, where $array[0] would be undefined.
    return isset( $array[0] ) && $array[0] > 1;
}
micadelli
  • 2
    Not sure why you need `array_keys()` in your answer. `array_flip()` already condenses your array if the values are the same. Also, `!=` is a sufficient comparator, since the types are inherently the same out of `count()` (you're the one that mentioned benchmarking). Therefore `return count(array_flip($arr)) != count($arr);` should be sufficient. – cartbeforehorse Nov 13 '14 at 12:47
  • The techniques in this answer have the same vulnerabilities as @s3m3n's function. https://3v4l.org/3FlBJ This is an "apples-vs-oranges" comparison, so I'll argue that any benchmark comparisons are inappropriate because the functions do not offer identical behavior. – mickmackusa Jan 10 '21 at 00:55
5
$duplicate = false;

if (count($array) != count(array_unique($array))) {
    $duplicate = true;
}
Ankita Mehta
1

To exclude empty values from the comparison, you can add array_diff():

if (count(array_unique(array_diff($array, array("")))) < count(array_diff($array, array("")))) {
    // duplicates found (empty values ignored)
}

Reference taken from @AndreKR's answer from here
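
As a sketch of how this could be wrapped into a reusable check (the function name has_dupes_ignoring_empty is hypothetical, not from the referenced answer):

```php
<?php
// Hypothetical wrapper around the array_diff() idea shown above.
function has_dupes_ignoring_empty(array $array): bool
{
    // array_diff() compares values by their string form, so "" and null
    // are both dropped before the duplicate check.
    $non_empty = array_diff($array, array(""));
    return count(array_unique($non_empty)) < count($non_empty);
}

var_dump(has_dupes_ignoring_empty(['a', '', '', 'b'])); // bool(false)
var_dump(has_dupes_ignoring_empty(['a', '', 'a']));     // bool(true)
```

Computing array_diff() once and reusing the result avoids filtering the array twice.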

Prasad Patel
1

Keep it simple, silly! ;)

Simple OR logic...

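function checkDuplicatesInArray($array){
    $duplicates=FALSE;
    foreach($array as $k=>$i){
        // $value_[$i] replaces $value_{$i}: the curly-brace offset
        // syntax was removed in PHP 8.
        if(!isset($value_[$i])){
            $value_[$i]=TRUE;
        }
        else{
            $duplicates|=TRUE;
        }
    }
    // |= yields an integer, so cast back to a boolean.
    return (bool)$duplicates;
}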

Regards!

웃웃웃웃웃
  • 3
    #BadCode - The best ways to do this check with functions of PHP itself. – FabianoLothor Oct 26 '12 at 13:56
  • I find variable variables to be a generally unattractive solution. This technique may fail in certain scenarios. https://3v4l.org/kGLWT More so from PHP 7.4 and up. – mickmackusa Jan 10 '21 at 01:02
0

I find this solution useful:

function get_duplicates( $array ) {
    return array_unique( array_diff_assoc( $array, array_unique( $array ) ) );
}

Then count the result: if it is greater than 0, the array has duplicates; otherwise all values are unique.
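
A short usage sketch, with made-up sample data; the function is repeated so the snippet runs on its own:

```php
<?php
// Same function as in the answer above, repeated so the snippet is self-contained.
function get_duplicates(array $array): array
{
    return array_unique(array_diff_assoc($array, array_unique($array)));
}

$duplicates = get_duplicates([1, 2, 2, 3, 3, 3]);

// Keys are preserved from the original array, so reindex before printing.
print_r(array_values($duplicates)); // contains 2 and 3

if (count($duplicates) > 0) {
    echo "has duplicates", PHP_EOL;
} else {
    echo "all unique", PHP_EOL;
}
```

Unlike a plain boolean check, this also tells you which values were repeated, at the cost of extra passes over the array.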

Muhammad Raheel
  • Despite being a one-liner, this technique appears to be doing more processing than other posted answers. To check if an array is empty without calling `count()`, just do a falsey check using `!`: https://3v4l.org/O4g3F – mickmackusa Jan 10 '21 at 01:11
0

I'm using this:

if(count($array)==count(array_count_values($array))){
    echo("all values are unique");
}else{
    echo("there's dupe values");
}

I don't know if it's the fastest, but it has worked pretty well so far.

Abraham Romero
  • Some data types will cause this technique to fail, so this is not a reliable/robust solution. https://3v4l.org/FSr7P – mickmackusa Jan 10 '21 at 01:05
0

Two ways to do it efficiently that I can think of:

  1. inserting all the values into some sort of hashtable and checking whether the value you're inserting is already in it (expected O(n) time and O(n) space)

  2. sorting the array and then checking whether adjacent cells are equal (O(n log n) time and O(1) or O(n) space, depending on the sorting algorithm)

stormdrain's solution would probably be O(n^2), as would any solution which involves scanning the array for each element searching for a duplicate.
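
The second approach (sort, then compare neighbours) could be sketched roughly as follows. The function name has_dupes_sorted is hypothetical, and the sketch assumes the values are all of one comparable type, such as all ints or all strings.

```php
<?php
// Hypothetical sketch of the sort-then-scan approach.
// Assumes homogeneous values (all ints or all strings).
function has_dupes_sorted(array $values): bool
{
    // The array is passed by value, so sorting does not affect the caller.
    sort($values);

    $count = count($values);
    for ($i = 1; $i < $count; $i++) {
        // After sorting, equal values sit next to each other.
        if ($values[$i] === $values[$i - 1]) {
            return true;
        }
    }
    return false;
}

var_dump(has_dupes_sorted([3, 1, 2])); // bool(false)
var_dump(has_dupes_sorted([3, 1, 3])); // bool(true)
```

The first approach (the hashtable) is what the isset()-based loops in the other answers already do, using a PHP array as the hashtable.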

Bwmat
0

One more solution from me, aimed at improving performance:

$array_count_values = array_count_values($array);
if(is_array($array_count_values) && count($array_count_values)>0)
{
   foreach ($array_count_values as $key => $value)
   {
      if($value>1)
      {
        // duplicate values found here, write code to handle duplicate values            
      }
   }
}
Prasad Patel
-1

PHP has a function to count the occurrences of each value in an array: http://www.php.net/manual/en/function.array-count-values.php

mazgalici
-1

As you specifically said you didn't want to use array_unique I'm going to ignore the other answers despite the fact they're probably better.

Why don't you use array_count_values() and then check if the resulting array has any value greater than 1?
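
A minimal sketch of that idea, assuming the values are ints or strings (the only types array_count_values() can count); the function name has_dupes_by_count is hypothetical:

```php
<?php
// Hypothetical sketch: count occurrences, then look for any count above 1.
// Assumes the values are ints or strings.
function has_dupes_by_count(array $values): bool
{
    $counts = array_count_values($values);

    // max() must not be called on an empty array, so guard for that case.
    return $counts !== [] && max($counts) > 1;
}

var_dump(has_dupes_by_count([1, 2, 2]));  // bool(true)
var_dump(has_dupes_by_count(['a', 'b'])); // bool(false)
```

Note that this always counts the whole array, so it cannot stop early at the first duplicate.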

-1

You can also do it this way. This returns true if the array has duplicates and false if all values are unique.

$nofollow = (count($modelIdArr) !== count(array_unique($modelIdArr))) ? true : false;
Lakhan
  • This is pretty much a retread of @JasonMcCreary's answer. https://stackoverflow.com/a/3145647/2943403 I have voted to delete this post. – mickmackusa Jan 10 '21 at 01:21
-1

A simple solution that is quite fast on this test case:

$elements = array_merge(range(1,10000000),[1]);

function unique_val_inArray($arr) {
    $count = count($arr);
    foreach ($arr as $i_1 => $value) {
        for($i_2 = $i_1 + 1; $i_2 < $count; $i_2++) {
            if($arr[$i_2] === $arr[$i_1]){
                return false;
            }
        }
    }
    return true;
}

$time = microtime(true);
unique_val_inArray($elements);
echo 'This solution: ', (microtime(true) - $time), 's', PHP_EOL;

Speed: 0.71 s

-1
function hasDuplicate($array){
  $d = array();
  foreach($array as $elements) {
    if(!isset($d[$elements])){
      $d[$elements] = 1;
    }else{
      return true;
    } 
  } 
  return false;
}
lilHar
Alo