I have two arrays containing repeating values:

$test1 = [
    "blah1",
    "blah1",
    "blah1",
    "blah1",
    "blah2"
];

$test2 = [
    "blah1",
    "blah1",
    "blah1",
    "blah2"
];

I am trying to get the array difference:

$result = array_diff($test1,$test2);

echo "<pre>";
print_r($result);

I need it to return an array with the single value blah1, yet it returns an empty array instead.

I suspect it has something to do with the fact that there are duplicate values in both arrays, but I am not sure how to fix it.

Acidon
  • Your solution is good, but it will fail if you have `$array1 = [ 'a', 'b', 'c' ]` and `$array2 = [ 'd' ]`. The output should be the same as `$array1`, but it will be `[ 'b', 'c' ]` because `array_search()` returns `false` when looking for `d`, and `unset()` then drops the first key of `$array1` because `false == 0`. An `if` check should fix it ([gist](https://gist.github.com/rentalhost/e37628db9b3dc8e737c6b9153d617200), [run](https://3v4l.org/96frT)). – David Rodrigues Oct 15 '20 at 16:17
  • @Acidon I think you should add your own solution as an answer, cause I haven't found a better way to do it. – Jules Colle Dec 12 '20 at 15:15

3 Answers

array_diff compares the first array against the other array(s) passed as parameters and returns every element of the first array whose value does not appear in any of the others. It checks only whether a value is present, not how many times it occurs. Since $test1 and $test2 both contain "blah1" and "blah2" and no other values, an empty array is the expected result: there is no element in $test1 that is not also present in $test2.
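To make that behavior concrete, here is a minimal sketch using the arrays from the question plus one small extra case. The variable names `$empty` and `$kept` are only illustrative.

```php
<?php
// Arrays from the question.
$test1 = ["blah1", "blah1", "blah1", "blah1", "blah2"];
$test2 = ["blah1", "blah1", "blah1", "blah2"];

// Every value in $test1 also occurs somewhere in $test2, so nothing survives.
$empty = array_diff($test1, $test2);

// Only presence matters, not the number of occurrences:
// both "a" entries are kept (with their original keys) and "b" is dropped.
$kept = array_diff(["a", "a", "b"], ["b"]);

var_export($empty);
var_export($kept);
```

In other words, array_diff behaves like a set difference on values, which is why duplicates in the first array never "outnumber" a single match in the second.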

Further reading. Also, read some theory to understand what you are working with.

Lajos Arpad

I spotted a problem with Acidon's own solution: unset($array[false]) will actually unset $array[0], so there needs to be an explicit check for false (as David Rodrigues also pointed out).

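// Removes one occurrence from $array1 for each value in $array2.
function subtract_array($array1,$array2){
    foreach ($array2 as $item) {
        $key = array_search($item, $array1);
        // array_search() returns false when nothing matches; skip those.
        if ( $key !== false ) {
            unset($array1[$key]);
        }
    }
    // Reindex so the result has consecutive keys.
    return array_values($array1);
}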

Some examples:

subtract_array([1,1,1,2,3],[1,2]);            // [1,1,3]
subtract_array([1,2,3],[4,5,6]);              // [1,2,3]
subtract_array([1,2,1],[1,1,2]);              // []
subtract_array([1,2,3],[]);                   // [1,2,3]
subtract_array([],[1,1]);                     // []
subtract_array(['hi','bye'], ['bye', 'bye']); // ['hi']
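
The pitfall that the explicit check guards against can be reproduced in isolation. This is only an illustrative sketch; the variable names `$unguarded` and `$guarded` are not from the original solution.

```php
<?php
$array1 = ['a', 'b', 'c'];

// 'd' is not in the array, so array_search() returns false.
$key = array_search('d', $array1);

// Unguarded: the false key is cast to integer 0, so 'a' is removed by mistake.
$unguarded = $array1;
unset($unguarded[$key]);

// Guarded: only unset when a real key was found.
$guarded = $array1;
if ($key !== false) {
    unset($guarded[$key]);
}

var_export(array_values($unguarded));
var_export($guarded);
```

The strict `!== false` comparison matters because a legitimate match at index 0 must still be removed, and a loose check would skip it.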
Jules Colle

Depending on the scope of your task, it may be necessary to remove only those elements of the first array that are matched "one-for-one" by an element of the second array. In other cases, it may be appropriate to cross-check the differences in a "one-for-one" manner for both arrays and combine the remaining elements.

Consider this altered sample data set:

$test1 = [
    "blah1",
    "blah1",
    "blah2",
    "blah4",
    "blah5"
];

$test2 = [
    "blah1", // under-represented
    "blah2", // equally found
    "blah3", // not found
    "blah4", // over-represented
    "blah4", //       "
];

Below are four different functions (with indicative names) to offer varied utility.

Code: (Demo)

  • unilateral difference (iterated array searches):

    function removeBValuesFromA(array $a, array $b): array
    {
        foreach ($b as $bVal) {
            $k = array_search($bVal, $a);
            if ($k !== false) {
                unset($a[$k]);
            }
        }
        return array_values($a);
    }
    
  • bilateral difference (iterated array searches):

    function bidirectionalDiff(array $a, array $b): array
    {
        foreach ($b as $bKey => $bVal) {
            $aKey = array_search($bVal, $a);
            if ($aKey !== false) {
                unset($a[$aKey], $b[$bKey]);
            }
        }
        return array_merge($a, $b);
    }
    
  • unilateral difference (condense-compare-expand):

    function removeBValuesFromAViaCounts(array $a, array $b): array
    {
        $toRemove = array_count_values($b);
    
        $result = [];
        foreach (array_count_values($a) as $k => $count) {
            array_push(
                $result,
                ...array_fill(
                    0,
                    max(0, $count - ($toRemove[$k] ?? 0)),
                    $k
                )
            );
        }
        return $result;
    }
    
  • bilateral difference (condense-compare-expand):

    function bidirectionalDiffViaCounts(array $a, array $b): array
    {
        $bCounts = array_count_values($b);
    
        $result = [];
        foreach (array_count_values($a) as $k => $count) {
            array_push(
                $result,
                ...array_fill(
                    0,
                    abs($count - ($bCounts[$k] ?? 0)),
                    $k
                )
            );
            unset($bCounts[$k]);
        }
        foreach ($bCounts as $k => $count) {
            array_push(
                $result,
                ...array_fill(0, $count, $k)
            );
        }
        return $result;
    }
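
One detail worth keeping in mind with the search-based functions above: array_search() compares loosely unless its third argument is true. The sketch below uses a hypothetical helper, `removeOnce()`, with an opt-in `$strict` flag that is an assumption for illustration and not part of the functions above.

```php
<?php
// Hypothetical variant of the search-based removal with an opt-in strict flag.
function removeOnce(array $a, array $b, bool $strict = false): array
{
    foreach ($b as $bVal) {
        // The third argument switches array_search() to type-strict comparison.
        $k = array_search($bVal, $a, $strict);
        if ($k !== false) {
            unset($a[$k]);
        }
    }
    return array_values($a);
}

// Loose: the string '1' matches the integer 1 first, so the integer is removed.
$loose = removeOnce([1, '1'], ['1']);

// Strict: only the string '1' matches, so the integer stays.
$strictResult = removeOnce([1, '1'], ['1'], true);

var_export($loose);
var_export($strictResult);
```

For arrays that hold only strings, as in the sample data, both modes should give the same result; the flag only matters when value types are mixed.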
    

Execution:

var_export([
    'removeBValuesFromA' => removeBValuesFromA($test1, $test2),
    'bidirectionalDiff' => bidirectionalDiff($test1, $test2),
    'removeBValuesFromAViaCounts' => removeBValuesFromAViaCounts($test1, $test2),
    'bidirectionalDiffViaCounts' => bidirectionalDiffViaCounts($test1, $test2),
]);

Outputs:

array (
  'removeBValuesFromA' => 
  array (
    0 => 'blah1',
    1 => 'blah5',
  ),
  'bidirectionalDiff' => 
  array (
    0 => 'blah1',
    1 => 'blah5',
    2 => 'blah3',
    3 => 'blah4',
  ),
  'removeBValuesFromAViaCounts' => 
  array (
    0 => 'blah1',
    1 => 'blah5',
  ),
  'bidirectionalDiffViaCounts' => 
  array (
    0 => 'blah1',
    1 => 'blah4',
    2 => 'blah5',
    3 => 'blah3',
  ),
)
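
For completeness, here is a sketch of the unilateral, search-based approach applied to the arrays from the original question. The function name `subtractOneForOne()` is hypothetical; it repeats the same logic so that the snippet runs on its own.

```php
<?php
// Hypothetical standalone copy of the one-for-one, search-based removal.
function subtractOneForOne(array $a, array $b): array
{
    foreach ($b as $bVal) {
        $k = array_search($bVal, $a);
        if ($k !== false) {
            unset($a[$k]); // remove exactly one matching element per $b value
        }
    }
    return array_values($a);
}

// Arrays from the original question.
$test1 = ["blah1", "blah1", "blah1", "blah1", "blah2"];
$test2 = ["blah1", "blah1", "blah1", "blah2"];

$result = subtractOneForOne($test1, $test2);
var_export($result);
```

Three of the four "blah1" entries and the single "blah2" are cancelled out, leaving the one "blah1" that the question asks for.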
mickmackusa