
I have a 2d array containing rows where a user might be represented more than once. I need to remove duplicate instances of user data, but I don't want to lose any meaningful data in the process. Rows with a non-empty value in the Flying Tour column should be prioritized over a row with an empty value in the same column.

Sample data:

$data = [
    [
        'Access ID' => 12345,
        'Registration Date' => '2018-02-27',
        'First Name' => 'Damian',
        'Last Name' => 'Martin',
        'Flying Tour' => ''
    ],
    [
        'Access ID' => 12345,
        'Registration Date' => '2018-02-27',
        'First Name' => 'Damian',
        'Last Name' => 'Martin',
        'Flying Tour' => 'Yes going'
    ],
    [
        'Access ID' => 789456,
        'Registration Date' => '2018-03-27',
        'First Name' => 'Ricky',
        'Last Name' => 'Smith',
        'Flying Tour' => ''
    ],
    [
        'Access ID' => 789456,
        'Registration Date' => '2018-03-27',
        'First Name' => 'Ricky',
        'Last Name' => 'Smith',
        'Flying Tour' => 'Two way going',
    ],
    [
        'Access ID' => 987654,
        'Registration Date' => '2018-04-27',
        'First Name' => 'Darron',
        'Last Name' => 'Butt',
        'Flying Tour' => ''
    ]
];

My code:

$results = [];
      foreach($data as $input){

      $isDuplicate = false;
      foreach($results as $result){
        if(
            strtolower($input['First Name'])===strtolower($result['First Name']) &&
            strtolower($input['Last Name'])===strtolower($result['Last Name'])      &&
            strtolower($input['Registration ID'])===strtolower($result['Registration ID']) &&
            strtolower(!empty($input['Flying Tour']))
        ){
            //a duplicate was found in results
            $isDuplicate = true;
            break;
        }
      }
      //if no duplicate was found
      if(!$isDuplicate) $results[]=$input;
}

Desired result:

Array
(
    [0] => Array
        (
            [Access ID] => 12345
            [Registration Date] => 2018-02-27
            [First Name] => Damian
            [Last Name] => Martin
            [Flying Tour] => Yes going
        )

    [1] => Array
        (
            [Access ID] => 789456
            [Registration Date] => 2018-03-27
            [First Name] => Ricky
            [Last Name] => Smith
            [Flying Tour] => Two way going
        )

    [2] => Array
        (
            [Access ID] => 987654
            [Registration Date] => 2018-04-27
            [First Name] => Darron
            [Last Name] => Butt
            [Flying Tour] => 
        )

)


mickmackusa
Saurav

7 Answers


The point here is that for the first run, $results is still empty. Thus, the whole logic that can set $isDuplicate = true never runs for the first instance.

This means that for the first record, $isDuplicate = false and thus it will always be added.

Furthermore, strtolower(!empty($input['Flying Tour'])) is odd: !empty() returns a boolean, which strtolower() (in non-strict mode) silently converts to the string "1" or "", so the call adds nothing over the bare !empty() check.

You should move the !empty($input['Flying Tour']) check outside the inner foreach loop, since it does not depend on the rows already stored in $results. This will solve the problem.

if(!$isDuplicate && !empty($input['Flying Tour'])) $results[]=$input;

If you want to learn and improve your code, have a look at array_filter(); it is made for this purpose: filtering array results.

--edit

foreach ($results as $id => $result) {
    if (
        strtolower($input['First Name']) === strtolower($result['First Name']) &&
        strtolower($input['Last Name']) === strtolower($result['Last Name']) &&
        // the sample data has no 'Registration ID' column, so compare 'Access ID'
        (string) $input['Access ID'] === (string) $result['Access ID']
    ) {
        // If the stored row has no Flying Tour value, copy it from the current row,
        // but still mark the current row as a duplicate(!)
        if (empty($result['Flying Tour'])) {
            $results[$id]['Flying Tour'] = $input['Flying Tour'];
        }

        // a duplicate was found in results
        $isDuplicate = true;
        break;
    }
}
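
The fragment above only makes sense inside the outer loop over $data. A minimal self-contained sketch of the whole routine might look like this. The function name dedupeUsers and the shortened sample rows are hypothetical, and it assumes that users are matched on Access ID plus case-insensitive first and last names.

```php
<?php
// Hypothetical helper; the name does not come from the original post.
function dedupeUsers(array $data): array
{
    $results = [];
    foreach ($data as $input) {
        $isDuplicate = false;
        foreach ($results as $id => $result) {
            if (
                strtolower($input['First Name']) === strtolower($result['First Name']) &&
                strtolower($input['Last Name']) === strtolower($result['Last Name']) &&
                (string) $input['Access ID'] === (string) $result['Access ID']
            ) {
                // Same user: fill in a missing Flying Tour value, but do not add a second row.
                if (empty($result['Flying Tour'])) {
                    $results[$id]['Flying Tour'] = $input['Flying Tour'];
                }
                $isDuplicate = true;
                break;
            }
        }
        // Only rows for users not seen before are appended.
        if (!$isDuplicate) {
            $results[] = $input;
        }
    }
    return $results;
}

// Shortened sample rows (assumed for illustration).
$sample = [
    ['Access ID' => 12345, 'First Name' => 'Damian', 'Last Name' => 'Martin', 'Flying Tour' => ''],
    ['Access ID' => 12345, 'First Name' => 'Damian', 'Last Name' => 'Martin', 'Flying Tour' => 'Yes going'],
    ['Access ID' => 987654, 'First Name' => 'Darron', 'Last Name' => 'Butt', 'Flying Tour' => ''],
];
$deduped = dedupeUsers($sample);
print_r($deduped);
```

Users without any Flying Tour value are still kept, because the Flying Tour check only decides whether a stored row is updated, not whether a new user is added.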
Jeffrey
  • if(!$isDuplicate && !empty($input['Flying Tour'])) $results[]=$input; was good, but it returns only the rows that have a Flying Tour value – Saurav Mar 26 '18 at 09:17
  • If a row has no Flying Tour value, it should be returned too – Saurav Mar 26 '18 at 09:18
  • Ah, now I get it. You want to compare the 3 values against the rows already in $results (you got that part). Then you also want to check whether the matching row in `$results` currently has no value for `Flying Tour`. If that is the case, you want to update that row with the new value from `$input`, but still mark the input row as a duplicate; otherwise it will show up twice. – Jeffrey Mar 26 '18 at 09:36
  • Thank you all, especially @Oluwafemi Sule – Saurav Mar 26 '18 at 10:02

Use foreach() along with array keys to check for duplicates:

$results = [];
foreach ($data as $input) {
    // Composite key that identifies one user
    $key = $input['Access ID'] . '_' . $input['First Name'] . '_' . $input['Last Name'];

    // Store the first row for each user, or replace a stored row whose Flying Tour is empty
    if (!isset($results[$key]) || $results[$key]['Flying Tour'] == '') {
        $results[$key] = $input;
    }
}

$results = array_values($results); // re-index the result from 0
//array_multisort( array_column($results, "First Name"), SORT_ASC, $results );
echo "<pre>";
print_r($results);

Sample Output: https://3v4l.org/2KcSN

Alive to die - Anant

Compose a key from values that should be unique.

If the key composed for the current row doesn't exist in $results, add the row to the results.

If the composed key exists but the stored row's Flying Tour value is empty, replace the stored row with the current one.

$results = [];

foreach ($data as $input) {
    // Separator first: the legacy (array, separator) argument order is not supported in PHP 8
    $key = implode('-', [
        $input['First Name'],
        $input['Last Name'],
        $input['Access ID']
    ]);

    if (array_key_exists($key, $results) 
        && !empty($results[$key]['Flying Tour'])) continue;

    $results[$key] = $input;
}

var_dump(array_values($results));
Oluwafemi Sule

Use foreach to iterate over a copy of $data, record each Access ID already seen in the $duplicates array, and unset any later row whose Access ID has already been recorded. This keeps the first row found for each Access ID.

Alternative:

$duplicates = array();
$results = $data;

foreach( $results as $k=>$v ) {
    if( isset( $duplicates[$v['Access ID']] ) ) {
        unset( $results[$k] );
    }
    else $duplicates[$v['Access ID']] = TRUE;
}

print_r( $results );
Karlo Kokkak
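echo '<pre>';
// Reverse the rows so that, for each user, the last row in the original order is seen first
$data = array_reverse($data);

// Build one comparison string per row from every column except Flying Tour
$combined = array_map(function ($row) {
    $row = array_map('trim', $row);
    unset($row['Flying Tour']);
    return implode('|', $row);
}, $data);

// array_unique() keeps the first occurrence of each string and preserves its key
$uniqueElements = array_unique($combined);

$uniqueIndexes = array_keys($uniqueElements);
$newData = [];

// Collect the rows whose index survived array_unique()
foreach ($data as $key => $value) {
    if (in_array($key, $uniqueIndexes)) {
        $newData[] = $value;
    }
}

print_r($newData);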
mickmackusa
Mihir Bhende
$res = $data;            // assumption: $res is the input array ($data in the question)
$arrayvalues = array();  // Access IDs seen so far
foreach ($res as $key => $value) {
    if (isset($arrayvalues[$value['Access ID']])) {   // this Access ID was already seen
        unset($res[$key]);                            // drop the duplicate row
    } else {
        $arrayvalues[$value['Access ID']] = $value['Access ID'];   // remember this Access ID
    }
}

// Copy the remaining rows into a new array with consecutive keys
$i = 0;
$testvalues = array();
foreach ($res as $key => $value) {
    $testvalues[$i] = $value;
    $i++;
}

echo "<pre>";
print_r($testvalues);   // the de-duplicated rows
echo "</pre>";
mickmackusa
Muthusamy

To group by unique users, you only need to reference their unique access number. Use that value as grouping first-level keys in the result array. As you iterate, if there is no saved row for the access id, then save the row. If the row exists, but the Flying Tour value is empty, then save the current row in its place. When finished iterating, you can remove the associative first-level keys by calling array_values().

Code: (Demo)

$result = [];
foreach ($data as $row) {
    if (empty($result[$row['Access ID']]['Flying Tour'])) {
        $result[$row['Access ID']] = $row;
    }
}
var_export(array_values($result));
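
To see what happens when the filled-in row comes before the empty one, the same loop can be wrapped in a function and run on a small input. This is only a sketch: the function name keepPreferredRows and the shortened rows are illustrative assumptions, not part of the answer.

```php
<?php
// Hypothetical wrapper around the loop shown above.
function keepPreferredRows(array $data): array
{
    $result = [];
    foreach ($data as $row) {
        // empty() is true when no row is stored yet and when the stored Flying Tour is ''
        if (empty($result[$row['Access ID']]['Flying Tour'])) {
            $result[$row['Access ID']] = $row;
        }
    }
    // Drop the Access ID keys so the result is indexed from 0
    return array_values($result);
}

// Shortened rows (assumed for illustration); the filled-in row comes first for user 1.
$rows = [
    ['Access ID' => 1, 'Flying Tour' => 'Yes going'],
    ['Access ID' => 1, 'Flying Tour' => ''],
    ['Access ID' => 2, 'Flying Tour' => ''],
    ['Access ID' => 2, 'Flying Tour' => ''],
];
$unique = keepPreferredRows($rows);
var_export($unique);
```

One design point to keep in mind: empty() also treats the string '0' as empty, so a stored Flying Tour value of '0' would be overwritten. If such values can occur, comparing against '' explicitly would avoid that.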
mickmackusa