2

I need to merge an array of rows into groups and use the lowest id in each group as the first level key. Within each group, all encountered ids (excluding the lowest) should be gathered in a subarray called mergedWith.

Sample input:

[
    1649 => ["firstName" => "jack", "lastName" => "straw"],
    1650 => ["firstName" => "jack", "lastName" => "straw"],
    1651 => ["firstName" => "jack", "lastName" => "straw"],
    1652 => ["firstName" => "jack", "lastName" => "straw"],
]

My desired result:

[
    1649 => [
        "firstName" => "jack",
        "lastName" => "straw",
        "mergedWith" => [1650, 1651, 1652]
    ]
]

I have a loop that can pull out duplicates and find the lowest id in each group, but I'm not sure of the right way to collapse them into one.

I've shown the desired result of a search that has identified ids with duplicate entries in those particular fields. I just want to refine it further so that, instead of deleting the duplicates, it adds a field at the end of each group like ["mergedWith" => [1650, 1651, 1652]].

mickmackusa
weekapaug
  • Show the code that you have and explain how the result differs from what you want. – Patrick Q Sep 20 '18 at 20:29
  • does the array always have the same values but with different keys? – mrbm Sep 20 '18 at 20:30
  • Create an associative array that uses the firstname/lastname as the keys. Loop through this array. Add the element to the associative array if the key doesn't exist, otherwise update the `mergedWith` column with this element's index. – Barmar Sep 20 '18 at 20:42
  • This might help you: https://stackoverflow.com/questions/307674/how-to-remove-duplicate-values-from-a-multi-dimensional-array-in-php – Salvatore Q Zeroastro Sep 20 '18 at 20:43
  • You should really update your question to include the information you added in your latest comment. It's a totally different ask. – M. Eriksson Sep 20 '18 at 20:44
  • added to question, removed from comments – weekapaug Sep 20 '18 at 20:48
  • `"id" =>'1650' "id" =>'1651' "id" =>'1652'` is an impossibility. Please provide a realistic [mcve]. – mickmackusa Jul 27 '22 at 06:14

4 Answers

4

One way to do it is to group by first name and last name, and then reverse the grouping to get a single id. krsort the input beforehand to make sure you get the lowest id.

krsort($input);

//group
foreach ($input as $id => $person) {
    // overwrite the id each time, but since the input is sorted by id in descending order,
    // the last one will be the lowest id
    $names[$person['lastName']][$person['firstName']] = $id;
}

// ungroup to get the result
foreach ($names as $lastName => $firstNames) {
    foreach ($firstNames as $firstName => $id) {
        $result[$id] = ['firstName' => $firstName, 'lastName' => $lastName];
    }
}

Edit: not too much different based on your updated question. Just keep all the ids instead of a single one.

krsort($input);

foreach ($input as $id => $person) {
    //                   append instead of overwrite ↓ 
    $names[$person['lastName']][$person['firstName']][] = $id;
}
foreach ($names as $lastName => $firstNames) {
    foreach ($firstNames as $firstName => $ids) {
        // $ids is already in descending order based on the initial krsort
        $id = array_pop($ids);  // removes the last (lowest) id and returns it
        $result[$id] = [
            'firstName' => $firstName,
            'lastName' => $lastName,
            'merged_with' => implode(',', $ids)
        ];
    }
}
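
If the merged ids are wanted as an array in ascending order under the key mergedWith, as in the question's desired result, the same approach might be adapted roughly as follows. This is only a sketch using the question's sample input; the array_reverse() call is an addition that undoes the descending order left by krsort().

```php
<?php
// Sample input taken from the question.
$input = [
    1649 => ['firstName' => 'jack', 'lastName' => 'straw'],
    1650 => ['firstName' => 'jack', 'lastName' => 'straw'],
    1651 => ['firstName' => 'jack', 'lastName' => 'straw'],
    1652 => ['firstName' => 'jack', 'lastName' => 'straw'],
];

krsort($input); // highest id first, so the lowest id ends up last in each group

$names = [];
foreach ($input as $id => $person) {
    // collect every id that shares this lastName/firstName pair
    $names[$person['lastName']][$person['firstName']][] = $id;
}

$result = [];
foreach ($names as $lastName => $firstNames) {
    foreach ($firstNames as $firstName => $ids) {
        $id = array_pop($ids); // removes and returns the last (lowest) id
        $result[$id] = [
            'firstName'  => $firstName,
            'lastName'   => $lastName,
            // remaining ids are descending; reverse them to get ascending order
            'mergedWith' => array_reverse($ids),
        ];
    }
}

var_export($result);
```

With more than one group, the first level keys of $result follow the descending group order, so a final ksort($result) may be wanted.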
Don't Panic
  • Sorry for last minute change, as I started trying some things out i realized I needed notations on the merged accounts, not just to hide or delete them – weekapaug Sep 20 '18 at 20:54
  • @weekapaug It's okay as far as my answer is concerned, with this approach the code isn't much different either way. But be careful about modifying your question too much after asking. You really should avoid invalidating existing answers. – Don't Panic Sep 20 '18 at 20:55
  • Agreed, I have an issue where the tab key submits before I'm done, then I scramble to both try the answers and update all at once... – weekapaug Sep 20 '18 at 20:56
  • @weekapaug if you accidentally submit a question before you're ready, you can always quickly delete, make your edits, then undelete. – Don't Panic Sep 20 '18 at 20:58
  • I am working on my stack overflow etiquette, and appreciate the understanding – weekapaug Sep 20 '18 at 20:59
2
// Sample input (ids deliberately out of order)
$resArr = array(
    1650 => array(
        "firstName" => "jack",
        "lastName" => "straw"
    ),
    1649 => array(
        "firstName" => "jack",
        "lastName" => "straw"
    ),
    1651 => array(
        "firstName" => "jack",
        "lastName" => "straw"
    ),
    1652 => array(
        "firstName" => "jack",
        "lastName" => "straw"
    ),
    1653 => array(
        "firstName" => "jack1",
        "lastName" => "straw"
    ),
    1654 => array(
        "firstName" => "jack1",
        "lastName" => "straw"
    )
);

ksort($resArr); // ascending ids, so the lowest id of each group comes first
// keep only the first (lowest-id) row for each distinct firstName/lastName pair
$tempArr = array_unique($resArr, SORT_REGULAR);
foreach ($tempArr as $key => $value) {
    foreach ($resArr as $key1 => $value2) {
        // every row with the same names (including the kept row itself) is recorded
        if ($value['firstName'] == $value2['firstName'] && $value['lastName'] == $value2['lastName']) {
            $tempArr[$key]["mergedWith"][] = $key1;
        }
    }
}
print_r($tempArr);

Output
Array
(
    [1649] => Array
        (
            [firstName] => jack
            [lastName] => straw
            [mergedWith] => Array
                (
                    [0] => 1649
                    [1] => 1650
                    [2] => 1651
                    [3] => 1652
                )

        )

    [1653] => Array
        (
            [firstName] => jack1
            [lastName] => straw
            [mergedWith] => Array
                (
                    [0] => 1653
                    [1] => 1654
                )

        )

)
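
Note that the output above lists each group's own id (1649, 1653) inside mergedWith, whereas the question asks for the lowest id to be excluded. One possible adjustment, offered only as a sketch, is to skip the row whose key matches the group key. The shortened input below is illustrative.

```php
<?php
// Illustrative input, a shortened version of the data above.
$resArr = [
    1650 => ['firstName' => 'jack',  'lastName' => 'straw'],
    1649 => ['firstName' => 'jack',  'lastName' => 'straw'],
    1651 => ['firstName' => 'jack',  'lastName' => 'straw'],
    1653 => ['firstName' => 'jack1', 'lastName' => 'straw'],
    1654 => ['firstName' => 'jack1', 'lastName' => 'straw'],
];

ksort($resArr); // lowest id of each group comes first
$tempArr = array_unique($resArr, SORT_REGULAR); // keeps the first row per name pair

foreach ($tempArr as $key => $value) {
    foreach ($resArr as $key1 => $value2) {
        if (
            $key !== $key1 // skip the group's own (lowest) id
            && $value['firstName'] === $value2['firstName']
            && $value['lastName'] === $value2['lastName']
        ) {
            $tempArr[$key]['mergedWith'][] = $key1;
        }
    }
}

print_r($tempArr);
```

With this change, a row that has no duplicates gets no mergedWith element at all. The nested loops still compare every kept row against every input row, so the cost grows with the product of the two array sizes.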
Bhuwan Bisht
0

@Don'tPanic's answer is using a preliminary loop to create a lookup array, then nested loops to form the desired result.

I recommend a simpler approach without nested loops. In the first loop, overpopulate the mergedWith element in each group -- this will be quite fast because there are no function calls and no conditions (aside from the null coalescing assignment operator, ??=). Then use a second loop to pull the first element from the mergedWith subarray -- this will apply the lowest id as the first level key and ensure that the first level key no longer exists in the group's subarray.

Code:

ksort($array);
$temp = [];
foreach ($array as $key => $row) {
    $compositeKey = $row['firstName'] . '-' . $row['lastName'];
    $temp[$compositeKey] ??= $row;
    $temp[$compositeKey]['mergedWith'][] = $key;
}

$result = [];
foreach ($temp as $row) {
    $result[array_shift($row['mergedWith'])] = $row;
}
var_export($result);
mickmackusa
0

Assuming your first level keys are always in ascending order like in your sample array (otherwise just call ksort() to apply ascending sorting based on the first level), use a single loop with a reference variable. If the identifying values are encountered a second time, push the key into the reference and remove the current row from the original array.

Code:

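foreach ($array as $key => &$row) {
    $compositeKey = $row['firstName'] . '-' . $row['lastName'];
    if (!isset($ref[$compositeKey])) {
        // first (lowest-id) occurrence: remember a reference to this row
        $ref[$compositeKey] = &$row;
    } else {
        // duplicate: record its id on the first occurrence, then drop the row
        $ref[$compositeKey]['mergedWith'][] = $key;
        unset($array[$key]);
    }
}
unset($row); // break the reference left over from the loop
var_export($array);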
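
The same single-pass idea can also be written without references, by remembering the first id seen for each name pair and appending later ids to that row. The sketch below is one way to do it; the function name mergeDuplicates and the lookup variable $firstIdFor are made up for illustration, and the input is a hypothetical variation of the question's sample.

```php
<?php
// Hypothetical helper: groups rows by firstName/lastName, keyed by the lowest id.
function mergeDuplicates(array $rows): array
{
    ksort($rows); // make sure the lowest id of each group is seen first

    $firstIdFor = []; // composite name key => lowest id seen for it
    $result = [];
    foreach ($rows as $id => $row) {
        $compositeKey = $row['firstName'] . '-' . $row['lastName'];
        if (!isset($firstIdFor[$compositeKey])) {
            // first occurrence: this id becomes the group's first level key
            $firstIdFor[$compositeKey] = $id;
            $result[$id] = $row;
        } else {
            // duplicate: record its id on the group's first row
            $result[$firstIdFor[$compositeKey]]['mergedWith'][] = $id;
        }
    }

    return $result;
}

$rows = [
    1650 => ['firstName' => 'jack',  'lastName' => 'straw'],
    1649 => ['firstName' => 'jack',  'lastName' => 'straw'],
    1653 => ['firstName' => 'jack1', 'lastName' => 'straw'],
    1651 => ['firstName' => 'jack',  'lastName' => 'straw'],
];

var_export(mergeDuplicates($rows));
```

Unlike the reference-based loop, this version leaves the input array untouched and returns a new array. Rows without duplicates come back without a mergedWith element.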
mickmackusa