
I have an indexed array of associative arrays like this:

[
    ['brand' => 'ABC', 'model' => 'xyz', 'size' => 13],
    ['brand' => 'QWE', 'model' => 'poi', 'size' => 23],
    ['brand' => 'ABC', 'model' => 'xyz', 'size' => 18]
];

I need to restructure the data by grouping on brand and model. If a brand and model combination occurs more than once, its size values should be collected into an indexed subarray. Otherwise, the size value can remain a single scalar value.

My desired result:

[
    ['brand' => 'ABC', 'model' => 'xyz', 'size' => [13, 18]],
    ['brand' => 'QWE', 'model' => 'poi', 'size' => 23],
];
mickmackusa
Derick

4 Answers


In terms of the algorithm, you simply need to:

  1. Create an empty array.

  2. Scan each array element in the source array creating a new element (in the empty array) for each new brand/model encountered and adding the size sub-array.

  3. If there's already a brand/model entry, simply add the size to the sub-array if it's not already present.

You could implement this as follows (crude, but it works):

<?php
    // Test data.
    $sourceArray = array(array('brand'=>'ABC', 'model'=>'xyz', 'size'=>13),
                         array('brand'=>'QWE', 'model'=>'poi', 'size'=>23),
                         array('brand'=>'ABC', 'model'=>'xyz', 'size'=>18),
                        );
    $newArray = array();

    // Create a new array from the source array. 
    // We'll use the brand/model as a lookup.
    foreach($sourceArray as $element) {

        $elementKey = $element['brand'] . '_' . $element['model'];

        // Does this brand/model combo already exist?
        if(!isset($newArray[$elementKey])) {
            // No - create the new element.
            $newArray[$elementKey] = array('brand'=>$element['brand'],
                                           'model'=>$element['model'], 
                                           'size'=>array($element['size']),
                                           );
        }
        else {
            // Yes - add the size (if it's not already present).
            if(!in_array($element['size'], $newArray[$elementKey]['size'])) {
                $newArray[$elementKey]['size'][] = $element['size'];
            }
        }
    }

    // *** DEBUG ***
    print_r($newArray);
?>

Incidentally, for ease of access I've made the size sub-array always an array, so you don't have to handle the case where it is a single value.

John Parker
  • It probably should be `if(!in_array($element['size'], $newArray[$elementKey]['size']))` though, to only add new sizes if the brand/model already exists :p. – wimvds Aug 12 '10 at 15:24
  • @middaparka: Try it with an array containing a duplicate size, you'll see what happens :p. – wimvds Aug 12 '10 at 15:32
  • Is it possible to use this code without knowing the keys? The data is not uniform and the keys can be any word. For example, instead of `$elementKey = $element['brand'] . '_' . $element['model'];`, something like `$elementKey = $element[$key1] . '_' . $element[$key2];`? – Derick Aug 12 '10 at 15:44
  • @Derick - Not really, if only because you need to generate a lookup from some of the keys. That said, as long as you always had 'n' keys (which were used in the lookup) and the rest of the data (other than size) was irrelevant, you could use array_keys to add the rest during the "No - create the new element" bit. – John Parker Aug 12 '10 at 16:53
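
Following up on the comment thread above: if the grouping columns are not known in advance, one option is to pass them in as parameters. This is only a minimal sketch, assuming every row contains each grouping key and the collected column. The names `groupRows()`, `$groupKeys`, and `$collectKey` are hypothetical and not part of the answer above, and `json_encode()` is swapped in for the `'_'` delimiter to build the lookup key.

```php
<?php
/**
 * Hypothetical helper: group rows by arbitrary columns and collect one column.
 * Assumes every row contains all of $groupKeys as well as $collectKey.
 */
function groupRows(array $rows, array $groupKeys, string $collectKey): array
{
    $grouped = [];
    foreach ($rows as $row) {
        // Pick out the grouping values in the order given by $groupKeys.
        $keyValues = [];
        foreach ($groupKeys as $key) {
            $keyValues[] = $row[$key];
        }
        // Encode the values as JSON so no hand-picked delimiter is needed.
        $lookup = json_encode($keyValues);

        if (!isset($grouped[$lookup])) {
            // First time this combination is seen: start the sub-array.
            $row[$collectKey] = [$row[$collectKey]];
            $grouped[$lookup] = $row;
        } elseif (!in_array($row[$collectKey], $grouped[$lookup][$collectKey], true)) {
            // Combination already seen: add the value if it is new.
            $grouped[$lookup][$collectKey][] = $row[$collectKey];
        }
    }
    // Drop the temporary lookup keys.
    return array_values($grouped);
}

$rows = [
    ['brand' => 'ABC', 'model' => 'xyz', 'size' => 13],
    ['brand' => 'QWE', 'model' => 'poi', 'size' => 23],
    ['brand' => 'ABC', 'model' => 'xyz', 'size' => 18],
];
$grouped = groupRows($rows, ['brand', 'model'], 'size');
print_r($grouped);
```

Columns other than the collected one are taken from the first row seen for each combination, which matches the "rest of the data was irrelevant" caveat in the comment.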
// $array is the array in your first example.
$new_array = [];

foreach ($array as $item) {
  // Combine brand and model into a single lookup key.
  $itemname = $item["brand"] . "_" . $item["model"];

  $new_array[$itemname]["brand"]  = $item["brand"];
  $new_array[$itemname]["model"]  = $item["model"];
  $new_array[$itemname]["size"][] = $item["size"];
}
Knarf

"Upgrade" to knarf's snippet....

$new_array = [];

foreach($array as $item) {
  $itemname = $item["brand"] . "_" . $item["model"];

  $new_array[$itemname]["brand"]  = $item["brand"];
  $new_array[$itemname]["model"]  = $item["model"];
  $new_array[$itemname]["size"][ $item["size"] ] = 1;
}

foreach($new_array as $itemname=>$data) {
  if(isset($data['size']) && is_array($data['size'])) {
    $new_array[$itemname]['size']=array_keys($new_array[$itemname]['size']);
  }
}

No duplicates anymore...
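
One thing to keep in mind with this approach is that PHP casts array keys, so a numeric string such as '13' becomes the integer 13 when it is used as a key. The sketch below uses made-up size values to show the effect. The `$sizes` and `$seen` names are purely illustrative.

```php
<?php
// Hypothetical size values: an integer, a numeric string, and a non-numeric string.
$sizes = [13, '13', 'M'];

$seen = [];
foreach ($sizes as $size) {
    // The string '13' is stored under the same integer key as 13.
    $seen[$size] = 1;
}

$uniqueSizes = array_keys($seen);
var_export($uniqueSizes); // the numeric string comes back as an integer
```

Whether merging 13 and '13' is desirable depends on the data. If the original types matter, a comparison with `in_array()` in strict mode may be a better fit.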

vlad b.
  • `if(isset($data['size']) && is_array($data['size'])) {` is 100% useless. The subarray key is unconditionally declared and will always be an array and will always be non-empty. – mickmackusa Jun 14 '20 at 11:17
  • In a perfect world I would agree with you. Writing overly-verbose validation helped make the intent more clear when reading the code months/years later and also is a good practice to validate stuff - you never know what and how much code gets added in the middle of existing code. This kind of coding style saved the day multiple times over the last year - I'm keeping it. – vlad b. Jun 15 '20 at 07:18
  • No, you aren't understanding me. This isn't my opinion. It is a true, logical, provable fact. You are writing a second loop on the `$new_array` array that your first loop generates. So, this is not about "a perfect world" -- this is about the reality of computer science. Your script will consistently have no reason to perform these checks. Arguing anything to the contrary is pointless, is likely to confuse researchers, and teaches senseless coding practices. – mickmackusa Jun 15 '20 at 11:22

I am going to interpret this question very literally and provide the exact described output structure.

Temporary compound keys allow the very swift lookup of previously encountered brand - model pairs. It is important that a delimiting character (that is not used in either value) is used to separate each value in the compound string so that there are no accidental "data collisions".
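
To make the collision risk concrete, here is a small sketch with made-up brand/model values (none of them come from the question). It shows two different pairs producing the same key without a delimiter, and also what happens when the delimiter itself appears in the data.

```php
<?php
// Two different brand/model pairs (hypothetical values for illustration).
$a = ['brand' => 'AB', 'model' => 'C'];
$b = ['brand' => 'A', 'model' => 'BC'];

// Without a delimiter both rows produce the same lookup key: 'ABC'.
$collidesWithoutDelimiter = ($a['brand'] . $a['model']) === ($b['brand'] . $b['model']);

// With a delimiter that appears in neither value, the keys stay distinct.
$distinctWithDelimiter = ($a['brand'] . '_' . $a['model']) !== ($b['brand'] . '_' . $b['model']);

// If the delimiter does occur in the data, the collision comes back.
$c = ['brand' => 'A_B', 'model' => 'C'];
$d = ['brand' => 'A', 'model' => 'B_C'];
$collidesWithDelimiterInData = ($c['brand'] . '_' . $c['model']) === ($d['brand'] . '_' . $d['model']);

var_dump($collidesWithoutDelimiter, $distinctWithDelimiter, $collidesWithDelimiterInData);
```

All three flags evaluate to true, which is why the delimiter needs to be a character that cannot appear in the grouped values.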

If a given "brand-model" combination occurs only once, the original row structure is pushed into the result array. Otherwise, the "size" data is converted into an indexed array and subsequent unique size values are pushed into the subarray.

Classic foreach(): (Demo) (with array_unique to remove potential duplicate sizes)

$result = [];
foreach ($array as $row) {
    $compositeKey = $row['brand'] . '_' . $row['model'];
    if (!isset($result[$compositeKey])) {
        $result[$compositeKey] = $row;
    } else {
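        // Cast the stored size to an array, append the new size, and drop duplicates.
        $result[$compositeKey]['size'] = array_unique(
            array_merge(
                (array)$result[$compositeKey]['size'],
                [$row['size']]
            )
        );
    }
}
// Remove the temporary composite keys before output.
var_export(array_values($result));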

Functional programming with array_reduce(): (Demo)

var_export(
    array_values(
        array_reduce(
            $array,
            function ($carry, $row) {
                $compositeKey = $row['brand'] . '_' . $row['model'];
                if (!isset($carry[$compositeKey])) {
                    $carry[$compositeKey] = $row;
                } else {
                    $carry[$compositeKey]['size'] = array_merge(
                        (array)$carry[$compositeKey]['size'],
                        [$row['size']]
                    );
                }
                return $carry;
            },
            [] // initial carry, so an empty input yields [] instead of null
        )
    )
);

If I am being honest, I would create a consistent data structure for my output and size would ALWAYS be a subarray. Here's how to modify the above snippet to cast the size element as an array on the first encounter and push all subsequently encountered size values into that group's subarray: (Demo)

$result = [];
foreach ($array as $row) {
    $compositeKey = $row['brand'] . '_' . $row['model'];
    if (!isset($result[$compositeKey])) {
        $row['size'] = (array)$row['size'];
        $result[$compositeKey] = $row;
    } else {
        $result[$compositeKey]['size'][] = $row['size'];
    }
}
var_export(array_values($result));
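
To illustrate why the consistent shape is convenient, here is a small sketch of consuming code. The `$groups` data is what the snippet above produces for the question's sample input, and the `implode()` formatting is just one hypothetical use.

```php
<?php
// Result of the "always a subarray" snippet for the sample data in the question.
$groups = [
    ['brand' => 'ABC', 'model' => 'xyz', 'size' => [13, 18]],
    ['brand' => 'QWE', 'model' => 'poi', 'size' => [23]],
];

$lines = [];
foreach ($groups as $group) {
    // No is_array() check is needed because size is always an array.
    $lines[] = $group['brand'] . ' ' . $group['model'] . ': ' . implode(', ', $group['size']);
}

echo implode("\n", $lines), "\n";
```

With the mixed scalar/array structure, the same loop would need a type check or a cast before calling `implode()`.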
mickmackusa