The task is to merge two arrays of subarrays "inexpensively", pairing up the subarrays that share a matching key-value pair. E.g.:

Array 1:

Array
(
[0] => Array
    (
        [count] => 1
        [da_table] => article
        [da_class] => classes\elements\tables\Article
        [da_page_class] => Page_Article
    )

[1] => Array
    (
        [count] => 2
        [da_table] => client_contract_specification_price
        [da_class] => classes\elements\tables\ClientContractSpecificationPrice
        [da_page_class] => Page_ClientContractSpecification
    )

[2] => Array
    (
        [count] => 2
        [da_table] => supplier
        [da_class] => classes\elements\tables\Supplier
        [da_page_class] => Page_Supplier
    )

)

Array 2:

Array
(
[0] => Array
    (
        [name] => Articles
        [name_short] => 
        [da_page_class] => Page_Article
    )

[1] => Array
    (
        [name] => Client contract specifications
        [name_short] => cc_specifications
        [da_page_class] => Page_ClientContractSpecification
    )

[2] => Array
    (
        [name] => Suppliers
        [name_short] => 
        [da_page_class] => Page_Supplier
    )

)

How can I merge the two arrays above on matching [da_page_class] => ... pairs, so that each subarray of the result contains the key-value pairs of both the first and the second array, i.e.:

...
[0] => Array
    (
        [count] => 1
        [da_table] => article
        [da_class] => classes\elements\tables\Article
        [da_page_class] => Page_Article
        [name] => Articles
        [name_short] => 
    )
...

Additional requirements: Subarrays may come in random order. Also, there can be "orphans": subarrays whose ['da_page_class'] value has no match in the other array. These should be ignored.

bbe

2 Answers

Well, you simply iterate over the array elements and combine them:

<?php
$data1 = [
    [
        'count' => 1,
        'da_table' => 'article',
        'da_class' => 'classes\elements\tables\Article',
        'da_page_class' => 'Page_Article'
    ],
    [
        'count' => 2,
        'da_table' => 'client_contract_specification_price',
        'da_class' => 'classes\elements\tables\ClientContractSpecificationPrice',
        'da_page_class' => 'Page_ClientContractSpecification'
    ],
    [
        'count' => 2,
        'da_table' => 'supplier',
        'da_class' => 'classes\elements\tables\Supplier',
        'da_page_class' => 'Page_Supplier'
    ]
];

$data2 = [
    [
        'name' => 'Articles',
        'name_short' => null,
        'da_page_class' => 'Page_Article'
    ],
    [
        'name' => 'Client contract specifications',
        'name_short' => 'cc_specifications',
        'da_page_class' => 'Page_ClientContractSpecification'
    ],
    [
        'name' => 'Suppliers',
        'name_short' => null,
        'da_page_class' => 'Page_Supplier'
    ]
];

$output = [];
for ($i=0; $i<count($data1); $i++) {
    $output[$i] = array_merge($data1[$i], $data2[$i]);
}
print_r($output);

The output obviously is:

Array
(
    [0] => Array
        (
            [count] => 1
            [da_table] => article
            [da_class] => classes\elements\tables\Article
            [da_page_class] => Page_Article
            [name] => Articles
            [name_short] =>
        )

    [1] => Array
        (
            [count] => 2
            [da_table] => client_contract_specification_price
            [da_class] => classes\elements\tables\ClientContractSpecificationPrice
            [da_page_class] => Page_ClientContractSpecification
            [name] => Client contract specifications
            [name_short] => cc_specifications
        )

    [2] => Array
        (
            [count] => 2
            [da_table] => supplier
            [da_class] => classes\elements\tables\Supplier
            [da_page_class] => Page_Supplier
            [name] => Suppliers
            [name_short] =>
        )

)

Alternatively you could also merge the contents of the elements of the second array into the corresponding elements of the first array. That reduces the memory footprint for large data sets.
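
A minimal sketch of that in-place variant, assuming (as in the first example) that both arrays have the same order and length. The shortened sample rows are illustrative only:

```php
<?php
// Hypothetical sample data, shortened from the arrays above.
$data1 = [
    ['count' => 1, 'da_page_class' => 'Page_Article'],
    ['count' => 2, 'da_page_class' => 'Page_Supplier'],
];
$data2 = [
    ['name' => 'Articles', 'da_page_class' => 'Page_Article'],
    ['name' => 'Suppliers', 'da_page_class' => 'Page_Supplier'],
];

// Iterate by reference so each element of $data1 is extended in place
// instead of building a separate $output array.
foreach ($data1 as $i => &$entry) {
    if (isset($data2[$i])) {
        $entry = array_merge($entry, $data2[$i]);
    }
}
unset($entry); // break the reference left over from the loop

print_r($data1);
```

The `unset($entry)` after the loop matters: without it, a later assignment to `$entry` would silently overwrite the last element of `$data1`.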


Considering the additional requirement you specified in your comment I changed the merge strategy to allow for arbitrary orders of the two sets:

<?php
$data1 = [
    [
        'count' => 1,
        'da_table' => 'article',
        'da_class' => 'classes\elements\tables\Article',
        'da_page_class' => 'Page_Article'
    ],
    [
        'count' => 2,
        'da_table' => 'client_contract_specification_price',
        'da_class' => 'classes\elements\tables\ClientContractSpecificationPrice',
        'da_page_class' => 'Page_ClientContractSpecification'
    ],
    [
        'count' => 2,
        'da_table' => 'supplier',
        'da_class' => 'classes\elements\tables\Supplier',
        'da_page_class' => 'Page_Supplier'
    ]
];

$data2 = [
    [
        'name' => 'Articles',
        'name_short' => null,
        'da_page_class' => 'Page_Article'
    ],
    [
        'name' => 'Client contract specifications',
        'name_short' => 'cc_specifications',
        'da_page_class' => 'Page_ClientContractSpecification'
    ],
    [
        'name' => 'Suppliers',
        'name_short' => null,
        'da_page_class' => 'Page_Supplier'
    ]
];

$output = [];
array_walk($data1, function($entry, $key) use (&$output, $data2) {
    $output[$key] = $entry;
    foreach($data2 as $cand) {
        if ($entry['da_page_class'] == $cand['da_page_class']) {
            $output[$key] = array_merge($output[$key], $cand);
        }
    }
});
print_r($output);

The resulting output is identical to the one above.

arkascha
  • Subarrays may come in random order, so simple merge will not work. Also, there can be "orphans", which contain ['da_page_class'], but have no match in another array. – bbe Jun 30 '17 at 15:53
  • @bbe Ah, so you have additional requirements you did not specify in the question. Anything else you want to add? – arkascha Jun 30 '17 at 15:54
  • @bbe Ok, I added a version implementing a modified merge strategy that allows for arbitrary order of the set elements. – arkascha Jun 30 '17 at 16:00
  • @arkascha I'm not sure if this is still the case in current PHP versions but foreach used to be much faster than array_walk and array_map – masterfloda Jun 30 '17 at 16:07
  • masterfloda - not true; foreach and array_walk demonstrate almost identical results; https://stackoverflow.com/questions/18144782/performance-of-foreach-array-map-with-lambda-and-array-map-with-static-function – bbe Jun 30 '17 at 16:11
  • Thanks, arkascha. Works. I cleaned it up so that it updates the first array instead of using $output. The question is whether, in this situation, there is an approach that is less "expensive" than the O(n^2) solution. – bbe Jun 30 '17 at 16:20
  • Certainly that is possible, you'd have to sort the second set first by that attribute and then combine both sets as shown in the first example. Whether that is more efficient depends on 1. the size of the sets and 2. the efficiency of the sorting algorithm. – arkascha Jun 30 '17 at 16:21
  • @bbe thanks, I was too lazy to look it up :-) In that case array_map is the better choice simply because it's cleaner code. And I made a mistake, the complexity is not O(n^2) but O(n*m) (array 1 length * array 2 length). Thinking about it there is an obvious less complex solution... I'll update my answer. – masterfloda Jun 30 '17 at 16:41
  • arkascha, "sorting" solution "breaks" on "orphans"; besides, sorting itself is a somewhat expensive operation (in comparison to the stated problem). Anyway, works, and optimisation effect will be negligible on small datasets. – bbe Jun 30 '17 at 16:43

O(m*n) solution:

$result = array();

foreach ($array1 as $value1) {
    // do not handle elements without pageclass
    if (!array_key_exists('da_page_class', $value1) || !$value1['da_page_class']) {
        continue;
    }

    foreach ($array2 as $value2) {
        if (!array_key_exists('da_page_class', $value2) || !$value2['da_page_class']) {
            continue;
        }
        if ($value1['da_page_class'] == $value2['da_page_class']) {
            array_push($result, $value1 + $value2);
            break;
        }
    }
}

print_r($result);
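
Note that this snippet combines rows with the array union operator `+`, while the other snippets use `array_merge`. For matched rows the shared key `da_page_class` holds the same value on both sides, so both give the same result here. They only differ when the same string key carries different values, as this small sketch with made-up values shows:

```php
<?php
// Hypothetical rows that deliberately disagree on 'da_page_class'.
$left  = ['da_page_class' => 'Page_Article', 'count' => 1];
$right = ['da_page_class' => 'Page_Article_v2', 'name' => 'Articles'];

// Union operator: for duplicate keys the LEFT operand's value is kept.
$union = $left + $right;

// array_merge: for duplicate string keys the LATER array's value wins.
$merged = array_merge($left, $right);

print_r($union);
print_r($merged);
```

In `$union` the value of `da_page_class` stays `Page_Article`; in `$merged` it becomes `Page_Article_v2`.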

O(m+n) solution:

$result = array();

foreach ($array1 as $value) {
    // do not handle elements without pageclass
    if (!array_key_exists('da_page_class', $value) || !$value['da_page_class']) {
        continue;
    }
    $result[$value['da_page_class']] = $value;
}
foreach ($array2 as $value) {
    if (
        // do not handle elements without pageclass         
        !array_key_exists('da_page_class', $value) || !$value['da_page_class'] ||
        // do not handle elements that do not exist in array 1
        !array_key_exists($value['da_page_class'], $result)
        ) {
        continue;
    }
    // merge values of this pageclass
    $result[$value['da_page_class']] = array_merge($result[$value['da_page_class']], $value);
}

print_r($result);
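
The snippet above keeps rows of the first array that have no partner in the second one. If such "orphans" should be dropped on both sides, as the question asks, one possible variant is to index the second array first and then walk the first. This is only a sketch: the function name `mergeByKey` and the sample rows (including `Page_Orphan` and `Page_Unused`) are made up for illustration, and it assumes PHP 7 or later for the type declarations.

```php
<?php
/**
 * Hypothetical helper (not from the answer): combines rows of two lists
 * that share the same value under $key, dropping rows without a partner.
 */
function mergeByKey(array $array1, array $array2, string $key = 'da_page_class'): array
{
    // Build a lookup table from the second array: key value => row.
    $lookup = [];
    foreach ($array2 as $row) {
        if (!empty($row[$key])) {
            $lookup[$row[$key]] = $row;
        }
    }

    // Keep only rows of the first array that have a partner in the lookup.
    $result = [];
    foreach ($array1 as $row) {
        if (empty($row[$key]) || !isset($lookup[$row[$key]])) {
            continue; // orphan, or row without a usable key: skip it
        }
        $result[] = array_merge($row, $lookup[$row[$key]]);
    }
    return $result;
}

$array1 = [
    ['count' => 2, 'da_page_class' => 'Page_Supplier'],
    ['count' => 1, 'da_page_class' => 'Page_Article'],
    ['count' => 5, 'da_page_class' => 'Page_Orphan'],   // no match in $array2
];
$array2 = [
    ['name' => 'Articles', 'da_page_class' => 'Page_Article'],
    ['name' => 'Suppliers', 'da_page_class' => 'Page_Supplier'],
    ['name' => 'Unused', 'da_page_class' => 'Page_Unused'], // no match in $array1
];

print_r(mergeByKey($array1, $array2));
```

Each array is traversed once, so the work still grows with m+n, and the result is a plain list in the order of the first array rather than an array keyed by page class.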

EDIT: added O(m+n) solution

masterfloda