I have a four-level multidimensional array. I need to sort in ascending order (ASC) the numeric "leaves" in order to calculate the median of the values.

I tried array_walk_recursive(), array_multisort(), usort(), etc. but was unable to find a working solution.

Here's a schematic of the array:

(
    [2017-05-01] => Array
        (
            [DC] => Array
                (
                    [IT] => Array
                        (
                            [0] => 90
                            [1] => 0
                        )    
                    [DE] => Array
                        (
                            [0] => 18
                            [1] => 315
                            [2] => 40
                            [3] => 
                            [4] => 69
                        )    
                    [Other] => Array
                        (
                            [0] => 107
                            [1] => 46
                            [2] => 
                            [3] => 
                            [4] => 27
                            [5] => 22
                        )    
                )
        )
)
fraktal12
  • Do you need to calculate the median for each array (e.g., the median of the IT array, then the DE array, and so on)? – MinistryOfChaps Jun 19 '17 at 22:19
  • Have you tried `sort($a['2017-05-01']['DC']['IT'])`? Assuming all the arrays you want to sort are at the same depth you can easily craft some nested `foreach` to sort all of them. – axiac Jun 20 '17 at 07:27
  • @MinistryofChaps - Yes, I need the median of each array. – fraktal12 Jun 20 '17 at 07:40

2 Answers

As it turns out, there is a way to do what the OP seeks using a combination of usort() and array_walk(), each of which takes a callback, as follows:

<?php
// median code: 
//http://www.mdj.us/web-development/php-programming/calculating-the-median-average-values-of-an-array-with-php/

function calculate_median($arr) {
    sort($arr);
    $count = count($arr); //total numbers in array
    $middleval = floor(($count-1)/2); // find the middle value, or the lowest middle value
    if($count % 2) { // odd number, middle is the median
        $median = $arr[$middleval];
    } else { // even number, calculate avg of 2 medians
        $low = $arr[$middleval];
        $high = $arr[$middleval+1];
        $median = (($low+$high)/2);
    }
    return $median;
}


$a = [];
// Build the nested structure with string keys only (no stray numeric entries).
$a["2017-05-01"]["DC"]["IT"] = [90,0];
$a["2017-05-01"]["DC"]["DE"] = [18,315,40,"",69];
$a["2017-05-01"]["DC"]["Other"] = [107,46,"","",27,22];


// usort() callback: treat empty strings as 0, then compare numerically.
function sort_by_order ($a, $b)
{
     if ($a === "") $a = 0;
     if ($b === "") $b = 0;
     return $a <=> $b;
}

// array_walk() callback: $key is a date, $item is the array stored under it.
function test($item,$key){
    echo $key," ";
    if (is_array($item)) {
       foreach ($item as $group => $subarrays) {   // e.g. "DC"
          echo $group,"\n";
          foreach ($subarrays as $name => $arr) {  // e.g. "IT", "DE", "Other"
             usort($arr, 'sort_by_order');
             echo "Median ($name): ",calculate_median( $arr ),"\n";
          }
       }
    }
}

array_walk($a, 'test');

See demo here. Also, see this example based on the OP's sandbox.

Although the OP's schematic does not show the array keys as quoted, they must be quoted in the actual code; otherwise PHP will evaluate 2017-05-01 as arithmetic (2017 minus 5 minus 1) and you'll see a key of 2011. Interesting read here about usort.
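
To illustrate the point about quoting, here is a minimal sketch (the variable names are made up for the example). Without quotes, PHP parses the key as the expression 2017 - 05 - 01, where 05 and 01 are octal integer literals:

```php
<?php
// Unquoted: evaluated as arithmetic, 2017 - 5 - 1.
$unquoted = [2017-05-01 => 'x'];

// Quoted: the key is the string '2017-05-01'.
$quoted = ['2017-05-01' => 'x'];

var_dump(array_keys($unquoted)[0]); // int(2011)
var_dump(array_keys($quoted)[0]);   // string(10) "2017-05-01"
```

Note that an unquoted date such as 2017-08-01 would not even parse, because 08 is not a valid octal literal.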

The median code I extracted from here.

Interestingly, sorting the numbers is not the only way to determine the median. Apparently, it can also be done, perhaps more efficiently, by choosing a pivot number and dividing the series of numbers into three parts: values below the pivot, values equal to it, and values above it (see this response).
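
As a rough sketch of that pivot idea (commonly called quickselect), assuming empty strings should count as 0 as in sort_by_order() above: the helper names select_kth() and median_by_selection() are hypothetical and are not taken from the linked response.

```php
<?php
// Hypothetical helper: k-th smallest value (0-based) of a non-empty list.
function select_kth(array $values, int $k)
{
    $pivot = $values[intdiv(count($values), 2)];
    $lower = $equal = $higher = [];
    foreach ($values as $v) {           // split into three parts around the pivot
        if ($v < $pivot) {
            $lower[] = $v;
        } elseif ($v > $pivot) {
            $higher[] = $v;
        } else {
            $equal[] = $v;
        }
    }
    if ($k < count($lower)) {
        return select_kth($lower, $k);  // answer lies among the smaller values
    }
    if ($k < count($lower) + count($equal)) {
        return $pivot;                  // answer is the pivot itself
    }
    return select_kth($higher, $k - count($lower) - count($equal));
}

// Hypothetical helper: median without fully sorting; an empty list yields 0.
function median_by_selection(array $values)
{
    // Assumption: empty strings are treated as 0.
    $values = array_map(function ($v) { return $v === '' ? 0 : $v; }, array_values($values));
    $n = count($values);
    if ($n === 0) {
        return 0;
    }
    $mid = intdiv($n, 2);
    if ($n % 2) {
        return select_kth($values, $mid);
    }
    return (select_kth($values, $mid - 1) + select_kth($values, $mid)) / 2;
}

echo median_by_selection([18, 315, 40, '', 69]), "\n"; // 40
```

For arrays as small as the ones in the question, a plain sort() is likely just as fast; the selection approach only starts to matter for large inputs.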

slevy1

This will output the deepest subarrays' median values using the input array's structure.

I'm including hard-casting of median values (one or both in a subset) as integers in the event that the value(s) are empty strings. I'll also assume that you will want 0 as the output if a subset is empty.

Code: (Demo)

$array=[
    '2017-05-01'=>[
        'DC'=>[
            'IT'=>[90, 0],
            'DE'=>[18, 315, 40, '', 69, 211],
            'Other'=>[107, 46, '', '', 27, 22]
        ]
    ],
    '2017-05-02'=>[
        'DC'=>[
            'IT'=>[70, 40, 55],
            'DE'=>['', 31, 4, '', 9],
            'Other'=>[1107, 12, 0, 20, 1, 11, 21]
        ]
    ],
    'fringe case'=>[
        'DC'=>[
            'IT'=>[],
            'DE'=>['', '', '', 99],
            'Other'=>['', 99]
        ]
    ]
];

$medians = [];                           // results, in the same shape as the input
foreach ($array as $k1 => $lv1) {
    foreach ($lv1 as $k2 => $lv2) {
        foreach ($lv2 as $k3 => $lv3) {
            sort($lv3);                  // order values ASC
            $count = count($lv3);        // count number of values
            $index = intdiv($count, 2);  // get middle index or upper of middle two
            if (!$count) {               // count is zero
                $medians[$k1][$k2][$k3] = 0;
            } elseif ($count & 1) {      // count is odd
                $medians[$k1][$k2][$k3] = (int)$lv3[$index];                        // single median
            } else {                     // count is even
                $medians[$k1][$k2][$k3] = ((int)$lv3[$index-1] + (int)$lv3[$index]) / 2; // dual median
            }
        }
    }
}
var_export($medians);

Output:

 array (
  '2017-05-01' => 
  array (
    'DC' => 
    array (
      'IT' => 45,
      'DE' => 54.5,
      'Other' => 24.5,
    ),
  ),
  '2017-05-02' => 
  array (
    'DC' => 
    array (
      'IT' => 55,
      'DE' => 4,
      'Other' => 12,
    ),
  ),
  'fringe case' => 
  array (
    'DC' => 
    array (
      'IT' => 0,
      'DE' => 0,
      'Other' => 49.5,
    ),
  ),
)

*For the record, $count & 1 is a bitwise AND that tests the lowest bit of the count, so it is truthy exactly when the count is odd. It gives the same result as $count % 2 here and is a cheap way of performing this check in PHP.

*Also, if you simply want to overwrite the values of the input array, you can modify by reference: write & before $lv1, $lv2, and $lv3 in the foreach declarations, then save the median value to $lv3. Demo. Doing so removes the key declarations and makes the code briefer.
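
A minimal sketch of that by-reference variant, using a shortened copy of the sample data. The trailing unset() is an addition here; it breaks the references once the loops are done:

```php
<?php
$array = [
    '2017-05-01' => [
        'DC' => [
            'IT'    => [90, 0],
            'DE'    => [18, 315, 40, '', 69, 211],
            'Other' => [107, 46, '', '', 27, 22],
        ],
    ],
];

foreach ($array as &$lv1) {
    foreach ($lv1 as &$lv2) {
        foreach ($lv2 as &$lv3) {
            sort($lv3);                  // order values ASC
            $count = count($lv3);
            $index = intdiv($count, 2);  // middle index, or upper of middle two
            if (!$count) {
                $lv3 = 0;                // empty subset
            } elseif ($count & 1) {
                $lv3 = (int)$lv3[$index];
            } else {
                $lv3 = ((int)$lv3[$index - 1] + (int)$lv3[$index]) / 2;
            }
        }
    }
}
unset($lv1, $lv2, $lv3); // break the references so later writes cannot touch $array

var_export($array);
```

Each deepest subarray is replaced by its median, so $array['2017-05-01']['DC'] ends up holding 45, 54.5, and 24.5.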

mickmackusa
  • @fraktal12 please provide specific feedback if this is not what you want. – mickmackusa Jun 20 '17 at 01:52
  • Hi Mick, thanks for your answer. I started with your code and adapted it a little: in any case, the array must be sorted. Then, for an odd count we take the middle value, and for an even count we take the average of the two middle values. Of course, I assume the depth is constant (foreach in foreach in foreach...), so it would be useful to get to the point where we can use it for any depth. – fraktal12 Jun 20 '17 at 08:37
  • Demo here: http://sandbox.onlinephpfunctions.com/code/03287d363e3bc2cc3c25fd7cb01e2ac076eb29ec – fraktal12 Jun 20 '17 at 08:47
  • @fraktal12 I want to have a closer look at your code and perhaps make some refinements. I am spending time with my family right now, I'll revisit as soon as I can. (asort is not necessary because you don't need to preserve the keys, right?) – mickmackusa Jun 20 '17 at 09:18
  • Mick, you solved my problem :), so really no rush. And you are right with asort - habit. – fraktal12 Jun 20 '17 at 10:08
  • @fraktal12 My family is off to bed now. I've just updated my answer -- I don't think I can make it any more lean than it is. Sadly it looks a lot like slevy1's answer, but that is unavoidable because it is the most optimized way I can think of. Do you have any other considerations to pack into this question? How will you use this? Will you have varying array structures? Will you only want to target one deep subarray at a time? Are you just going to display everything to screen? – mickmackusa Jun 20 '17 at 12:54
  • Hi Mick, sorry for the delayed response. I have a variable output, luckily not in terms of depth, so the code you worked on will do the trick. I will use all dimensions in a chart, as categories or series, with the median values displayed. – fraktal12 Jun 22 '17 at 07:49
  • slevy's answer modifies your input data and does an unnecessary sort while array_walking – mickmackusa Jul 12 '17 at 00:07
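
On the point raised in the comments about supporting any depth, one possible sketch is a recursive walk. It assumes that every level holds either only arrays or only scalar values, and that empty strings count as 0; leaf_median() and medians_any_depth() are hypothetical names, not part of either answer.

```php
<?php
// Hypothetical helper: median of a flat list; an empty list yields 0.
function leaf_median(array $values)
{
    // Assumption: empty strings are treated as 0.
    $values = array_map(function ($v) { return $v === '' ? 0 : $v; }, $values);
    sort($values);
    $count = count($values);
    if ($count === 0) {
        return 0;
    }
    $index = intdiv($count, 2);
    return ($count & 1)
        ? $values[$index]
        : ($values[$index - 1] + $values[$index]) / 2;
}

// Hypothetical helper: recurse until a list without array children is found.
function medians_any_depth(array $node)
{
    foreach ($node as $child) {
        if (is_array($child)) {
            // Not a leaf list yet: process each child, keeping the keys.
            return array_map('medians_any_depth', $node);
        }
    }
    return leaf_median($node);
}

var_export(medians_any_depth([
    '2017-05-01' => ['DC' => ['IT' => [90, 0], 'DE' => [18, 315, 40, '', 69]]],
    'shallow'    => ['', 99],
]));
```

The result keeps the input's keys and nesting, with each deepest subarray replaced by its median.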