6

SO,

The problem

Suppose we have flat array with following structure:

$array = [
  ['level'=>1, 'name' => 'Root #1'],
  ['level'=>1, 'name' => 'Root #2'],
  ['level'=>2, 'name' => 'subroot 2-1'],
  ['level'=>3, 'name' => '__subroot 2-1/1'],
  ['level'=>2, 'name' => 'subroot 2-2'],
  ['level'=>1, 'name' => 'Root #3']
];

The issue is - transform that array so it will became a tree. Subordination is determined only with elements order and level field. Let's define children as a name of dimension for storing child nodes. For array above that will be:

  array (
    array (
      'level' => 1,
      'name' => 'Root #1',
    ),
    array (
      'level' => 1,
      'name' => 'Root #2',
      'children' => 
      array (
        array (
          'level' => 2,
          'name' => 'subroot 2-1',
          'children' => 
          array (
            array (
              'level' => 3,
              'name' => '__subroot 2-1/1',
            ),
          ),
        ),
        array (
          'level' => 2,
          'name' => 'subroot 2-2',
        ),
      ),
    ),
    array (
      'level' => 1,
      'name' => 'Root #3',
    ),
  )

A little more clarifications, if it's not obvious who is parent for who: following code could easily visualize idea:

function visualize($array)
{
   foreach($array as $item)
   {
      echo(str_repeat('-', $item['level']).'['.$item['name'].']'.PHP_EOL);
   }
}
visualize($array);

-for array above it's:

-[Root #1]
-[Root #2]
--[subroot 2-1]
---[__subroot 2-1/1]
--[subroot 2-2]
-[Root #3]

Specifics

There are some restrictions both for desired solution and input array:

  • Input array is always valid: that means it's structure can always be refactored to tree structure. No such weird things as negative/non-numeric levels, no invalid levels structure, e t.c.
  • Input array can be huge and, currently, maximum level is not restricted
  • Solution must resolve a matter with single loop, so we can not split array to chunks, apply recursion or jump within array somehow. Just simple foreach (or another loop - it does not matter), only once, each element one-by-one should be handled.

My approach

Currently, I have solution with stack. I'm working with references and maintaining current element of stack to which writing will be done at current step. That is:

function getTree(&$array)
{
   $level = 0;
   $tree  = [];
   $stack = [&$tree];
   foreach($array as $item)
   {
      if($item['level']>$level) //expand stack for new items
      {
         //if there are child elements, add last to stack:
         $top = key($stack);
         if(count($stack[$top]))
         {
            end($stack[$top]);
            $stack[] = &$stack[$top][key($stack[$top])];
         }
         //add ['children'] dim to top stack element
         end($stack);
         $top = key($stack);
         $stack[$top]['children'] = [];
         $stack[] = &$stack[$top]['children'];
      }
      while($item['level']<$level--) //pop till certain level
      {
         //two times: one for last pointer, one for ['children'] dim
         array_pop($stack);
         array_pop($stack);
      }
      //add item to stack top:
      end($stack);
      $stack[key($stack)][] = $item;
      $level = $item['level'];
   }
   return $tree;
}

-since it's long enough, I've created a sample of usage & output.

The question

As you can see, my solution is quite long and it relies on references & array internal pointer handling (such things as end()), so the question is:

May be there are other - shorter and clearer ways to resolve this issue? It looks like some standard question, but I've not found any corresponding solution (there is one similar question - but there OP has exact parent_id subordination while I have not)

Community
  • 1
  • 1
Alma Do
  • 37,009
  • 9
  • 76
  • 105

2 Answers2

3

The good thing about your problem is that your input is always formatted properly so your actual problem is narrowed down to finding children for each node if they exist or finding parent for each node if it has one. The latter one is more suitable here, because we know that node has parent if its level is more than one and it is the nearest node above it in initial flat array with level that equals level of current node minus one. According to this we can just keep track on few nodes that we are interested in. To be more exact whenever we find two nodes with the same level, the node that was found earlier can't have more children.

Implementation of this will look like this:

function buildTree(array &$nodes) {
    $activeNodes = [];

    foreach ($nodes as $index => &$node) {
        //remove if you don't want empty ['children'] dim for nodes without childs
        $node['children'] = [];
        $level = $node['level'];
        $activeNodes[$level] = &$node;

        if ($level > 1) {
            $activeNodes[$level - 1]['children'][] = &$node;
            unset($nodes[$index]);
        }
    }
}

Demo

Leri
  • 12,367
  • 7
  • 43
  • 60
1

The implementation with using recursion:

 function buildTreeHelper(&$array, $currentLevel = 1)
 {
     $result = array();
     $lastIndex = 0;
     while($pair = each($array)) {
         list(, $row) = $pair;
         $level = $row['level'];
         if ($level > $currentLevel) {
             $result[$lastIndex]['children'] = buildTreeHelper($array, $level);
         } else if ($level == $currentLevel) {
             $result[++$lastIndex] = $row;
         } else {
             prev($array); // shift back
             break;
         }
     }
     return $result;
 }

 function buildTree($array)
 {
     reset($array);
     return buildTreeHelper($array);
 }

 $array = [
  ['level'=>1, 'name' => 'Root #1'],
  ['level'=>1, 'name' => 'Root #2'],
  ['level'=>2, 'name' => 'subroot 2-1'],
  ['level'=>3, 'name' => '__subroot 2-1/1'],
  ['level'=>2, 'name' => 'subroot 2-2'],
  ['level'=>1, 'name' => 'Root #3']
];

 print_r(buildTree($array));
Alexander
  • 807
  • 5
  • 10
  • Thanks you. Even if it does not fit my requirements - it still is interesting. However, I've done some measures after cleaning code in my function - and they both have ~same time for `1E6` iterations count and array, which is specified in question (`70 sec` total my version vs `74.5 sec` yours). I've also tested on easy array (all levels are `1`) and got similar results (`34 sec` my vs `39 sec` yours). It's quite as expected cause both function have same `big-O` estimation – Alma Do Nov 12 '13 at 12:53