-1

the problem is a derivation from the following questions :

https://www.mrexcel.com/board/threads/make-a-formula-to-design-a-hierarchy-from-a-parent-child-relation.1133448/
http://www.vbaexpress.com/forum/showthread.php?67327-VBA-to-make-formula-for-parent-child
...

https://www.mrexcel.com/board/threads/build-parent-child-hierarchy.966430/
generating a tree like structure using parent-child ID's with VBA
...

Let's take the first question's example and work around that :

Parent Child
Pizza Margerita
Sandwich HotDog
Root Pizza
Margerita Cheese
Pizza Sea
Sea Tuna
Burger Bun
HotDog Sausage
Sea Shrimp
Root Sandwich
Sandwich Burger
Burger Steack
Burger Tomato
Burger Cucumber
HotDog Bun
Root Offer
Offer Pizza
Offer Sandwich
Margerita Tomato
Margerita Oliva
Sea Tomato
Sea Cheese

The original question was to turn this parent/child table into a proper hierarchy :

Level 1 Level 2 Level 3 Level 4
Root Pizza Margerita Tomato
Oliva
Cheese
Sea Tuna
Shrimp
Tomato
Cheese
Sandwich Burger Steack
Bun
Tomato
Cucumber
HotDog Sausage
Bun
Offer Pizza
Sandwich

Notice the 'Offer' entry and the two 'Pizza' and 'Sandwich' both related to 'Root' and 'Offer'.

However I'd like to turn each line into a flat hierarchy (the "longer" the better) :

Parent Child Level 1 Level 2 Level 3 Level 4 Level 5
Pizza Margerita Root Offer Pizza Margerita
Sandwich HotDog Root Offer Sandwich HotDog
Root Pizza Root Pizza
Margerita Cheese Root Offer Pizza Margerita Cheese
Pizza Sea Root Offer Pizza Sea
Sea Tuna Root Offer Pizza Sea Tuna
Burger Bun Root Offer Sandwich Burger Bun
HotDog Sausage Root Offer Sandwich HotDog Sausage
Sea Shrimp Root Offer Pizza Sea Shrimp
Root Sandwich Root Sandwich
Sandwich Burger Root Offer Sandwich Burger
Burger Steack Root Offer Sandwich Burger Steack
Burger Tomato Root Offer Sandwich Burger Tomato
Burger Cucumber Root Offer Sandwich Burger Cucumber
HotDog Bun Root Offer Sandwich HotDog Bun
Root Offer Root Offer
Offer Pizza Root Offer Pizza
Offer Sandwich Root Offer Sandwich
Margerita Tomato Root Offer Pizza Margerita Tomato
Margerita Oliva Root Offer Pizza Margerita Oliva
Sea Tomato Root Offer Pizza Sea Tomato
Sea Cheese Root Offer Pizza Sea Cheese

Please note that the original parent/child table has not to be modified/sorted. Each "child" path is to be flattened/expanded into a list of "folders".

The purpose is then to feed this flattened hierarchy into a Power Pivot Table.

I've made some trial and error based on the code extracted from the related questions, but the result is not satisfying :

Parent Child Level 1 Level 2 Level 3 Level 4 Level 5
Pizza Margerita Root Pizza Margerita 'Offer' ?
Sandwich HotDog Root Sandwich HotDog 'Offer' ?
Root Pizza Root Pizza OK
Margerita Cheese Root Pizza Margerita Cheese 'Offer' ?
Pizza Sea Root Pizza Sea 'Offer' ?
Sea Tuna Root Pizza Sea Tuna 'Offer' ?
Burger Bun Root Sandwich Burger Bun 'Offer' ?
HotDog Sausage Root Sandwich HotDog Sausage 'Offer' ?
Sea Shrimp Root Pizza Sea Shrimp 'Offer' ?
Root Sandwich Root Sandwich OK
Sandwich Burger Root Sandwich Burger 'Offer' ?
Burger Steack Root Sandwich Burger Steack 'Offer' ?
Burger Tomato Root Sandwich Burger Tomato 'Offer' ?
Burger Cucumber Root Sandwich Burger Cucumber 'Offer' ?
HotDog Bun Root Sandwich HotDog Bun 'Offer' ?
Root Offer Root Offer OK
Offer Pizza Root Offer Pizza OK
Offer Sandwich Root Offer Sandwich OK
Margerita Tomato Root Pizza Margerita Tomato 'Offer' ?
Margerita Oliva Root Pizza Margerita Oliva 'Offer' ?
Sea Tomato Root Pizza Sea Tomato 'Offer' ?
Sea Cheese Root Pizza Sea Cheese 'Offer' ?

And using the example above, I'm stuck on the 'Pizza' and 'Sandwich' related to either 'Root' or 'Offer', despite the later being the obvious answer.

As a matter of fact, the "depth" of each line's hierarchy should be known to reserve enough "levels".

And now the code

' Option Explicit

' https://www.appsloveworld.com/vba/100/7/filtering-2d-arrays-in-excel-vba
' https://stackoverflow.com/a/43047962/3069988
Function Filter2DArray(ByVal i_aSrc, ByVal i_nCol As Long, ByVal i_sFind As String, ByVal i_bTitle As Boolean)
    Dim l_aOut As Variant
    Dim l_aKeys As Variant
    Dim l_oDic As Variant
    'Dim l_oDic As Scripting.Dictionary
    Dim l_nRow As Long
    Dim l_ncol As Long
    Dim l_bChk As Boolean
    Dim l_nVal As Double
    
    On Error Resume Next
    
    ' https://excelmacromastery.com/vba-dictionary/
    Set l_oDic = CreateObject("Scripting.Dictionary")
    l_oDic.CompareMode = vbBinaryCompare    ' case sensitive
    'l_oDic.CompareMode = vbTextCompare     ' case insensitive
    
    i_nCol = i_nCol + LBound(i_aSrc, 2) - 1
    l_bChk = (InStr("><=", Left(i_sFind, 1)) > 0)
    
    ' Scan all rows, add row number when string found
    For l_nRow = LBound(i_aSrc, 1) - i_bTitle To UBound(i_aSrc, 1)
        If l_bChk Then
            l_nVal = CDbl(i_aSrc(l_nRow, i_nCol))
            If Evaluate(l_nVal & i_sFind) Then
                l_oDic.Add l_nRow, ""
            End If
        Else
            'If UCase(i_aSrc(l_nRow, i_nCol)) Like UCase("*" & i_sFind & "*") Then
            If UCase(i_aSrc(l_nRow, i_nCol)) Like UCase(i_sFind) Then
                l_oDic.Add l_nRow, ""
            End If
        End If
    Next
    
    ' If row numbers present, string found
    If l_oDic.Count > 0 Then
        ' Get row numbers
        l_aKeys = l_oDic.Keys
        ' Create array of found rows and all columns
        ReDim l_aOut(LBound(i_aSrc, 1) To UBound(l_aKeys) + LBound(i_aSrc, 1) - i_bTitle, LBound(i_aSrc, 2) To UBound(i_aSrc, 2))
        ' All found rows
        For l_nRow = LBound(i_aSrc, 1) - i_bTitle To UBound(l_aKeys) + LBound(i_aSrc, 1) - i_bTitle
            ' All source columns
            For l_ncol = LBound(i_aSrc, 2) To UBound(i_aSrc, 2)
                ' Copy row
                l_aOut(l_nRow, l_ncol) = i_aSrc(l_aKeys(l_nRow - LBound(i_aSrc, 1) + i_bTitle), l_ncol)
            Next
        Next
        If i_bTitle Then
            ' All source columns
            For l_ncol = LBound(i_aSrc, 2) To UBound(i_aSrc, 2)
                ' Copy title row
                l_aOut(LBound(i_aSrc, 1), l_ncol) = i_aSrc(LBound(i_aSrc, 1), l_ncol)
            Next
        End If
    End If
    
    ' Return result
    Filter2DArray = l_aOut
End Function

Sub FlattenParentChild()
    Dim l_oHiera As Variant
    Dim l_aHiera As Variant
    Dim l_aChild As Variant
    Dim l_aCurr() As Variant
    Dim l_aTemp() As Variant
    Dim l_iRow, l_iCol As Long
    Dim l_nRow, l_nCol As Long
    Dim l_nMin, l_nMax As Long
    Dim l_nOut As Long
    Dim l_sTemp As String
    
    ' https://excelmacromastery.com/vba-arraylist/
    Set l_aChild = CreateObject("System.Collections.ArrayList")
    
    ' Extract stacked Parent/Child table (position specific)
    Set l_oHiera = ActiveSheet.Range("A1:B" & [A65000].End(xlUp).Row)
    
    ' Get position
    l_nRow = l_oHiera.Row - 1
    l_nCol = l_oHiera.Column - 1
    
    ' Show range
    l_oHiera.Select
    
    ' Get Parent/Child pairs
    l_aHiera = l_oHiera.Value
    
    ' Out at column
    l_nOut = l_nCol + 4
        
    ' Get list of Childs
    l_aChild = Application.Transpose(Application.Index(l_aHiera, 0, 2))
    ' Application.Index(array, 0, col)
    ' Application.Index(array, row, 0)
    
    ' Stop updating screen (may increase speed by alot)
    'Application.ScreenUpdating = False
    
    ' Scan all parent/child pairs
    For l_iRow = LBound(l_aHiera) To UBound(l_aHiera)
        ' Copy current Parent/Child
        l_aTemp = Application.Index(l_aHiera, l_iRow, 0)
        
        ' Step depth, max 5
        For l_nMin = 1 To 5
            ' Place current Parent/Child (position specific)
            ActiveSheet.Range( _
                Cells(l_nRow + l_iRow, l_nOut), _
                Cells(l_nRow + l_iRow, l_nOut + 1)) = l_aTemp
            
            ' Get Parent
            l_sTemp = l_aTemp(1)
            
            ' Check that there is a Child
            If Not IsError(Application.Match(l_sTemp, l_aChild, 0)) Then
                ' Get rows where Parent is Child
                l_aCurr = Filter2DArray(l_aHiera, 2, l_sTemp, False)
                            
                l_nMax = UBound(l_aCurr)
                If 0 < l_nMax Then
                    ' Move flattened hierarchy to the right
                    ActiveSheet.Range( _
                        Cells(l_nRow + l_iRow, l_nOut + 1), _
                        Cells(l_nRow + l_iRow, l_nOut + 1 + l_nMin)) = _
                        ActiveSheet.Range( _
                            Cells(l_nRow + l_iRow, l_nOut), _
                            Cells(l_nRow + l_iRow, l_nOut + l_nMin)).Value
                    
                    While 0 < l_nMax
                        ' Extract row
                        l_aTemp = Application.Index(l_aCurr, l_nMax, 0)
                    
                        ' Place new Parent/Child (position specific)
                        ActiveSheet.Range( _
                            Cells(l_nRow + l_iRow, l_nOut), _
                            Cells(l_nRow + l_iRow, l_nOut + 1)) = l_aTemp
                            
                        ' RECURSIVE LOOP ???
                        
                        ' Get new Parent
                        l_sTemp = l_aTemp(1)
                        
                        ' Check that there is no Child (hence Root)
                        If IsError(Application.Match(l_sTemp, l_aChild, 0)) Then
                            Exit For
                        End If
                        
                        ' Next loop
                        l_nMax = l_nMax - 1
                    Wend
                End If
            Else
                Exit For
            End If
        Next
    Next
    
    Application.ScreenUpdating = True
End Sub

Regards.

Kochise
  • 504
  • 5
  • 10
  • 2
    If you show us your code, you'll get much better help. – Sam Apr 17 '23 at 12:36
  • " despite the later being the obvious answer" - is that really "obvious" though? – Tim Williams Apr 17 '23 at 15:52
  • @Sam Done, the code currently loop on all possible children and overwrite a better option ('Offer' -> 'Root') which produce strange outputs. – Kochise Apr 17 '23 at 16:04
  • I'd suggest your dataset represents multiple different trees, so using code it's difficult to see *why* for example this **Root > Offer > Pizza > Sea** is not just as valid as **Root > Pizza > Sea** – Tim Williams Apr 17 '23 at 21:07
  • @TimWilliams That's the problem, I'm using the original data as example, but the question remains valid. While the **Root > Pizza > Sea** path is valid, the **Root > Offer > Pizza > Sea** path is even more when flattening the hierarchy. Hence the big comment in my code : ' RECURSIVE LOOP ??? – Kochise Apr 18 '23 at 05:43
  • Could fill a dictionary with key = "Parent|Child", value = "Parent" (leave Root|Offer out of it), then use a loop with a function (or a recursive function but I have no experience with that yet) wherein you build an array with the outcome, if it's in the dictionary, it has a parent above and the loop continues. You'll have to adjust the array as you need it (i.e. put the Parent and Child in 1st and 2nd place and last and second last place in the array) – Notus_Panda Apr 19 '23 at 08:20
  • @Notus_Panda Yeah, perhaps, that's the hint I placed in my code, but I'm unsure about it since I don't want to make it too complicated, code have to be fast, best option would be to cache the dictionary for subsequent calls/rows, yet the problem lies in the way you'll start the lookup (which level you are) and get back to the root while choosing the longest, more elaborate, path. Currently it's based on the last choice, here **Offer > PIzza** is overwritten by **Root > Pizza** and the loop stops there. Should I explore all possible paths ? What added complexity ? – Kochise Apr 19 '23 at 09:08
  • If you remove Root from the equation, it shouldn't occur. Two dictionaries, one with all the links except for the ones with root `If Not fParent = "Root" Then...` (child = key, parent = value) and one with the links-combined (so you know where to start from if it's an actual ingredient, i.e. key = parent|child, value = parent). If you reached the end (can't find it first in the combined one nor in the total one), you add root since everything has root as the end-parent. – Notus_Panda Apr 19 '23 at 09:49
  • @Notus_Panda I think I'm getting the hint, could you elaborate a bit more with a light pseudo code ? – Kochise Apr 19 '23 at 09:54

1 Answers1

0

Since I don't like half-answering and wanted to give recursive function a try, here's my solution. It's not fully as I'd want it since Root being the first in line is hardcoded but that shouldn't be too hard to fix if need be

Recursive function:

Option Explicit

Public dictA As Dictionary
Public dictC As Dictionary

Sub parentChild()
    Dim sChild As String, sParent As String, sComb As String
    Dim lRow As Long, amt As Long, i As Long, j As Long, arr(), arrPart, arrAll, maxLvl As Long
    
    lRow = ThisWorkbook.Sheets(1).Range("A" & Rows.Count).End(xlUp).Row
    Set dictA = New Dictionary
    Set dictC = New Dictionary
    
    arr = ThisWorkbook.Sheets(1).Range("A2:B" & lRow).Value
    amt = UBound(arr, 1)
    For i = 1 To amt
        If Not arr(i, 1) = "Root" Then
            If Not dictA.Exists(arr(i, 2)) Then
                dictA.Add arr(i, 2), arr(i, 1) 'doesn't matter that we skip some of the last level ones
            End If
        End If
    Next i
    
    For i = 1 To amt 'filling combination dictionary for the ingredients
        sComb = arr(i, 1) & "|" & arr(i, 2)
        If Not dictC.Exists(sComb) Then
            dictC.Add sComb, arr(i, 1)
        End If
    Next i
    maxLvl = 2
    ReDim arrAll(1 To amt, 1 To amt - 1) 'absolute max, we'll reduce this later
    For i = 1 To UBound(arr, 1)
        sParent = arr(i, 1)
        sChild = arr(i, 2)
        arrPart = longChain(sParent, sChild)
        If UBound(arrPart) > maxLvl Then maxLvl = UBound(arrPart)
        For j = 1 To UBound(arrPart)
            arrAll(i, j) = arrPart(j)
        Next j
    Next i
    ReDim Preserve arrAll(1 To amt, 1 To maxLvl) 'as promised, cutting it down to its needed size
    Dim ws As Worksheet
    Set ws = ThisWorkbook.Sheets(2)
    With ws
        .Range("A1").Value = "Parent"
        .Range("B1").Value = "Child"
        For i = 1 To maxLvl - 2 'adjust to the starting parent/child
            .Cells(1, i + 2) = "Level " & i
        Next i
        .Range("A2").Resize(UBound(arrAll, 1), UBound(arrAll, 2)).Value = arrAll
    End With
    
    Set dictA = Nothing
    Set dictC = Nothing
End Sub

Function longChain(sParent As String, sChild As String, Optional arr As Variant) As Variant
    Dim comb As String, i As Long
    Dim arrRet()
    comb = sParent & "|" & sChild
    If Not dictC.Exists(comb) Then 'we're dealing with something else than the ingredients
        ReDim arrRet(1 To UBound(arr) + 1)
        For i = 1 To UBound(arrRet)
            If i < 3 Then
                arrRet(i) = arr(i)
            ElseIf i = 3 Then
                If dictA.Exists(sParent) Then 'should definitely exist if we got this far unless it's Root
                    arrRet(i) = dictA(sParent)
                Else
                    arrRet(i) = "Root"
                End If
            Else
                arrRet(i) = arr(i - 1)
            End If
        Next i
        If dictA.Exists(sParent) Then
            longChain = longChain(dictA(sParent), dictA(sParent), arrRet) 'passing the new sParent as sChild causes it to skip the combination dictionary
        Else 'end of the line
            longChain = arrRet
        End If
    Else 'ingredient level
        ReDim arrRet(1 To 4)
        arrRet(1) = sParent
        arrRet(2) = sChild
        arrRet(3) = sParent
        arrRet(4) = sChild
        If sParent = "Root" Then
            longChain = arrRet 'end of the line -- Root Offer, Root Pizza
        Else
            longChain = longChain(sParent, sParent, arrRet) 'same dealio here, we don't want to redo the beginning
        End If
    End If
End Function

I hope the comments are enough to help you out if you don't understand something and otherwise, feel free to ask away :)
If people have suggestions/comments about the code, feel free to share as well, I'm here to learn.

Notus_Panda
  • 1,402
  • 1
  • 3
  • 12
  • Hi, thank you for your solution, I'll give it a try. Just notice that "Root" isn't hard coded, it's just that it is currently the common parent of all. It could have been named whatever you want. You may even have two "roots" if you get to a parent that is not a children to any other parent. That what makes the exercise so interesting though. – Kochise Apr 20 '23 at 08:48
  • You'd have to make a preliminary check then to see which are the "ultimate" stops and then do a check each time if you've reached that (like I did with `If sParent = "Root"` but then with an "isInArray" function-check. I mentioned "Root" being hardcoded because it is in my solution, not that it should be :p – Notus_Panda Apr 20 '23 at 08:53