This builds up nested lists, PowerShell ConvertTo-JSON flattens the outer list.
You can change the $Line in $s
to $line in (Get-Content input.txt)
.
But I think this does it:
$s = @'
appsystem
appsystem.applications
appsystem.applications.APPactivities
appsystem.applications.APPmanager
appsystem.applications.APPmodels
appsystem.applications.MAPmanager
appsystem.applications.MAPmanager.maphub
appsystem.applications.MAPmanager.mapmanager
appsystem.applications.pagealertsmanager
appsystem.authentication
appsystem.authentication.manager
appsystem.authentication.manager.encryptionmanager
appsystem.authentication.manager.sso
appsystem.authentication.manager.tokenmanager
'@ -split "`r`n"
$TreeRoot = New-Object System.Collections.ArrayList
foreach ($Line in $s) {
$CurrentDepth = $TreeRoot
$RemainingChunks = $Line.Split('.')
while ($RemainingChunks)
{
# If there is a dictionary at this depth then use it, otherwise create one.
$Item = $CurrentDepth | Where-Object {$_.name -eq $RemainingChunks[0]}
if (-not $Item)
{
$Item = @{name=$RemainingChunks[0]}
$null = $CurrentDepth.Add($Item)
}
# If there will be child nodes, look for a 'children' node, or create one.
if ($RemainingChunks.Count -gt 1)
{
if (-not $Item.ContainsKey('children'))
{
$Item['children'] = New-Object System.Collections.ArrayList
}
$CurrentDepth = $Item['children']
}
$RemainingChunks = $RemainingChunks[1..$RemainingChunks.Count]
}
}
$TreeRoot | ConvertTo-Json -Depth 1000
Edit: It's too slow? I tried some random pausing profiling and found (not too surprisingly) that it's the inner nested loop, which searches children
arrays for matching child nodes, which is being hit too many times.
This is a redesigned version which still builds the tree, and this time it also builds a TreeMap hashtable of shortcuts into the tree, to all the previously build nodes, so it can jump right too them instead of searching the children
lists for them.
I made a testing file, some 20k random lines. Original code processed it in 108 seconds, this one does it in 1.5 seconds and the output matches.
$TreeRoot = New-Object System.Collections.ArrayList
$TreeMap = @{}
foreach ($line in (Get-Content d:\out.txt)) {
$_ = ".$line" # easier if the lines start with a dot
if ($TreeMap.ContainsKey($_)) # Skip duplicate lines
{
continue
}
# build a subtree from the right. a.b.c.d.e -> e then d->e then c->d->e
# keep going until base 'a.b' reduces to something already in the tree, connect new bit to that.
$LineSubTree = $null
$TreeConnectionPoint = $null
do {
$lastDotPos = $_.LastIndexOf('.')
$leaf = $_.Substring($lastDotPos + 1)
$_ = $_.Substring(0, $lastDotPos)
# push the leaf on top of the growing subtree
$LineSubTree = if ($LineSubTree) {
@{"name"=$leaf; "children"=([System.Collections.ArrayList]@($LineSubTree))}
} else {
@{"name"=$leaf}
}
$TreeMap["$_.$leaf"] = $LineSubTree
} while (!($TreeConnectionPoint = $TreeMap[$_]) -and $_)
# Now we have a branch built to connect in to the existing tree
# but is there somewhere to put it?
if ($TreeConnectionPoint)
{
if ($TreeConnectionPoint.ContainsKey('children'))
{
$null = $TreeConnectionPoint['children'].Add($LineSubTree)
} else {
$TreeConnectionPoint['children'] = [System.Collections.ArrayList]@($LineSubTree)
}
} else
{ # nowhere to put it, this is a new root level connection
$null = $TreeRoot.Add($LineSubTree)
}
}
$TreeRoot | ConvertTo-Json -Depth 100
(@mklement0's code takes 103 seconds and produces a wildly different output - 5.4M characters of JSON instead of 10.1M characters of JSON. [Edit: because my code allows multiple root nodes in a list which my test file has, and their code does not allow that])
Auto-generated PS help links from my codeblock (if available):