The script below produces the desired output, but it takes a long time to process large XML files (2 GB and above). I'm looking for suggestions on how to make it faster, either with multithreading or with some other technique in PowerShell.
Reference post, for more on the logic of the script below: Parse XML to extract data with grouping in PowerShell
# Create XML object to load data into
$xml = New-Object -TypeName System.Xml.XmlDocument

# Load in XML file
$xml.Load("test.xml")

# Group XML child nodes by Priority
$groups = $xml.'ABC-FOF-PROCESS'.ChildNodes | Group-Object -Property PRIORITY

# Iterate groups and create PSCustomObject for each grouping
& {
    foreach ($group in $groups)
    {
        [PSCustomObject]@{
            PRIORITY = [int]$group.Name
            KEY      = ($group.Group.KEY | Select-Object -Unique).Count
            HITS     = $group.Count
        }
    }
} | Sort-Object -Property PRIORITY -Descending | Out-File -FilePath output.txt
# Pipe output here
Output:
PRIORITY KEY HITS
-------- --- ----
       1   1    1
      -3   2    2
     -14   2    3
xml:
<ABC-FOF-PROCESS>
    <H>
        <PRIORITY>-14</PRIORITY>
        <KEY>F637A146-3437AB82-BA659D4A-17AC7FBF</KEY>
    </H>
    <H>
        <PRIORITY>-14</PRIORITY>
        <KEY>F637A146-3437AB82-BA659D4A-17AC7FBF</KEY>
    </H>
    <H>
        <PRIORITY>-3</PRIORITY>
        <KEY>D6306210-CF424F11-8E2D3496-E6CE1CA7</KEY>
    </H>
    <H>
        <PRIORITY>1</PRIORITY>
        <KEY>D6306210-CF424F11-8E2D3496-E6CE1CA7</KEY>
    </H>
    <H>
        <PRIORITY>-3</PRIORITY>
        <KEY>4EFR02B4-ADFDAF12-3C123II2-ADAFADFD</KEY>
    </H>
    <H>
        <PRIORITY>-14</PRIORITY>
        <KEY>5D2702B2-ECE8F1FB-3CEC3229-5FE4C4BC</KEY>
    </H>
</ABC-FOF-PROCESS>
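
For context on where the time may be going: the script above loads the entire file into an XmlDocument and then pipes every node through Group-Object, so a 2 GB file has to fit in memory as a DOM before any counting starts. One direction that might help, as a rough sketch rather than a measured fix, is to stream the file with System.Xml.XmlReader and keep only per-priority counters. The function name Get-PriorityStats and the KeySet/Hits field names are hypothetical, and the sketch assumes every record is an <H> element with plain-text <PRIORITY> and <KEY> children, as in the sample. It also keys groups by the integer value of PRIORITY, whereas Group-Object groups by the raw text.

```powershell
function Get-PriorityStats {
    # Hypothetical helper: streams the XML file and emits one object per PRIORITY.
    param([Parameter(Mandatory)][string]$Path)

    # XmlReader resolves relative paths against the process directory, not $PWD,
    # so pass it a fully resolved path.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath

    # PRIORITY (int) -> object holding the set of distinct keys and the hit count
    $stats = @{}

    $reader = [System.Xml.XmlReader]::Create($fullPath)
    try {
        $current = $null; $priority = $null; $key = $null
        while ($reader.Read()) {
            if ($reader.NodeType -eq [System.Xml.XmlNodeType]::Element) {
                # Remember which element the next text node belongs to
                $current = $reader.Name
            }
            elseif ($reader.NodeType -eq [System.Xml.XmlNodeType]::Text) {
                if ($current -eq 'PRIORITY') { $priority = [int]$reader.Value }
                elseif ($current -eq 'KEY') { $key = $reader.Value }
            }
            elseif ($reader.NodeType -eq [System.Xml.XmlNodeType]::EndElement) {
                # A closing </H> marks one complete record
                if ($reader.Name -eq 'H' -and $null -ne $priority) {
                    if (-not $stats.ContainsKey($priority)) {
                        $stats[$priority] = [PSCustomObject]@{
                            KeySet = New-Object 'System.Collections.Generic.HashSet[string]'
                            Hits   = 0
                        }
                    }
                    $null = $stats[$priority].KeySet.Add($key)
                    $stats[$priority].Hits++
                    $priority = $null; $key = $null
                }
                $current = $null
            }
        }
    }
    finally {
        $reader.Dispose()
    }

    # Emit the same shape as the original script's output objects
    foreach ($entry in $stats.GetEnumerator()) {
        [PSCustomObject]@{
            PRIORITY = $entry.Key
            KEY      = $entry.Value.KeySet.Count
            HITS     = $entry.Value.Hits
        }
    }
}

# Possible usage, mirroring the original pipeline:
# Get-PriorityStats -Path test.xml |
#     Sort-Object -Property PRIORITY -Descending |
#     Out-File -FilePath output.txt
```

Memory use in this sketch grows with the number of distinct priorities and keys rather than with the file size. Whether it is actually faster on a 2 GB file would need to be checked with Measure-Command; it is single-threaded and does not attempt any parallelism.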