I am trying to remove a large amount of data from a huge XML file. I am using PowerShell for this, and I was wondering whether it is even possible to do it in an acceptable amount of time. The file contains 2.5 million records, and I want to remove every record whose category attribute is 'COMPANY'. Here is my current code:
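# Load the whole document into memory
$xml = [xml]''
$xml.Load("C:\New folder\untrimmed.xml")
# Find the first matching record, remove it, then search again until none are left
$node = $xml.SelectSingleNode("//record[@category='COMPANY']")
while ($node -ne $null) {
    # [void] discards the removed node that RemoveChild returns
    [void]$node.ParentNode.RemoveChild($node)
    $node = $xml.SelectSingleNode("//record[@category='COMPANY']")
}
$xml.Save("C:\New folder\trimmed.xml")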
This takes about an hour and a half to complete, and the trimmed file is BIGGER than the original. How can I do this in a better way? Is PowerShell not the right tool for the job here?