For very large text files we have the option of using StreamReader and StreamWriter, which allows doing find/replace on a line-by-line basis. However, I have an XML file where I need to do find/replace with a little more control, for example on a value in a particular node that is a child of another node with a particular attribute and value. That is rather complex to parse line by line, and super easy to deal with when using an XML document. However, my file is pushing 500 MB and 12 million lines, and just loading it takes an excessively long time. Is there a .NET equivalent of StreamReader/StreamWriter for XML? Or am I limited to native PowerShell here, with the associated performance hit?
1 Answer
You might want to look at "What is the difference between SAX and DOM?" for information on alternative ways of parsing XML.
SAX might be a good method for you.
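Neither PowerShell nor .NET itself has a native SAX parser, but the XmlReader class might work for you: it is a forward-only, read-only reader that streams through the document instead of loading the whole thing into memory.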
From the looks of the examples on the MSDN Docs, it doesn't seem to do anything too crazy or use features that are tedious/difficult in PowerShell.
Here's their example C#:
// Create a validating XmlReader object. The schema
// provides the necessary type information.
XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.Schemas.Add("urn:empl-hire", "hireDate.xsd");
using (XmlReader reader = XmlReader.Create("hireDate.xml", settings)) {
    // Move to the hire-date element.
    reader.MoveToContent();
    reader.ReadToDescendant("hire-date");
    // Return the hire-date as a DateTime object.
    DateTime hireDate = reader.ReadElementContentAsDateTime();
    Console.WriteLine("Six Month Review Date: {0}", hireDate.AddMonths(6));
}
Here's a PowerShell port that I didn't bother to test at all (sorry):
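# Create a validating XmlReader object. The schema
# provides the necessary type information.
$settings = New-Object System.Xml.XmlReaderSettings
$settings.ValidationType = [System.Xml.ValidationType]::Schema
# Add() returns the XmlSchema; discard it so it doesn't leak into the output.
$null = $settings.Schemas.Add("urn:empl-hire", "hireDate.xsd")
# see their page for example XML/XSD
# Note: .NET resolves relative paths against the process working directory,
# which can differ from PowerShell's current location; pass full paths if in doubt.
$reader = $null
try {
    $reader = [System.Xml.XmlReader]::Create("hireDate.xml", $settings)
    # Move to the hire-date element (both calls return values; discard them).
    $null = $reader.MoveToContent()
    $null = $reader.ReadToDescendant("hire-date")
    # Return the hire-date as a DateTime object.
    $hireDate = $reader.ReadElementContentAsDateTime()
    "Six Month Review Date: {0}" -f $hireDate.AddMonths(6) | Write-Verbose -Verbose
} finally {
    # Create() may have thrown, in which case there is nothing to dispose.
    if ($null -ne $reader) { $reader.Dispose() }
}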
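
For the find/replace case in the question, one possible approach is to pair XmlReader with XmlWriter and copy the document node by node, changing only the text you care about. The following is a rough sketch, not a tested solution for your file. The element names (`group`, `name`), the `type="a"` attribute, the `old`/`new` strings, and the temp file names are all made up for illustration. It also assumes target groups are not nested inside each other, and it skips DOCTYPE and entity reference nodes.

```powershell
# Hypothetical sample input; element/attribute names are placeholders.
$inPath  = Join-Path ([System.IO.Path]::GetTempPath()) 'xmlstream-in.xml'
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'xmlstream-out.xml'
[System.IO.File]::WriteAllText($inPath, @'
<root>
  <group type="a"><name>old</name></group>
  <group type="b"><name>old</name></group>
</root>
'@)

$nt = [System.Xml.XmlNodeType]
$reader = $null
$writer = $null
try {
    $reader = [System.Xml.XmlReader]::Create($inPath)
    $writer = [System.Xml.XmlWriter]::Create($outPath)
    $groupDepth = -1      # depth of the enclosing <group type="a">, or -1 when outside one
    $inName     = $false  # true while inside a <name> child of that group

    while ($reader.Read()) {
        $type = $reader.NodeType
        if ($type -eq $nt::Element) {
            $isEmpty = $reader.IsEmptyElement
            # -ceq: XML names and values are case-sensitive.
            if (-not $isEmpty -and ($reader.LocalName -ceq 'group') -and ($reader.GetAttribute('type') -ceq 'a')) {
                $groupDepth = $reader.Depth
            } elseif (-not $isEmpty -and ($groupDepth -ge 0) -and ($reader.LocalName -ceq 'name') -and ($reader.Depth -eq ($groupDepth + 1))) {
                $inName = $true
            }
            $writer.WriteStartElement($reader.Prefix, $reader.LocalName, $reader.NamespaceURI)
            $writer.WriteAttributes($reader, $true)
            # Empty elements produce no EndElement node, so close them here.
            if ($isEmpty) { $writer.WriteEndElement() }
        } elseif ($type -eq $nt::Text) {
            $value = $reader.Value
            if ($inName) { $value = $value.Replace('old', 'new') }
            $writer.WriteString($value)
        } elseif ($type -eq $nt::CDATA) {
            $writer.WriteCData($reader.Value)
        } elseif (($type -eq $nt::Whitespace) -or ($type -eq $nt::SignificantWhitespace)) {
            $writer.WriteWhitespace($reader.Value)
        } elseif ($type -eq $nt::Comment) {
            $writer.WriteComment($reader.Value)
        } elseif ($type -eq $nt::ProcessingInstruction) {
            $writer.WriteProcessingInstruction($reader.Name, $reader.Value)
        } elseif ($type -eq $nt::EndElement) {
            if ($inName -and ($reader.Depth -eq ($groupDepth + 1))) { $inName = $false }
            elseif ($reader.Depth -eq $groupDepth) { $groupDepth = -1 }
            $writer.WriteFullEndElement()
        }
        # XmlDeclaration is skipped: the writer emits its own declaration.
    }
} finally {
    if ($null -ne $writer) { $writer.Dispose() }
    if ($null -ne $reader) { $reader.Dispose() }
}
```

With the sample input, only the `<name>` under `<group type="a">` should change; the one under `type="b"` is copied through untouched. Only the current node is held in memory at any time, which is the point of the exercise for a 500 MB file. The trade-off is that you have to track your own context (here with `$groupDepth` and `$inName`) instead of using XPath.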

briantist