I'm transforming an XML file larger than 2 GB with an XSLT stylesheet that uses a lookup template. I would like it to run faster, but I can't find any low-hanging fruit for improving performance. Any help would be greatly appreciated; I'm new to XSLT transformations.
This is the current format of the XML file.
<?xml version="1.0" encoding="utf-8" ?>
<contacts>
  <contact>
    <attribute>
      <name>text12</name>
      <value>B00085590</value>
    </attribute>
    <attribute>
      <name>text34</name>
      <value>Atomos</value>
    </attribute>
    <attribute>
      <name>date866</name>
      <value>02/21/1991</value>
    </attribute>
  </contact>
  <contact>
    <attribute>
      <name>text12</name>
      <value>B00058478</value>
    </attribute>
    <attribute>
      <name>text34</name>
      <value>Balderas</value>
    </attribute>
    <attribute>
      <name>date866</name>
      <value>11/24/1997</value>
    </attribute>
  </contact>
</contacts>
The XSLT I use for the transformation:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
  <xsl:output method="xml" indent="yes"/>
  <!--Identify location of the lookup xml-->
  <xsl:param name="lookupDoc" select="document('C:\Projects\Attributes.xml')" />
  <!--Main Template-->
  <xsl:template match="/contacts">
    <!--Apply Formatted Contacts Template-->
    <xsl:apply-templates select="contact" />
  </xsl:template>
  <!--Formatted Contacts Template-->
  <xsl:template match="contact">
    <contact>
      <xsl:for-each select="attribute">
        <!--Create variable to hold New Name after passing the Data Name to the Lookup Template-->
        <xsl:variable name="newName">
          <xsl:apply-templates select="$lookupDoc/attributes/attribute">
            <xsl:with-param name="nameToMatch" select="name" />
          </xsl:apply-templates>
        </xsl:variable>
        <!--Format Contact Element with New Name variable-->
        <xsl:element name="{$newName}">
          <xsl:value-of select="value"/>
        </xsl:element>
      </xsl:for-each>
    </contact>
  </xsl:template>
  <!--Lookup Template-->
  <xsl:template match="attributes/attribute">
    <xsl:param name="nameToMatch" />
    <!--Strip punctuation, digits and spaces so the mapping name is a valid element name-->
    <xsl:value-of select='translate(translate(self::node()[name = $nameToMatch]/mappingname, "()*%$#@!~&lt;>&apos;&amp;,.?[]=-+/\:1234567890", ""), " ", "")' />
  </xsl:template>
</xsl:stylesheet>
Sample Lookup XML
<?xml version="1.0" encoding="utf-8" ?>
<attributes>
  <attribute>
    <name>text12</name>
    <mappingname>ID</mappingname>
    <datatype>Varchar2</datatype>
    <size>30</size>
  </attribute>
  <attribute>
    <name>text34</name>
    <mappingname>Last Name</mappingname>
    <datatype>Varchar2</datatype>
    <size>30</size>
  </attribute>
  <attribute>
    <name>date866</name>
    <mappingname>DOB</mappingname>
    <datatype>Date</datatype>
    <size></size>
  </attribute>
</attributes>
Transformed XML
<?xml version="1.0" encoding="utf-8" ?>
<contacts>
  <contact>
    <ID>B00085590</ID>
    <LastName>Atomos</LastName>
    <DOB>02/21/1991</DOB>
  </contact>
  <contact>
    <ID>B00058478</ID>
    <LastName>Balderas</LastName>
    <DOB>11/24/1997</DOB>
  </contact>
</contacts>
C#
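// Requires: using System.IO; using System.Xml; using System.Xml.Xsl;
XsltSettings settings = new XsltSettings(true, true); // enable document() and embedded script
XslCompiledTransform ContactsXslt = new XslCompiledTransform();
ContactsXslt.Load(@"C:\Projects\ContactFormat.xslt", settings, new XmlUrlResolver());
using (XmlReader r = XmlReader.Create(@"C:\Projects\Contacts.xml"))
using (XmlWriter w = XmlWriter.Create(@"C:\Projects\FormattedContacts.xml")) {
    w.WriteStartElement("contacts");
    while (!r.EOF) {
        if (r.NodeType == XmlNodeType.Element && r.Name == "contact") {
            // ReadOuterXml() already advances past </contact>; calling Read() as well
            // would skip the next contact when no whitespace separates the elements.
            using (XmlReader temp = XmlReader.Create(new StringReader(r.ReadOuterXml()))) {
                ContactsXslt.Transform(temp, null, w);
            }
        } else {
            r.Read();
        }
    }
    w.WriteEndElement();
}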
My approach is to transform one contact node at a time to avoid an OutOfMemoryException. Should I be feeding larger chunks through to speed up the process, or am I going about this all wrong?
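To make the "larger chunks" idea concrete, it could look roughly like the sketch below. This is only an illustration under assumptions: ChunkedTransform, ReadBatches, Run and the batchSize parameter are hypothetical names that are not part of the code above, and it relies on the stylesheet's /contacts template accepting any document whose root is <contacts>. Each batch is still built as a string in memory, so the batch size bounds memory use. Whether it is actually faster is untested.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Xsl;

static class ChunkedTransform
{
    // Hypothetical helper (not from the original code): yields batches of up to
    // batchSize <contact> elements, each wrapped in its own <contacts> root.
    public static IEnumerable<string> ReadBatches(XmlReader reader, int batchSize)
    {
        var sb = new StringBuilder("<contacts>");
        int count = 0;
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "contact")
            {
                // ReadOuterXml() leaves the reader on the node after </contact>,
                // so this branch must not call Read() as well.
                sb.Append(reader.ReadOuterXml());
                if (++count == batchSize)
                {
                    yield return sb.Append("</contacts>").ToString();
                    sb.Clear().Append("<contacts>");
                    count = 0;
                }
            }
            else
            {
                reader.Read();
            }
        }
        // Emit the final, possibly smaller, batch.
        if (count > 0)
            yield return sb.Append("</contacts>").ToString();
    }

    // batchSize is an arbitrary knob to tune; the paths are the ones used above.
    public static void Run(int batchSize)
    {
        var xslt = new XslCompiledTransform();
        xslt.Load(@"C:\Projects\ContactFormat.xslt",
                  new XsltSettings(true, true), new XmlUrlResolver());
        using (XmlReader r = XmlReader.Create(@"C:\Projects\Contacts.xml"))
        using (XmlWriter w = XmlWriter.Create(@"C:\Projects\FormattedContacts.xml"))
        {
            w.WriteStartElement("contacts");
            foreach (string batch in ReadBatches(r, batchSize))
            {
                // One Transform call per batch instead of one per contact.
                using (XmlReader chunk = XmlReader.Create(new StringReader(batch)))
                {
                    xslt.Transform(chunk, null, w);
                }
            }
            w.WriteEndElement();
        }
    }
}
```

Calling ChunkedTransform.Run(1000) would pass 1,000 contacts to each Transform call. Because the global lookupDoc parameter is evaluated once per Transform call, fewer calls should in principle also mean fewer loads of Attributes.xml, though that would need measuring.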