For various reasons, I am reading some large XML files (> 500 MB) tag by tag.
While reading them, I use two StringBuilder instances:
- one with the final result: finalSb
- one that buffers the content being read, so I can check whether it needs to be appended to the final one: newElementSb
I have the following code in a while loop:
    switch (reader.NodeType) // reader is assumed to be an XmlReader
    {
        case XmlNodeType.Element:
            if (rootTag)
            {
                // Append to the finalSb StringBuilder
            }
            else
            {
                newElementSb.Append($"<{reader.Name}>");
            }
            break;
        case XmlNodeType.Text:
            newElementSb.Append(reader.Value);
            break;
        case XmlNodeType.EndElement:
            if (rootTag)
            {
                // Append to the finalSb StringBuilder
            }
            else
            {
                newElementSb.Append($"</{reader.Name}>");
            }
            // Check whether the content of newElementSb needs to be appended to finalSb.
            // If so, append it and clear newElementSb.
            if (CheckIfValid(newElementSb))
            {
                finalSb.Append(newElementSb.ToString());
                newElementSb.Clear();
            }
            break;
    }
I read in the Microsoft documentation that the Clear method is equivalent to setting Length to 0. To me, that means that if newElementSb was holding 10 MB, the data will still be in memory somewhere and only the length will be reset.
Am I missing something?
Is there a best practice for completely resetting AND cleaning up a StringBuilder so that it can be reused later?
Should I assign a new StringBuilder instead of calling the Clear() method?
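To make the concern concrete, here is a minimal sketch of how I would check it. ClearDemo and FillAndClear are hypothetical names used only for this illustration and are not part of my real code. It fills a StringBuilder with about 10 MB of characters, calls Clear(), and reports Length and Capacity afterwards:

```csharp
using System;
using System.Text;

public static class ClearDemo
{
    // Hypothetical helper: fills a StringBuilder with about 10 MB of data,
    // clears it, and returns the Length and Capacity seen after Clear().
    public static (int Length, int Capacity) FillAndClear()
    {
        var sb = new StringBuilder();
        sb.Append('x', 5 * 1024 * 1024); // 5M chars, roughly 10 MB as UTF-16
        sb.Clear();                      // resets Length to 0
        return (sb.Length, sb.Capacity);
    }

    public static void Main()
    {
        var (length, capacity) = FillAndClear();
        Console.WriteLine($"After Clear: Length={length}, Capacity={capacity}");
    }
}
```

If Capacity is still large after Clear(), that would match my reading of the documentation. I assume the exact value can differ between runtime versions.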
Thank you for your advice.