I am editing a series of XML files, and I need to remove all attributes with the name "foo". This attribute appears in more than one type of element. An example snippet from the XML might be:
<bodymatter id="######">
<level1 id="######">
<pagenum page="#####" id="######" foo="######" />
<h1 id="#####" foo="#####">Header</h1>
<imggroup id="#######">
.
.
etc.
The best solution I have uses Regex:
// Lazily match foo="..." (Singleline lets the value span line breaks) and delete each match.
Regex regex = new Regex("foo=\".*?\"", RegexOptions.Singleline);
content = regex.Replace(content, "");
I know the built-in XML parsers could help, but ideally I'd like to make simple replacements and removals without taking on the overhead of a full XML parser. Is Regex the best solution in this case?
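To make the behaviour of the regex approach concrete, here is a minimal, self-contained sketch run against a shortened version of the snippet above. The class name, the StripAttribute helper, the sample values, and the tightened pattern are assumptions made for illustration, not part of my original code. The pattern requires leading whitespace, so the separating space is removed along with the attribute and a differently named attribute such as "xfoo" is left alone. It only handles double-quoted values, and it still cannot tell real markup apart from text or comments that happen to contain foo="...".

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexAttributeStripper
{
    // Hypothetical helper: removes every name="value" pair for the given attribute name.
    public static string StripAttribute(string content, string name)
    {
        // \s+ consumes the whitespace before the attribute;
        // [^"]* stops at the closing quote, so no lazy quantifier is needed.
        string pattern = "\\s+" + Regex.Escape(name) + "\\s*=\\s*\"[^\"]*\"";
        return Regex.Replace(content, pattern, "");
    }

    public static void Main()
    {
        // Shortened, made-up version of the sample XML.
        string content =
            "<pagenum page=\"1\" id=\"p1\" foo=\"a\" />\n" +
            "<h1 id=\"h1\" foo=\"b\">Header</h1>";

        Console.WriteLine(StripAttribute(content, "foo"));
    }
}
```

Because the character class [^"] also matches line breaks, RegexOptions.Singleline is not needed in this variant.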
Edit:
After some research into the XmlDocument class, here is one possible solution I came up with (it removes every attribute whose name appears in the array "ids"):
// Requires: using System.Xml;
// "path" is a field holding the location of the XML file.
private void removeAttributesbyName(string[] ids)
{
    XmlDocument doc = new XmlDocument();
    doc.Load(path);

    // GetElementsByTagName("*") returns every element in the document,
    // descendants included, so no separate pass over child nodes is needed.
    XmlNodeList xnlNodes = doc.GetElementsByTagName("*");
    foreach (XmlElement el in xnlNodes)
    {
        foreach (string id in ids)
        {
            // RemoveAttribute does nothing when the attribute is absent.
            el.RemoveAttribute(id);
        }
    }

    // Write the result back; without this the file on disk stays unchanged.
    doc.Save(path);
}
I don't know whether this is as efficient as it could be, but I've tested it and it seems to work fine.
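For comparison, here is a minimal sketch of the same removal using LINQ to XML (XDocument in System.Xml.Linq) instead of XmlDocument. This is a different technique from the code above, not a correction of it. The class name, the helper signature, and the sample XML are assumptions made for illustration. It works on a string so that it is self-contained; for files, XDocument.Load and XDocument.Save would be the counterparts.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmlAttributeStripper
{
    // Hypothetical helper: removes all attributes whose local name is listed in "names".
    public static string RemoveAttributesByName(string xml, params string[] names)
    {
        XDocument doc = XDocument.Parse(xml);

        // Descendants() yields every element, including the root;
        // Attributes() flattens their attributes into one sequence;
        // Remove() detaches each matching attribute from its parent element.
        doc.Descendants()
           .Attributes()
           .Where(a => names.Contains(a.Name.LocalName))
           .Remove();

        return doc.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        // Made-up, well-formed version of the sample XML.
        string xml =
            "<level1 id=\"l1\">" +
            "<pagenum page=\"1\" id=\"p1\" foo=\"a\" />" +
            "<h1 id=\"h1\" foo=\"b\">Header</h1>" +
            "</level1>";

        Console.WriteLine(RemoveAttributesByName(xml, "foo"));
    }
}
```

Unlike the regex, this only touches real attributes, so text content or comments that happen to contain foo="..." are left alone. The trade-off is that the input has to be well-formed XML.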