Based on previous answers here on Stack Overflow, I am using the following statement to remove all empty elements (except those that have attributes):

XDocument xdoc = XDocument.Parse(xmlString);

xdoc.Descendants()
    .Where(e => !e.HasAttributes && (e.IsEmpty || String.IsNullOrWhiteSpace(e.Value)))
    .Remove();

When run against the following XML:

<MESSAGE>
    <RELATIONSHIPS1>
        <RELATIONSHIP1 from="10017" to="1"/>
    </RELATIONSHIPS1>
    <RELATIONSHIPS2>
        <RELATIONSHIP2 from="10017" to="1"></RELATIONSHIP2>
    </RELATIONSHIPS2>
    <RELATIONSHIPS3>
        <RELATIONSHIP3 from="10017" to="1">test</RELATIONSHIP3>
    </RELATIONSHIPS3>
    <RELATIONSHIPS4 attr="test">
        <RELATIONSHIP4 from="10017" to="1"></RELATIONSHIP4>
    </RELATIONSHIPS4>
    <EXTENSION a1="1" a2="2"/>
    <FLOOD>
        <FLOOD_RESPONSE>
            <PROPERTY>
                <PROPERTY_DETAIL/>
                <PROPERTY_DETAIL></PROPERTY_DETAIL>
             </PROPERTY>
        </FLOOD_RESPONSE>
    </FLOOD>
</MESSAGE>

I was expecting the following:

<MESSAGE>
    <RELATIONSHIPS1>
        <RELATIONSHIP1 from="10017" to="1"/>
    </RELATIONSHIPS1>
    <RELATIONSHIPS2>
        <RELATIONSHIP2 from="10017" to="1"></RELATIONSHIP2>
    </RELATIONSHIPS2>
    <RELATIONSHIPS3>
        <RELATIONSHIP3 from="10017" to="1">test</RELATIONSHIP3>
    </RELATIONSHIPS3>
    <RELATIONSHIPS4 attr="test">
        <RELATIONSHIP4 from="10017" to="1"></RELATIONSHIP4>
    </RELATIONSHIPS4>
    <EXTENSION a1="1" a2="2" />
</MESSAGE>

But received the following:

<MESSAGE>
  <RELATIONSHIPS3>
    <RELATIONSHIP3 from="10017" to="1">test</RELATIONSHIP3>
  </RELATIONSHIPS3>
  <RELATIONSHIPS4 attr="test">
    <RELATIONSHIP4 from="10017" to="1"></RELATIONSHIP4>
  </RELATIONSHIPS4>
  <EXTENSION a1="1" a2="2" />
</MESSAGE>

Any ideas on why the nested empty elements with attributes are being removed?

gcm
  • You need recursion and I don't think you can do recursion through LINQ. So you will need to write a separate recursive function. See http://stackoverflow.com/questions/4814242 and http://stackoverflow.com/questions/20974248 and http://stackoverflow.com/questions/21262391 – Moby Disk Apr 07 '15 at 18:30

1 Answer

Check the description of XElement.Value on MSDN:

Gets or sets the concatenated text contents of this element.

This means that only the text content of the element and its descendant elements is returned; attribute values are not part of it. So, for example, your XML contains this element:

<RELATIONSHIPS1>
    <RELATIONSHIP1 from="10017" to="1"/>
</RELATIONSHIPS1>

The e.Value of this element is an empty string and the element itself has no attributes, so your predicate matches and the element is removed together with its child.
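The behaviour described above can be checked in isolation. This is a minimal sketch, assuming the default XElement.Parse options (insignificant whitespace is not preserved); the class name ValueDemo is only for illustration and is not part of the original post.

```csharp
using System;
using System.Xml.Linq;

class ValueDemo
{
    static void Main()
    {
        // Same shape as RELATIONSHIPS1 in the question: no attributes on the
        // parent, one empty child that carries attributes.
        var parent = XElement.Parse(
            "<RELATIONSHIPS1><RELATIONSHIP1 from=\"10017\" to=\"1\"/></RELATIONSHIPS1>");

        // Value concatenates descendant text nodes only; attribute values are ignored.
        Console.WriteLine(parent.Value.Length);   // 0
        Console.WriteLine(parent.HasAttributes);  // False
        Console.WriteLine(parent.IsEmpty);        // False, because it has a child element
    }
}
```

So the parent satisfies `!e.HasAttributes && String.IsNullOrWhiteSpace(e.Value)` even though IsEmpty is false, which is why it gets removed.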

You can run a query like the one below to remove only the elements you don't need. Basically, you need to check for elements that have an empty value and no attributes, and whose descendants all have empty values and no attributes as well:

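// Remove an element only when it and all of its descendants
// have no attributes and no text content.
xdoc.Descendants()
    .Where(e => !e.HasAttributes &&
                string.IsNullOrEmpty(e.Value) &&
                e.Descendants().All(f => string.IsNullOrEmpty(f.Value) && !f.HasAttributes))
    .Remove();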
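For anyone who wants to try this end to end, here is a self-contained sketch run against a trimmed version of the XML from the question. It is a reformulation with the same intent rather than the exact query above: it folds the attribute checks into one DescendantsAndSelf() call and uses IsNullOrWhiteSpace as in the question. The names RemoveEmptyDemo and Prune are hypothetical.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class RemoveEmptyDemo
{
    // Hypothetical helper: parses the XML and prunes subtrees that contain
    // neither text nor attributes anywhere inside them.
    public static XDocument Prune(string xml)
    {
        var xdoc = XDocument.Parse(xml);

        xdoc.Descendants()
            .Where(e => string.IsNullOrWhiteSpace(e.Value) &&
                        !e.DescendantsAndSelf().Any(d => d.HasAttributes))
            .Remove();

        return xdoc;
    }

    static void Main()
    {
        var xdoc = Prune(@"<MESSAGE>
    <RELATIONSHIPS1>
        <RELATIONSHIP1 from=""10017"" to=""1""/>
    </RELATIONSHIPS1>
    <FLOOD>
        <FLOOD_RESPONSE>
            <PROPERTY>
                <PROPERTY_DETAIL/>
            </PROPERTY>
        </FLOOD_RESPONSE>
    </FLOOD>
</MESSAGE>");

        Console.WriteLine(xdoc.Descendants("RELATIONSHIP1").Any()); // True: kept, it has attributes
        Console.WriteLine(xdoc.Descendants("RELATIONSHIPS1").Any()); // True: kept, a child has attributes
        Console.WriteLine(xdoc.Descendants("FLOOD").Any());          // False: whole subtree was empty
    }
}
```

Because Value is the concatenation of all descendant text, an empty Value on the parent already implies empty values below it, so only the attribute check has to look at the descendants.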
dotnetom