16

So, I have some data in the form of:

&lt;foo&gt;&lt;bar&gt;test&lt;/bar&gt;&lt;/foo&gt;

What .NET classes/functions would I want to use to convert this to something pretty and write it out to a file looking something like this:

<foo>
   <bar>
       test
   </bar>
</foo>

Please be specific about the functions and classes, not just "use System.Xml". There seem to be a lot of different ways to do things with XML in .NET :(

Thanks

DigitalZebra

4 Answers

24

Using the System.Xml.XmlDocument class...

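Imports System.IO
Imports System.Web
Imports System.Xml

' Turn the escaped entities back into real markup
Dim Val As String = "&lt;foo&gt;&lt;bar&gt;test&lt;/bar&gt;&lt;/foo&gt;"
Dim Xml As String = HttpUtility.HtmlDecode(Val)

' Parse the decoded markup
Dim Doc As New XmlDocument()
Doc.LoadXml(Xml)

' Save(TextWriter) writes the document back out with indentation
Dim Writer As New StringWriter()
Doc.Save(Writer)

Console.Write(Writer.ToString())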
Josh Stodola
  • Also, is there an alternative to the call to HttpUtility.HtmlDecode(str)?? I don't like having to pull in System.Web just for that function... – DigitalZebra Feb 04 '10 at 22:12
  • XmlDocument isn't actually doing anything at all here, as written. HtmlDecode is doing all of the work. If you skip the HtmlDecode call and use XmlDocument to pull out XmlElement/XmlAttribute values (via .ChildNodes, .SelectSingleNode, .SelectNodes, etc.), the values of those objects will be correctly unescaped. – technophile Feb 04 '10 at 22:15
  • @technophile... So I'm guessing XmlDocument will do that anyways? – DigitalZebra Feb 04 '10 at 22:16
  • @Polaris Yes, although if you just dump the XmlDocument to a string like he's doing here, it will re-escape them (because it's XML-encoding the values). You need to use the XML APIs to pull the values out correctly. – technophile Feb 04 '10 at 22:18
  • Ah, wait, I see. You want to unescape and then pretty print the results. Yes, in that case using HtmlDecode to turn the entities back into angle brackets etc. and using XmlDocument to insert whitespace is probably the best you'll get. – technophile Feb 04 '10 at 22:21
  • @Polaris Unfortunately, you'll have to have a reference to System.Web in order to use `HttpUtility`. You could roll your own decoding function, but it's a heck of a lot harder to decode HTML than encode, in my opinion. Perhaps you can look in Reflector and get what you need. – Josh Stodola Feb 05 '10 at 14:11
  • The question is about C#, so why are people answering in Visual Basic .NET?! – sergiol Jul 21 '22 at 09:36
13

You can use this code:

string p = "&lt;foo&gt;&lt;bar&gt;test&lt;/bar&gt;&lt;/foo&gt;";
Console.WriteLine(System.Web.HttpUtility.HtmlDecode(p));
Adeel
8

If pretty printing is not important, use System.Net.WebUtility.HtmlDecode (available since .NET 4.0), which does not require a reference to System.Web.
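
If pretty printing does matter, the two ideas in this thread can be combined. The following is only a sketch in C#, assuming the input is a single well-formed document once decoded; the class name `XmlPrettyPrinter`, the method name `DecodeAndIndent`, and the output path `pretty.xml` are made up for illustration.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Xml;

static class XmlPrettyPrinter
{
    // Decodes entity-escaped markup and returns it re-serialized with indentation.
    public static string DecodeAndIndent(string escaped)
    {
        // WebUtility lives in System.dll, so no System.Web reference is needed.
        string xml = WebUtility.HtmlDecode(escaped);

        // Throws XmlException if the decoded text is not well-formed XML.
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var settings = new XmlWriterSettings
        {
            Indent = true,
            IndentChars = "    ",
            OmitXmlDeclaration = true
        };

        var sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            doc.Save(writer);
        }
        return sb.ToString();
    }

    static void Main()
    {
        string pretty = DecodeAndIndent("&lt;foo&gt;&lt;bar&gt;test&lt;/bar&gt;&lt;/foo&gt;");
        Console.WriteLine(pretty);
        File.WriteAllText("pretty.xml", pretty); // hypothetical output path
    }
}
```

Note that the indenting writer normally keeps a text-only element such as bar on a single line, so the result will likely not put "test" on its own line the way the question's mock-up does.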

anvish
-5

Here's one that I use. Pass in an XML string and set ToXml to true to convert a string containing "<foo/><bar/>" to its escaped equivalent, "&lt;foo/&gt;&lt;bar/&gt;". If ToXml is false, it converts a string containing "&lt;foo/&gt;&lt;bar/&gt;" back to "<foo/><bar/>".

string XmlConvert(string sXml, bool ToXml){
    string sConverted = string.Empty;
    if (ToXml){
       // Escape "&" first so the ampersands introduced by &lt; and &gt; are not escaped again
       sConverted = sXml.Replace("&", "&amp;").Replace("<", "&lt;").Replace(">", "&gt;");
    }else{
       // Unescape "&amp;" last so that "&amp;lt;" becomes "&lt;" rather than "<"
       sConverted = sXml.Replace("&lt;", "<").Replace("&gt;", ">").Replace("&amp;", "&");
    }
    return sConverted;
}

Edit: Thanks to technophile for pointing out the obvious; this is designed to cover only the characters used in XML tags. That's the gist of the function, and it can easily be extended to cover other XML entities, so feel free to add any that I may have missed! Cheers! :)

t0mm13b
  • -1: Doesn't correctly handle all escaped values (Unicode values, other XML entity values, etc). – technophile Feb 04 '10 at 22:19
  • It won't handle quotes, either, which are pretty important in handling attributes. Using a specific list to try to do Replaces is inherently worse than using an API that conforms to the XML specification and will handle everything correctly without needing bandaids next time you want to handle &quot; or whatever. – technophile Feb 04 '10 at 22:25
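
As a rough illustration of the API-based approach suggested in the comments, here is a C# sketch that leaves escaping and unescaping to the framework. The class name `XmlEntityText` and the wrapper element `r` are hypothetical. It assumes the escaped input contains only the predefined XML entities or numeric character references; HTML-only entities such as `&nbsp;` would make the parser throw, and any unescaped markup in the input would be parsed as elements rather than returned as text.

```csharp
using System;
using System.Security;
using System.Xml;

static class XmlEntityText
{
    // Escapes <, >, &, " and ' so the text is safe in XML content or attribute values.
    public static string Escape(string raw)
    {
        return SecurityElement.Escape(raw);
    }

    // Unescapes by letting the XML parser read the text as element content.
    public static string Unescape(string escaped)
    {
        var doc = new XmlDocument();
        doc.LoadXml("<r>" + escaped + "</r>"); // "r" is a throwaway wrapper element
        return doc.DocumentElement.InnerText;
    }

    static void Main()
    {
        string original = "<foo a=\"1\">Tom & Jerry</foo>";
        string escaped = Escape(original);
        Console.WriteLine(escaped);
        Console.WriteLine(Unescape(escaped));
    }
}
```

Because the parser does the decoding, numeric references such as `&#65;` are handled without any extra Replace calls.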