I have the following code:
XElement element = new XElement("test", "a&b");
where
element.LastNode
contains the value "a&amp;b".
I want it to be "a&b".
How do I replace this?
Wait a moment,
<test>a&b</test>
is not valid XML. You cannot make XML that looks like this; the XML standard is clear on this point.
&
has special meaning: it introduces an entity or character reference, which is how characters that would otherwise be invalid are escaped. A literal '&'
character is encoded as &amp;
in XML.
For what it's worth, this is invalid HTML for the same reason.
<!DOCTYPE html> <html> <body> a&b </body> </html>
If I write the code,
const string Value = "a&b";
var element = new XElement("test", Value);
Debug.Assert(
string.CompareOrdinal(Value, element.Value) == 0,
"XElement is mad");
It runs without error; XElement
encodes and decodes to and from XML as necessary.
To unescape or decode the content of the XML element, simply read XElement.Value.
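A minimal sketch of the difference between the serialized markup and the decoded value, assuming a console program using C# top-level statements with System.Xml.Linq available (the variable names are illustrative, not from the question):

```csharp
using System;
using System.Xml.Linq;

var element = new XElement("test", "a&b");

// Serialized markup: the ampersand is escaped as an entity.
string markup = element.ToString();   // <test>a&amp;b</test>

// Decoded content: the original string comes back unchanged.
string text = element.Value;          // a&b

// LastNode is an XText node here; its Value is also the decoded string.
string nodeText = ((XText)element.LastNode).Value;

Console.WriteLine(markup);
Console.WriteLine(text);
Console.WriteLine(nodeText);
```

The escaped form only appears when the node is written out as XML (for example via ToString()). Reading Value never requires a manual replace.") throw new Exception("unexpected markup: " + markup);
if (text != "a&b") throw new Exception("unexpected text: " + text);
if (nodeText != "a&b") throw new Exception("unexpected node text: " + nodeText);
</test>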
If you want to make a document that looks like
<test>a&b</test>
you can, but it is not XML or HTML, and tools for working with HTML or XML won't intentionally help you. You'll have to make your own readers, writers and parsers.
The & is a reserved character, so it will always be encoded. That means you have to decode it.
One option is the HttpUtility.HtmlDecode(String) method.
Usage:
string decoded = HttpUtility.HtmlDecode("a&amp;b");
// returns "a&b"
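If adding a reference to System.Web is not desirable (an assumption about the project, not something stated in the question), System.Net.WebUtility offers an HtmlDecode method that performs this decoding as well. A small sketch:

```csharp
using System;
using System.Net;

// WebUtility lives in System.Net and does not need the System.Web assembly.
string decoded = WebUtility.HtmlDecode("a&amp;b");

Console.WriteLine(decoded); // a&b
```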
Try the following:
// requires: using System.Text.RegularExpressions;
public static string GetTextFromHTML(string htmlstring)
{
    // replace all tags with spaces...
    htmlstring = Regex.Replace(htmlstring, @"<(.|\n)*?>", " ");
    // ...then collapse runs of spaces into a single space
    while (htmlstring.Contains("  "))
    {
        htmlstring = htmlstring.Replace("  ", " ");
    }
    // replace non-breaking space entities and decode the & entity
    htmlstring = htmlstring.Replace("&nbsp;", " ");
    htmlstring = htmlstring.Replace("&amp;", "&");
    return htmlstring;
}