
I am facing a strange problem when encoding a URL in an HTML Attribute.

I have the following HTML:

<a href=" https://www.google.co.in/#q=Pune&tbm=nws"></a>

This works fine so far.

However, this HTML is generated dynamically using XmlTextWriter.

As a result, the code generates the following XML:

<a href=" https://www.google.co.in/#q=Pune&amp;tbm=nws"></a>

Note the &amp; after Pune. When this link is clicked, the browser is unable to decode the tbm=nws parameter.

I read several articles which seemed to suggest that the second HTML above is perfectly valid.

Can you guide me on where this could be going wrong?

EDIT: Adding C# code

 XmlTextWriter writer = new XmlTextWriter (Console.Out);
 writer.Formatting = Formatting.Indented;

 // Write the root element.
 writer.WriteStartElement("Items");

 // Write a string using WriteRaw. Note that the special
 // characters are not escaped.
 writer.WriteStartElement("Item");
 writer.WriteAttributeString("href","https://www.google.co.in/#q=Pune&tbm=nws");
 writer.WriteString("Write unescaped text:  ");
 writer.WriteRaw("this & that");
 writer.WriteEndElement();

 // Write the same string using WriteString. Note that the 
 // special characters are escaped.
 writer.WriteStartElement("Item");
 writer.WriteString("Write the same string using WriteString:  ");
 writer.WriteString("this & that");
 writer.WriteEndElement();

 // Write the close tag for the root element.
 writer.WriteEndElement();

 // Write the XML to file and close the writer.
 writer.Close();  
Viking22
  • Note that your HTML is not rendered correctly using Markdown. Format your HTML as code blocks. – Patrick Hofman Mar 11 '16 at 12:13
  • Sorry doing this over a mobile browser. Will format it as soon as I get access to a pc – Viking22 Mar 11 '16 at 12:14
  • No need to. I already did, but you rolled back my edit. – Patrick Hofman Mar 11 '16 at 12:14
  • What do you mean by _"the browser is unable to decode the tbm=nws parameter."_? As you mention, the second form is the technically correct form of using an ampersand in an attribute. – James Thorpe Mar 11 '16 at 12:20
  • It means that browser does not recognize tbm as a valid querystring parameter. Instead it considers amp;tbm as the parameter name and sets its value as nws. Even Fiddler WebForms inspector shows querystring parameter name as amp;tbm. – Viking22 Mar 11 '16 at 12:38
  • Are you sure it's not mistakenly double encoding it? Since you've tagged this c#, can you show the code that's generating the link? – James Thorpe Mar 11 '16 at 12:47
  • I will do it shortly. I don't have access to it as of now. – Viking22 Mar 11 '16 at 12:55
  • Hi James, added the C# code. Basically we are using XmlTextWriter.WriteAttributeString(string, string). I guess it naturally encodes the & to &amp;, but I'm not sure why the browser is not able to parse it. – Viking22 Mar 11 '16 at 18:43

1 Answer


I think you are attacking the wrong problem here. Ampersands can (and should) be escaped in href attribute values. See this question for more details: Do I encode ampersands in <a href...>?
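To check that the writer escapes the ampersand only once, a round trip like the following may help. This is a minimal sketch, not the asker's actual code: the class and method names (`AmpersandRoundTrip`, `WriteAnchor`, `ReadHref`) are made up for illustration, and it uses an XML parser (`XmlDocument`) as a stand-in for the browser's attribute decoding.

```csharp
using System;
using System.IO;
using System.Xml;

public static class AmpersandRoundTrip
{
    // Serializes a single <a> element with the given href via XmlTextWriter.
    // WriteAttributeString escapes '&' in the value as "&amp;".
    public static string WriteAnchor(string href)
    {
        var sw = new StringWriter();
        using (var writer = new XmlTextWriter(sw))
        {
            writer.WriteStartElement("a");
            writer.WriteAttributeString("href", href);
            writer.WriteEndElement();
        }
        return sw.ToString();
    }

    // Parses the markup back and returns the decoded href value.
    public static string ReadHref(string markup)
    {
        var doc = new XmlDocument();
        doc.LoadXml(markup);
        return doc.DocumentElement.GetAttribute("href");
    }

    public static void Main()
    {
        const string url = "https://www.google.co.in/#q=Pune&tbm=nws";
        string markup = WriteAnchor(url);

        Console.WriteLine(markup);                  // serialized form, contains &amp;
        Console.WriteLine(ReadHref(markup) == url); // decoded value matches the input
    }
}
```

If the serialized markup contained `&amp;amp;` instead, that would point to the value being encoded twice somewhere before it reaches the writer.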

The query string should really be prefixed with ?. This can be ambiguous when using client-side frameworks that use # for routing, but the rules still apply.

Try formatting your anchor like this:

<a href="https://www.google.co.in/#/?q=Pune&amp;tbm=nws"></a>
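To see how a URL parser splits the query string from the fragment, `System.Uri` can be used as a quick check. This is only a sketch: the `UrlParts` class and its `Split` helper are hypothetical names, and `System.Uri` follows the generic URI rules, which may differ from how a particular client-side framework interprets what comes after the #.

```csharp
using System;

public static class UrlParts
{
    // Returns the query and fragment components as System.Uri reports them.
    public static (string Query, string Fragment) Split(string url)
    {
        var uri = new Uri(url);
        return (uri.Query, uri.Fragment);
    }

    public static void Main()
    {
        var urls = new[]
        {
            "https://www.google.co.in/#q=Pune&tbm=nws", // parameters after '#'
            "https://www.google.co.in/?q=Pune&tbm=nws"  // parameters after '?'
        };

        foreach (var url in urls)
        {
            var (query, fragment) = Split(url);
            Console.WriteLine(url);
            Console.WriteLine($"  Query:    '{query}'");
            Console.WriteLine($"  Fragment: '{fragment}'");
        }
    }
}
```

Under these rules, anything after the first # belongs to the fragment, so only the second URL has a non-empty query component.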
Kevin Burdett