I am working on a very large corporate project. It is a C#, .NET Entity Framework site with a Microsoft SQL Server database behind it.

The input I am testing is inside a form like so:

@using (Html.BeginForm("CreateAgency", "Agency", FormMethod.Post, new { enctype = "multipart/form-data", accept_charset = "utf-8" }))
{
...
<li>
<label id="lblAgency" class="AgencyEditorPanelItemsLabel">Agency Name</label>
@Html.TextBoxFor(m => m.AgencyName, new { @class="newAgencyTextInput"})
</li>
...

This goes back to an AgencyController where it gets parsed:

NameValueCollection nvc = HttpUtility.ParseQueryString(String.Empty);
nvc.Add("AgencyName", agency.AgencyName);
...

Then it gets packed into a string for posting through an API:

//make the post object readily convertible to a string
StringBuilder sb = new StringBuilder();
sb.Append(nvc.ToString());
//encode collection for pending transmission
var data = Encoding.UTF8.GetBytes(sb.ToString());

Is there anything in the code I have shown that would cause UTF-8 characters to be turned into "%u00a5 %u00b7 %u00a3 %u00b7 %u20ac %u00b7 $ %u00b7 %u00a2 %u00b7 %u20a1 %u00b7 %u20a2 %u00b7 %u20a3 "? I even tried adding a globalization element to my web.config like so:

<globalization fileEncoding="utf-8" requestEncoding="utf-8" responseEncoding="utf-8" />

But that did nothing. Does anyone have some tips or ideas of things I could try to fix this problem?

Thank you very much for any help!

  • Show us the code you are using to write the string to the database. Additionally, `Encoding.UTF8.GetBytes(sb.ToString());` should not be needed: ADO.NET and SQL Server know how to handle strings and nvarchar, so you can just save the value directly without converting it to a byte array. – mjwills Jun 14 '17 at 22:53
  • OK, after tracing down variables, I've determined the characters break at these two lines: `NameValueCollection nvc = HttpUtility.ParseQueryString(String.Empty); nvc.Add("AgencyName", agency.AgencyName);` There must be something wrong here, but what? – Benjamin Barney Jun 14 '17 at 23:06
  • Looks like this is similar to this question which I just found: [ParseQueryString always encodes special characters to unicode](https://stackoverflow.com/questions/26789168/httputility-parsequerystring-always-encodes-special-characters-to-unicode) – Benjamin Barney Jun 14 '17 at 23:11
  • Why are you using nvc? You should not be trying to insert nvc (or nvc.ToString()) into the database. That variable / code / concept is for reading from / writing to query strings, not databases. Talk us through the problem you are trying to solve using ParseQueryString and the nvc variable. – mjwills Jun 14 '17 at 23:15
  • Please excuse my poor `C#` skills but if I emulate your `.NET` code in PowerShell then `$nvc.ToString()` contains `%u00a5%u00b7%u00a3%u00b7%u20ac%u00b7%24%u00b7%u00a2%u00b7%u20a1%u00b7%u20a2%u00b7%u20a3` i.e. non-standard encoding for Unicode characters as `%uxxxx`. However, `$nvc['Agencyname']` returns `¥·£·€·$·¢·₡·₢·₣`. HTH. – JosefZ Jun 15 '17 at 08:58
  • mjwills good point, but I didn't write this code and neither did anyone still working here so we wouldn't know. JosefZ, thanks for your comment. The whole project is being rewritten now so we will do a better job this time in any case – Benjamin Barney Jun 19 '17 at 21:16

1 Answer

For the benefit of anyone else who has a similar problem: it turns out the server was percent-encoding the value as if it were part of a URI, writing non-ASCII characters in the non-standard %uXXXX form. Since we don't require that functionality, I added the following to our web.config file:

<appSettings>
  <add key="aspnet:DontUsePercentUUrlEncoding" value="true" />
</appSettings>

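This stops the automatic %uXXXX encoding. As far as I can tell from Microsoft's description of the setting, non-ASCII characters are then written as percent-encoded UTF-8 bytes (%XX) instead, so if anything in your application depends on the old format, encode or decode those URLs manually.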
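
If changing an application-wide setting is not an option, another approach is to build the POST body yourself instead of relying on `nvc.ToString()`. The following is only a minimal sketch, not the code from the project: `FormBodyBuilder` and `ToFormUrlEncoded` are hypothetical names, and it assumes the receiving API accepts standard percent-encoded UTF-8 (with spaces sent as `%20` rather than `+`). It uses `Uri.EscapeDataString`, which percent-encodes the UTF-8 bytes of each character and never emits `%uXXXX`.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Text;

// Hypothetical helper, not part of the original project.
public static class FormBodyBuilder
{
    // Joins every key/value pair as key=value&key=value, percent-encoding
    // both sides as UTF-8 bytes (%XX), never as %uXXXX.
    public static string ToFormUrlEncoded(NameValueCollection values)
    {
        var pairs = new List<string>();
        foreach (string key in values.AllKeys)
        {
            if (key == null) continue;               // skip entries without a name
            string[] items = values.GetValues(key);
            if (items == null) continue;             // key present but no values
            foreach (string item in items)
            {
                pairs.Add(Uri.EscapeDataString(key) + "=" +
                          Uri.EscapeDataString(item ?? string.Empty));
            }
        }
        return string.Join("&", pairs);
    }

    public static void Main()
    {
        // A plain NameValueCollection is enough here; no System.Web needed.
        var nvc = new NameValueCollection();
        nvc.Add("AgencyName", "\u00A5\u00B7\u00A3\u00B7\u20AC"); // ¥·£·€

        string body = ToFormUrlEncoded(nvc);
        byte[] data = Encoding.UTF8.GetBytes(body);  // body is pure ASCII at this point

        Console.WriteLine(body);
        // prints: AgencyName=%C2%A5%C2%B7%C2%A3%C2%B7%E2%82%AC
    }
}
```

The design choice is to keep the encoding step explicit at the point where the request body is built, so the result does not depend on how the ASP.NET version in use configures `HttpUtility`.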