
I have a .NET forms application that communicates with some CGI Perl scripts on a webserver.

One of the forms application's features sends the text contents of a large "Description" field to the server for storage in the database. This large string value is sent as a parameter on the request (i.e., text=whatever).

To do this, the first thing we do is run the entire list of parameters for the request through System.Web.HttpUtility.UrlEncode.

Then, we hand off to the method that builds the request. ContentType gets set to "application/x-www-form-urlencoded;charset=UTF-8". We then encode the parameters as UTF-8 right before sending:

byte[] bytes = Encoding.UTF8.GetBytes(parameters);
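Putting those steps together, a minimal sketch of the pipeline described above might look like the following. The endpoint URL, parameter name, and sample text are made up for illustration; the essential parts are the UrlEncode call, the ContentType, and the UTF-8 byte conversion:

```csharp
using System;
using System.Net;
using System.Text;
using System.Web;

class PostSketch
{
    // Builds the body as described in the question: percent-encode the
    // user-entered value, then convert the parameter string to UTF-8 bytes.
    static byte[] BuildBody(string description)
    {
        // HttpUtility.UrlEncode percent-encodes using UTF-8 by default,
        // so "ü" becomes "%c3%bc" and spaces become "+".
        string parameters = "text=" + HttpUtility.UrlEncode(description);

        // The percent-encoded string is pure ASCII, so this UTF-8
        // conversion is lossless.
        return Encoding.UTF8.GetBytes(parameters);
    }

    static void Main()
    {
        byte[] body = BuildBody("Grüße aus München");

        // Hypothetical CGI endpoint; the ContentType matches the question.
        var request = (HttpWebRequest)WebRequest.Create("http://example.com/cgi-bin/save.pl");
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded;charset=UTF-8";
        request.ContentLength = body.Length;
        using (var stream = request.GetRequestStream())
            stream.Write(body, 0, body.Length);
    }
}
```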

This has all worked gloriously for years. But recently, the powers that be wanted to know if we could send GERMAN text in that field. So we tried it, and it ends up stored in the database incorrectly. When my app queries it back later, we see gibberish where some of the German language's special characters should be.

The webserver guy is blaming me / my code. Now, I took one particular byte[] array we were sending, converted the decimal values to binary, and looked them up on a Unicode table. Everything matches my parameter string just fine.

BUT... in looking further at that unicode table... I saw some funky characters...and I'm thinking the German special characters are probably in there.

So, I'm wondering: do I really NEED to be urlencoding my POST data before sending it to the server in this situation? I had thought urlencoding ALL POST data was required for this content type.
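For context, a small demonstration of why the encoding matters for this content type: '&' and '=' are structural delimiters in an application/x-www-form-urlencoded body, and non-ASCII characters become multi-byte UTF-8 sequences that get percent-encoded. The sample strings here are illustrative only:

```csharp
using System;
using System.Web;

class EncodingDemo
{
    static void Main()
    {
        // Reserved characters must be escaped, or they split the
        // parameter list on the server side.
        Console.WriteLine(HttpUtility.UrlEncode("this & that"));  // this+%26+that

        // A German umlaut: the UTF-8 bytes 0xC3 0xBC, percent-encoded.
        Console.WriteLine(HttpUtility.UrlEncode("ü"));            // %c3%bc
    }
}
```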

Thanks!

  • For that content type, yes. You might want to check this answer: http://stackoverflow.com/a/10002620/1141432 and the answer it links to. – itsme86 Aug 22 '16 at 16:24
  • Testing now by removing the urlencode call. And now it appears to not be escaping the & character. That seems to be breaking it. – DaveyBoy Aug 22 '16 at 16:30
  • I've substituted Uri.EscapeDataString for now on that text field...and that seems to be working during initial testing. (I misspoke above; we were not urlencoding everything, just the user-inputted text fields... of which there are only a small number.) I've agreed to run with this for now... and come back to it if we encounter any issues. I still think this was a server problem, as everything I've seen so far suggests that the server should have interpreted those urlencoded characters just fine, given the ContentType we were sending. – DaveyBoy Aug 22 '16 at 17:01
  • Nope, it turns out that using Uri.EscapeDataString does not work after all. Tested it today with some different sample text and it appears to be making conversions on some of the special characters. – DaveyBoy Aug 23 '16 at 15:42
  • For what it's worth: I suddenly discovered that if I did NOT urlencode the data, all foreign language-characters were being processed by the webserver, its cgi script, and DB (none of which I myself have access to) just fine. I believe what may have been causing confusion was that the .cgi script owner had updated his scripts that receive the data from me, but perhaps had not updated his scripts that send the data back to me...to be sure to send UTF-8. And that had made me think I wasn't sending the data correctly. – DaveyBoy Aug 24 '16 at 15:40
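For reference, the two encoders discussed in these comments do not produce identical output, which may explain the differing results: HttpUtility.UrlEncode uses form-encoding conventions (spaces become '+', lowercase hex), while Uri.EscapeDataString follows RFC 3986 (spaces become %20, uppercase hex). A CGI script written to expect one convention can mishandle the other. A quick comparison with a made-up sample string:

```csharp
using System;
using System.Web;

class EncoderComparison
{
    static void Main()
    {
        string s = "Größe 10";

        // Form-style encoding: '+' for space, lowercase hex digits.
        Console.WriteLine(HttpUtility.UrlEncode(s));   // Gr%c3%b6%c3%9fe+10

        // RFC 3986 encoding: %20 for space, uppercase hex digits.
        Console.WriteLine(Uri.EscapeDataString(s));    // Gr%C3%B6%C3%9Fe%2010
    }
}
```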

0 Answers