
I have a form that accepts text and is posted to the server.

If a user inputs a French character such as 'à', it is read as 'Ã' by the Classic ASP code and stored as 'Ã' in a SQL Server 2005 database.

A similar effect happens with other accented characters. What's happening?

John Saunders
burnt1ce

3 Answers


It's a problem of character encoding. Apparently your server and database are configured with charsets Windows-1252 or ISO-8859-1, and you're receiving UTF-8 data.

You should check that your server sends a Content-Type header whose value ends with "charset=iso-8859-1". (The charset is declared in Content-Type; Content-Encoding describes compression such as gzip, not the character set.)

I guess your server doesn't send the charset of the documents, and people with default configuration set to UTF-8 send UTF-8 characters which are stored as iso-8859-1 (or Windows-1252) in your database.
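To make the mismatch concrete, here is a minimal sketch in Python (not ASP) that encodes 'à' as UTF-8, the way a browser would, and then decodes those bytes as Windows-1252, the way the server seems to. The variable names are only illustrative.

```python
# Simulate a browser sending UTF-8 and a server decoding it as Windows-1252.
original = "à"

# UTF-8 encodes 'à' (U+00E0) as two bytes: 0xC3 0xA0.
utf8_bytes = original.encode("utf-8")

# Windows-1252 is a single-byte charset, so each byte becomes its own character:
# 0xC3 -> 'Ã' and 0xA0 -> a non-breaking space (which is easy to overlook).
misread = utf8_bytes.decode("cp1252")

print(utf8_bytes)     # b'\xc3\xa0'
print(repr(misread))  # 'Ã\xa0'
```

The second character is invisible in most displays, which would explain why only 'Ã' appears to be stored.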

FWH
  • According to Firebug, your suspicions about ISO-8859-1 are correct, but it also seems to say that it accepts UTF-8. If my server accepts UTF-8 and, as you say, the client sends data in UTF-8, then this shouldn't be a problem, right? Header value captured by Firebug: Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 – burnt1ce Jul 21 '09 at 18:42
  • The header says it will *accept* those encodings (others may be rejected); you still need to see what you get and handle it appropriately. There is no automatic conversion between character sets. SS2005 will store the bits it gets, but the default code page in Windows (1252) is not going to work entirely correctly with ISO-8859-1, so Windows clients that don't get told the string is not CP1252 are going to have difficulty. – DaveE Jul 21 '09 at 19:03
  • For the characters in the question ISO-8859-1 and Windows-1252 use identical code points. – AnthonyWJones Jul 21 '09 at 21:11
  • If the text is encoded as ISO-8859-1 and you read it as windows-1252, there's no problem; in fact, web browsers tend to do that on purpose. windows-1252 merely replaces the control characters in the 128..159 range (which are never used and wouldn't print anyway) with some more printing characters. – Alan Moore Jul 22 '09 at 06:32

See my answer here for the detail on what is likely happening.

Ultimately you need to ensure that the encoding used in the form post matches the Response.CodePage of the receiving page. You can configure the character set a form actually sends by placing the accept-charset attribute on the form element. The default accept-charset is the document's charset.
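As a rough illustration of why the two sides have to agree, the following Python sketch (not ASP) percent-encodes a value in one charset, as a form post would, and decodes it in another, as the receiving page would. The helper `post_and_read` and its parameter names are hypothetical; in ASP terms, "utf-8" corresponds to code page 65001 and "cp1252" to code page 1252.

```python
from urllib.parse import quote, unquote


def post_and_read(text, form_charset, server_codepage):
    """Percent-encode text as a form would, then decode it as the server would.

    Hypothetical helper for illustration only; it models just the
    encode/decode step, not a real HTTP request.
    """
    wire = quote(text, encoding=form_charset)        # e.g. 'à' -> '%C3%A0' for UTF-8
    return unquote(wire, encoding=server_codepage)   # decode with the server's charset


# Mismatch: form sends UTF-8, page decodes as Windows-1252.
print(repr(post_and_read("à", "utf-8", "cp1252")))   # 'Ã\xa0'

# Match on either charset: the character survives.
print(post_and_read("à", "utf-8", "utf-8"))          # à
print(post_and_read("à", "cp1252", "cp1252"))        # à
```

Either charset works as long as the form and the receiving page use the same one.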

What exactly do you have the ASP files codepages set to (both on the page containing the form and the page receiving the post)?

What are you setting the Response.CharSet value to in the form page?

Community
AnthonyWJones

I have just gone around in circles trying to fix this once and for all in my old Classic ASP app, which uses jQuery AJAX posts to store info in a database. I tried every combination with no luck.

I ended up modifying the SQL selects to use the stored proc mentioned here, and magic happened. The data is still stored corrupted in the database, but it is displayed correctly on the site.
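The stored proc itself is not reproduced here, but the general idea of repairing the text at read time can be sketched in Python. This assumes the corruption is UTF-8 bytes that were decoded as Windows-1252; the function name `repair_mojibake` is hypothetical and this is not the stored proc referred to above.

```python
def repair_mojibake(text):
    """Try to undo 'UTF-8 read as Windows-1252' corruption.

    Assumption: the stored string came from UTF-8 bytes decoded as cp1252.
    If the round trip fails, the text is returned unchanged.
    """
    try:
        # Turn the characters back into the original bytes, then decode them properly.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not mojibake of this kind (or not recoverable this way); leave as-is.
        return text


print(repair_mojibake("Ã\xa0"))   # à
print(repair_mojibake("hello"))   # hello
```

A repair like this only hides the problem on display; the rows stay corrupted, and a string that was stored correctly but happens to look like mojibake could be altered by mistake.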

Derrick Dennis