
I have a website where most of the ASP.NET pages declare charset=UTF-8:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

If I wait a few hours and then load my page, the characters are displayed correctly as UTF-8. But as soon as I refresh the page or get redirected to it, the text is displayed as if it were in a single-byte encoding such as ISO-8859-1:

Léon changes to Léon
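
For reference, this exact substitution can be reproduced outside the site. The snippet below is only a sketch in Python (not the site's ASP.NET code) and assumes the wrong encoding is ISO-8859-1; it shows that reading the UTF-8 bytes of "é" one byte at a time produces "Ã©":

```python
# "é" (U+00E9) is encoded in UTF-8 as two bytes: C3 A9.
original = "Léon"
utf8_bytes = original.encode("utf-8")

# Decoding those bytes with a single-byte encoding (assumed here to be
# ISO-8859-1) maps each byte to its own character: C3 -> "Ã", A9 -> "©".
garbled = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes.hex(" "))  # 4c c3 a9 6f 6e
print(garbled)              # Léon
```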

My response headers look like this:

HTTP/1.1 200 OK
Cache-Control: no-cache
Pragma: no-cache
Content-Type: text/html; charset=utf-8
Expires: -1
Last-Modified: 7/29/2012 3:16:39 PM
Server: Microsoft-IIS/7.5
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
X-Powered-By-Plesk: PleskWin
Date: Sun, 29 Jul 2012 15:16:39 GMT
Content-Length: 32795

IIS is also configured to serve page content as UTF-8 by default. I removed everything I could from my ASP.NET page, and my UTF-8 string still appears broken.

walrii
Léon Pelletier
  • Is the client sending an Accept-Charset header or an Accept header with a Charset value? See http://stackoverflow.com/questions/7055849/accept-and-accept-charset-which-is-superior – walrii Jul 29 '12 at 15:53
  • @walrii, this should not matter, because the server clearly specifies UTF-8 in its response header. – Jukka K. Korpela Jul 29 '12 at 16:08
  • I’m afraid the data is insufficient for analysis and solution. When “é” gets changed to “é”, then clearly a UTF-8 encoded character, bytes C3 A9, gets interpreted as being in ISO-8859-1 or a similar encoding (e.g., windows-1252). This should not happen, as the Content-Type header should trump anything on the page itself. So it seems that in the server, data gets munged somehow: octets internally interpreted as e.g. ISO-8859-1, then UTF-8 encoded. – Jukka K. Korpela Jul 29 '12 at 16:12
  • Yeah, strange. What I pasted is the response, from Wireshark – Léon Pelletier Jul 29 '12 at 16:24
  • But on the request, there's this: `Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3` – Léon Pelletier Jul 29 '12 at 16:24
  • Wow, I just waited 1 hour, then refreshed (F5), and it loads with UTF-8, exactly as I described. Same cookies, but 1 hour later = back to UTF-8. Then F5 instantly turns it into ISO-8859-1. I just tested after 5 minutes, and it is shown with UTF-8 too. Only refreshes after a short delay create this problem. Maybe there's a timer somewhere in the header? – Léon Pelletier Jul 29 '12 at 17:47
  • Unless you are seeing ISO-8859-1 reported in the HTTP header, this sounds more like a browser display issue than a server charset issue. – Remy Lebeau Aug 01 '12 at 02:00
  • You are probably hitting the same problem we are hitting with Cyrillic characters on the edge of the buffers in IIS. http://forums.iis.net/p/1225225/2101801.aspx?UTF+8+characters+on+the+edge+of+GENERAL_RESPONSE_ENTITY_BUFFER+are+shown+as+rhombus+with+questionmarks – Maxim V. Pavlov May 28 '15 at 13:03
  • Cool, 2012. Thank you.:) – Léon Pelletier May 28 '15 at 15:34
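
One way to tell the two hypotheses in the comments apart (data munged on the server versus a browser display issue) is to look at the raw bytes of the response body, for example in the Wireshark capture. The sketch below is an illustration in Python, not part of the original thread: `classify_body` is a hypothetical helper, and it assumes the munging step is "read as ISO-8859-1, then re-encode as UTF-8", as suggested above. If the body contains C3 83 C2 A9, the server sent double-encoded data; if it contains C3 A9, the bytes are fine and the misreading happens on the client.

```python
def classify_body(body: bytes, char: str = "é") -> str:
    """Hypothetical helper: report how `char` is encoded in raw response bytes."""
    # Correct UTF-8 encoding of the character (C3 A9 for "é").
    good = char.encode("utf-8")
    # Assumed munging: UTF-8 bytes misread as ISO-8859-1, then encoded
    # as UTF-8 again (C3 83 C2 A9 for "é").
    double = good.decode("iso-8859-1").encode("utf-8")

    if double in body:
        return "double-encoded"
    if good in body:
        return "utf-8"
    return "not found"


# Usage with hand-made byte strings standing in for captured bodies.
print(classify_body(b"L\xc3\xa9on"))          # utf-8
print(classify_body(b"L\xc3\x83\xc2\xa9on"))  # double-encoded
```

Comparing the bytes of a "good" load with those of a "bad" refresh would show whether the server output actually differs between the two requests.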

0 Answers