A client of mine is using a Classic ASP script to process a form from a third-party payment processor (this is the last step in a credit-card-transaction sequence that starts at the client's website, goes to the third-party site, and then returns to the client's site).

The client is in Austria and when one of the fields includes an 8-bit character (e.g., when the field value is Österreich), the Ö is simply dropped when I retrieve the value of the field in the standard way; e.g.:

fieldval = Request.Form("country")
If fieldval = "sterreich" Then
    ' Code here will execute
End If

The literal value that the third-party page is POSTing is %D6sterreich, which I think suggests that the POST is being encoded in UTF-8.

The POST request has the following possibly-relevant headers:

  • Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
  • Content-Type: application/x-www-form-urlencoded

I'm by no means a character-encoding expert and this is the first time I've really done anything with Classic ASP, so I'm kind of flummoxed.

From some Googling and searching SO, I've added the following to the page that processes the POST:

<%@ Codepage=65001 %>
<%
Response.CharSet = "UTF-8"
Response.Codepage = 65001
%>

But it doesn't make any difference -- I still lose that initial 8-bit character. Is there something really simple that I'm just not aware of?

Ben Dunlap

6 Answers

Try adding the following to the top of the page:

<%
Response.CharSet = "utf-8"
Session.CodePage = 65001
%>

Simmo

Turns out I was going the wrong direction with this. The ASP file in question was itself encoded in UTF-8, which was implicitly setting Response.CodePage to 65001 -- in other words, explicitly adding a CODEPAGE directive made no difference -- and in fact the UTF-8 encoding was the source of the problem.

When I re-encoded the file to Windows-1252, the problem disappeared. I'm pretty ignorant of character encodings in general, but I think in retrospect the %D6 in the POST should have been my clue -- if I'm starting to understand things rightly, the single byte 0xD6 is not a valid UTF-8 character. Maybe someone more familiar with these things could confirm or deny this.
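That reasoning can be checked outside ASP by decoding the posted bytes directly. The sketch below uses Python rather than Classic ASP, purely as a byte-level illustration; the variable names are made up for the example, and it says nothing about how IIS itself decodes the form.

```python
from urllib.parse import unquote_to_bytes

# The raw value the payment processor POSTs, as quoted in the question.
posted = "%D6sterreich"
raw = unquote_to_bytes(posted)  # percent-decoding only, no charset applied yet

# Interpreted as Windows-1252 (or ISO-8859-1), the byte 0xD6 is "Ö".
as_cp1252 = raw.decode("cp1252")

# Interpreted as UTF-8, 0xD6 is the lead byte of a two-byte sequence, but the
# next byte ("s", 0x73) is not a continuation byte, so strict decoding fails.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# What a UTF-8 sender would have produced for "Ö" instead.
utf8_bytes = "Ö".encode("utf-8")

print(as_cp1252)
print(utf8_ok)
print(utf8_bytes.hex())
```

In other words, a UTF-8 form post would carry %C3%96sterreich; %D6sterreich points to a single-byte encoding such as Windows-1252 or ISO-8859-1.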

Ben Dunlap

What about the ASCII NUL character (code 0) in a query string, encoded as %00? Can I retrieve the whole value without it being cut off at the first NUL?

http://localhost/Test_Authentication.asp?token=%13%23%02%00%01%01%00%01%01%05%02%02%03%00%02%02%0A%0A%0A%0A%0A%0A048


Response.CharSet = "utf-8";
Session.CodePage=65001;

var strToken = (Request.QueryString("token").Count > 0)?Request.QueryString("token")(1):"";
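
To see what the token in that URL actually contains, independent of what Request.QueryString does with it, here is a small Python sketch (Python is used only for illustration, and the variable names are hypothetical). It shows that the percent-encoded value carries NUL bytes and what would remain if some layer treated the first NUL as a string terminator. Whether Classic ASP truncates there is not something this example establishes.

```python
from urllib.parse import unquote_to_bytes

# The token value copied from the URL above.
encoded_token = (
    "%13%23%02%00%01%01%00%01%01%05%02%02%03%00%02%02"
    "%0A%0A%0A%0A%0A%0A048"
)

# Percent-decode to raw bytes; nothing is lost at this stage.
token = unquote_to_bytes(encoded_token)

# How many NUL bytes the token contains.
nul_count = token.count(b"\x00")

# What a NUL-terminated (C-style) string consumer would keep.
truncated = token.split(b"\x00", 1)[0]

print(len(token), nul_count, truncated)
```
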
Matthew Frederick
Ye Kyaw

The 2 simple steps I used were:

  1. add at the top of EVERY asp file:

    Response.CharSet = "utf-8"

    Response.CodePage = 65001

  2. save every ASP text file in "ANSI" encoding (NOT utf-8!) - this option is usually found in the "Save" window of advanced text editors

If you save in UTF-8 encoding, or if you don't add the two lines specified above at the top of your code, this will never work as you intended.

user1463699

@Ben Dunlap: Try this at the top of the page --

<%@LANGUAGE="VBSCRIPT" CODEPAGE="65001"%>

Update
If you do a Response.Write Request.Form("country"), what does it display?

stealthyninja

My issue was similar (but quite strange) and adding the following two lines on all my pages has corrected it. Thanks so much for this.

Response.CharSet = "UTF-8"
Response.Codepage = 65001

But, to explain, here is the exact issue I had. Folks were entering Spanish characters on my ASP entry page and the results were very weird. For example, "Peña" was entered. The ASP page would display this as entered, but what ended up in the database was displayed back as "Pe?a". This would have been sort of OK, except the hex actually stored in the database was 0x50653F6100. Notice the extra "00": somehow the stored value had an extra NULL byte at the end. So, when I later retrieved the data, the screens went a little bonkers when the "00" (NULL) was hit, and the displayed data essentially stopped at that point.

In any case adding the two lines seems to have fixed the issue and the "ñ" is stored in the database as it should be.
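
The 0x50653F6100 value described above is consistent with the "ñ" having been replaced by a literal question mark during a lossy code-page conversion, followed by a NUL terminator. The Python sketch below reproduces those bytes; it is an illustration under that assumption, not a trace of what ASP or the database driver actually did.

```python
name = "Peña"

# ASCII has no "ñ"; errors="replace" substitutes a literal "?" (0x3F),
# mimicking a conversion to a code page that lacks the character.
lossy = name.encode("ascii", errors="replace")

# Append a NUL byte, as a C-style string terminator would.
stored = lossy + b"\x00"

# For comparison: the same name encoded as UTF-8 keeps the "ñ" as two bytes.
utf8 = name.encode("utf-8")

print(lossy)
print("0x" + stored.hex().upper())
print("0x" + utf8.hex().upper())
```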

miked