
I am processing a POST request that is encoded in UTF-8. The request creates a file in a folder. However, when a file name contains Russian characters, the name on disk comes out as garbage (the file contents are fine). File names with English characters are fine. The script contains:

Set fsOBJ= Server.CreateObject("Scripting.FileSystemObject")
Set fsOBJ= fsObj.CreateTextFile(fsOBJ.BuildPath(Path, strFileName))

I believe that 'strFileName' is my problem. Windows doesn't seem to like UTF-8 file names. Any ideas on how to solve this?

roboto1986
  • http://stackoverflow.com/questions/916118/classic-asp-how-to-convert-a-utf-8-string-to-ucs-2/920405#920405 – Swati Jul 24 '12 at 20:21
  • Thanks for your response but I have tried this for a single letter file name 'k' and got 'Рє'. Any other ideas? – roboto1986 Jul 24 '12 at 21:23
  • Where does strFileName come from? From the POST or from a database? If it's from a database, is the column/table set to UTF-8? – TheCarver Jul 25 '12 at 00:53
  • Thanks for your response. The POST comes directly from user input in a form. I know the data always arrives as UTF-8, since I was able to decode the file name correctly when I captured the request in Wireshark. Thanks. – roboto1986 Jul 25 '12 at 13:01

1 Answer


VBScript strings are strictly 2-byte Unicode. Any encoding used to store or transmit a string is converted to Unicode before the string exists in VBScript.

My guess is that you have a form post carrying the file name, and the post is encoded as UTF-8. However, your receiving page has its CodePage set to something other than 65001 (the UTF-8 code page) at the time it decodes the form field carrying the file name. As a result, the string retrieved from the form is corrupt.
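To illustrate the kind of corruption described above, here is a minimal Python sketch (not ASP) that decodes UTF-8 bytes with the wrong code page. It assumes, purely for illustration, that the page was running under Windows-1251 (code page 1251); the actual server code page is not stated in the question. The helper name decode_form_bytes is hypothetical.

```python
# Assumption: the receiving page decodes form bytes with Windows-1251
# (code page 1251) instead of UTF-8 (code page 65001).

def decode_form_bytes(raw: bytes, codepage: str) -> str:
    """Decode raw form bytes the way a page using `codepage` would."""
    return raw.decode(codepage)

# Cyrillic small letter ka (U+043A), as a UTF-8 client would send it.
sent = "\u043a".encode("utf-8")

wrong = decode_form_bytes(sent, "cp1251")  # mismatched code page
right = decode_form_bytes(sent, "utf-8")   # matching code page

# Print code points rather than glyphs so the output is console-safe.
print(sent.hex())                            # d0ba
print([f"U+{ord(c):04X}" for c in wrong])    # ['U+0420', 'U+0454']
print([f"U+{ord(c):04X}" for c in right])    # ['U+043A']
```

U+0420 followed by U+0454 is the two-character string 'Рє', which matches the result reported in the comments for a single-letter file name. That is consistent with a single Cyrillic letter being sent as two UTF-8 bytes and each byte being decoded as a separate character.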

Add <%@ CODEPAGE=65001 %> to your page, set Response.CharSet = "UTF-8" at the top of the page, and save the file as UTF-8.

Now, when the source form posts UTF-8 encoded form data to the page, the form data will be decoded to Unicode correctly.

AnthonyWJones
  • Interesting. Yes, my form POST does have the file name encoded as UTF-8. I can't change this, as the POST comes from an embedded device. I did set the CodePage in IIS to 65001, but then my script didn't work... It only seems to work with a CodePage below 65001. I did notice the same file name changing for different code pages. So perhaps all I need to do is set it to 65001 somehow without it crashing. My primary development environment is Linux, and I don't have the MS VS2010 debugging tools that come with the pro version. – roboto1986 Jul 25 '12 at 13:10
  • I should also mention that I followed http://msdn.microsoft.com/en-us/library/ms525789%28v=vs.90%29.aspx . And the only problem I'm having is setting the 65001 in IIS but I guess I won't need to if codepage is specified on top of my ASP file... – roboto1986 Jul 25 '12 at 13:49