1

I have a function for parameterized SQL entries in ASP Classic that's working great most of the time. The problem is that it's not handling special characters such as å, ä and ö properly. It simply cuts off the string where those characters are. I have set UTF-8 in all ASP-files involved with codepage and charset set and in the database I am using UTF8mb4_unicode_ci and in some cases UTF8mb4_swedish_ci. UTF8 should be working as intended and this function is the only thing at the moment that doesn't seem to handle it. I have checked and the parameter that I'm sending in contains the characters correctly inside the function, both before and after the actual execution.

Here is the function I'm using:

function dbPara(sqlPara, params)
dim cmd
set cmd = Server.CreateObject("ADODB.Command")
with cmd
    .CommandText = sqlPara
    set .ActiveConnection = cn
end with

if NOT isEmpty(params) then
    set rsPara = cmd.execute(, params)
else
    set rsPara = cmd.execute()
end if

response.write(params(0)) 'For test purpose only, this will output the entire string.

set dbPara = rsPara
end function

And here's the code of a not properly working example using the function, and one working where it's not parameterized:

Set cn = Server.CreateObject("ADODB.Connection")
cn.Open "Driver={MySQL ODBC 8.0 Unicode driver};Database="&db&";Server="&dbserver&";UID="&dbuser&";PWD="&dbpassword&""

comment = "This is a test to see if å is working properly."
sqlA = "UPDATE orders SET orders_comment = ? WHERE orders_id = 1234"
call dbPara(sqlA, array(comment))
'This will only set the orders_comment to everything up to the å

cn.execute("UPDATE orders SET orders_comment = '" & comment & "' WHERE orders_id = 1234)
'This will set the orders_comment correctly to the whole string

Could anyone see any potential problems with this function causing this behaviour?

EDIT: Adding an example with more than one entry to explain my comment about the first variable having special characters replaced with ?, but second variable getting its entry cut off. This is when I set the connection string to use charset=utf8; when charset=ucs2;as suggested by Lankymark didn't work:

    langSQL = "UPDATE faq_entry_lang SET faq_entry_lang_title = ?, faq_entry_lang_content = ? WHERE faq_entry_lang_entryID = ? AND faq_entry_lang_lang = ?"
    call dbPara(langSQL,array(faqtitle, faqcontent, entryid, lang))

EDIT 2: After more testing. It seems random when it cuts off and when it does not using utf8 in the connection string. All special characters that are not cut off will turn into ? though.

Daniel Nordh
  • 362
  • 3
  • 15
  • Relevant perhaps? [ASP, MySQL and UTF-8](//stackoverflow.com/a/14015499) – user692942 Feb 07 '20 at 11:55
  • Great suggestion. Unfortunately it didn't work with `charset=ucs2;`, which gave the same problem. Tested with `charset=utf8;` and ran into a different problem where the first variable had the special characters converted to question marks (`?`), but the second variable was cut off at the first special character just as before. Added an example of where this would happen since my original example only provided one variable. – Daniel Nordh Feb 07 '20 at 13:10
  • After some more testing. It seems random when it wants to cut off. Sometimes it cuts off after a few special characters, sometimes at the first and sometimes not at all on both variables. So I was wrong in my first assessment that it only cut off straight away on the second variable when using `utf8`. (I've also tried `utf8mb4` which always cuts off straight away, just like `ucs2`) – Daniel Nordh Feb 07 '20 at 13:24
  • 3
    Possibly related: https://bugs.mysql.com/bug.php?id=69021 TL;DR: Full support for utf8mb4 (including supplementary characters like emoji) can be achieved by using the "MySQL ODBC 8.0 **ANSI** Driver" and UTF-8 string encoding. It seems weird, but it worked for me with pyodbc and MySQL Connector/ODBC. – Gord Thompson Feb 07 '20 at 13:51
  • 3
    Gord, thank you! Tried it with `charset=utf8` and `utf8mb4` active with no luck. But just for the sake of it I removed the charset option again all together, and it worked! Now I have to research any potential side effects if I implement ANSI instead of Unicode sitewide. So am I correct in understanding this to be a bug that will most likely not be fixed considering how many years ago it was reported and verified. – Daniel Nordh Feb 07 '20 at 13:59
  • 2
    "So am I correct in understanding this to be a bug that will most likely not be fixed considering how many years ago it was reported and verified." - That sounds like a pretty safe assumption to me. – Gord Thompson Feb 07 '20 at 20:16
  • Gord, feel free to add your comment as an answer and I'll mark is as accepted. – Daniel Nordh Feb 10 '20 at 07:18

1 Answers1

0

Please provide SHOW CREATE PROCEDURE name

Then look at the end -- it will tell you what CHARACTER SET and COLLATION were in effect when you created the proc (or FUNCTION).

I suspect it is not what you need. To fix:

DROP PROCEDURE name;
SET NAMES utf8mb4 COLLATE utf8mb4_unicode_520_ci;  -- (or whatever)
DELIMITER //
CREATE PROCEDURE
    ...... ;
//
DELIMITER ;

More discussion: Search for "truncate" in Trouble with UTF-8 characters; what I see is not what I stored

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Thanks for answering, Rick. The solution to my problem could be found in the comments to the question and was an old bug as part of the connection driver protocol to the MYSQL server. Perhaps this will help someone else though. :) – Daniel Nordh Feb 10 '20 at 07:17