2

Duplicate Thread Note

I already asked a similar question for this issue (Emoji data retrieved via ODBC Connection appears as question mark) but that was more looking at the setup of the classic ASP page, but I am now raising a question to ask for help with the ODBC drivers as I think the issue I have is related to that.

Environment Details

I connect to MySQL version 5.5.50-log (issue also replicated on 5.7.28-log, and on MySQL 8 as well) from classic ASP pages. All running 32 bit MySQL Server, using 32 bit ODBC Drivers.

I am running the code on a Windows 10 PC, but the live site is running on a Windows 2012 Server. The issue exists on both systems.

Issue

The issue is that when I connect to MySQL using an ODBC Driver and display the output, emoji data is displayed as a ? rather than as the emoji even though I can see the emoji is saved correctly in the database. It's just when it's extracted via ODBC and displayed on the page, it is displayed as a ? character.

MySQL Details

The MySQL Database is set up as follows:

Database charset: utf8mb4 Database collation: utf8mb4_general_ci The table and field:

Character Set: utf8mb4 Collation: utf8mb4_general_ci

The MySQL database is set correctly from what I can see, in terms of collation etc:

Variable_name             Value               
------------------------  --------------------
character_set_client      utf8mb4             
character_set_connection  utf8mb4             
character_set_database    utf8mb4             
character_set_filesystem  binary              
character_set_results     utf8mb4             
character_set_server      latin1              
character_set_system      utf8                
collation_connection      utf8mb4_general_ci  
collation_database        utf8mb4_general_ci  
collation_server          latin1_swedish_ci   

To test, I have data in the table, which contains this string: (T_T) é, è, à, ç

This is a screenshot of the data from the SQLyog:

enter image description here

Test Web Page

This is my test web page:

<!DOCTYPE html>

<html lang="en">
    <head>
        <meta charset="utf-8">
        <meta http-equiv="X-UA-Compatible" content="IE=edge">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>Test</title>
        <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css">
    </head>
    <body>
    <p>Pasted directly from database field: (T_T) é, è, à, ç</p>
    <p>Returned from SQL statement: ?(T_T) é, è, à, ç</p>
    </body>
</html>

More notes on the issue

The issue is not that emoji data cannot be displayed on the web page, as the directly pasted content from MySQL appears fine.

The issue is that once the data is returned from an SQL Select from MySQL via the ODBC Driver, it does not render correctly.

Notes about the Classic ASP Page Setup

I have the following set right at the top of the ASP code:

<%@LANGUAGE="VBSCRIPT" CODEPAGE="65001"%>
<%
Response.ContentType = "text/html"
Response.AddHeader "Content-Type", "text/html;charset=UTF-8"
Response.CodePage = 65001
Response.CharSet = "UTF-8"

The ASP page itself is saved as a UTF-8 encoded file:

enter image description here

I know emoji data appears fine on the page as the pasted content from MySQL renders correctly.

ODBC Driver Notes

I think the issue is to do with the ODBC Connector, but I have tried everything I can think of and nothing works.

I have tried these drivers:

  • MySQL ODBC 3.51 Driver
  • MySQL ODBC 5.1 Driver
  • MySQL ODBC 5.3 Unicode Driver
  • MySQL ODBC 8.0 Unicode Driver
  • MySQL ODBC 8.0 ANSI Driver

I have tried with Connection Strings, and DSN connections, the issue happens on all of them.

Sample connection string:

oConn.Open "DRIVER={MySQL ODBC 5.3 Unicode Driver}; port=33066; option=16387; server=localhost; database=db1083; user=cheese; password=string; option=3; charset=utf8mb4; stmt=set names 'utf8mb4';"

I have tried using DSN connections and setting the Connection properties with the correct charset enabled, and the Initial Statement set on older versions of the ODBC Driver which support using that.

I'm out of ideas now!

Could this be a bug with the ODBC drivers?

4532066
  • 2,042
  • 5
  • 21
  • 48
  • look please at this and encode your mysql reply correctly https://stackoverflow.com/questions/18186741/insert-unicode-text-in-mysql4-throught-asp-net – nbk Nov 23 '19 at 18:51
  • Thanks @nbk for your reply. Re. the link in your comment - isn't that more to do with how the data is stored in the database? The string stored in MySQL is in the correct format, my issue is when it's retrieved from the database via the ODBC connector, the output is not correct, despite the connection string or DSN being set to use the correct character set of `utf8mb4`. Thanks – 4532066 Nov 23 '19 at 20:43
  • you should try the DecodeFromUtf8 which "decodes" utf8 from mysql to something representing unicode, which microft uses – nbk Nov 23 '19 at 20:46
  • Does this answer your question? [Emoji data retrieved via ODBC Connection appears as question mark](https://stackoverflow.com/questions/58943772/emoji-data-retrieved-via-odbc-connection-appears-as-question-mark). Please do not post duplicate questions, saying it's *"similar"* is a bit of a stretch. – user692942 Nov 25 '19 at 08:34
  • @nbk this isn't a .Net question, Classic ASP and ASP.Net are completely different. – user692942 Nov 25 '19 at 09:10
  • 1
    look i answered a similar question for vb 6 not a week a go it it always the same independent of the language, connect with utf8 in the connection string. and convert utf8 to unicode and vice versa. When this works in vb 6 it will work in asp net asp, c# vb net ..... – nbk Nov 25 '19 at 09:42

2 Answers2

2

As far as I know, the only way to retrieve Unicode supplementary characters (e.g., emoji) with ADODB and MySQL Connector/ODBC is to retrieve them as bytes — SELECT CAST(column_name AS BINARY) — and then decode them. The following example is actually VBA, but tweaking it for VBScript is trivial.

Option Explicit

Sub demo_retrieve_utf8mb4_value()
    Dim conn As New ADODB.Connection
    conn.Open _
        "DRIVER=MySQL ODBC 8.0 Unicode Driver;" & _
        "SERVER=localhost;PORT=3307;" & _
        "UID=root;PWD=(whatever);" & _
        "DATABASE=mydb;" & _
        "charset=utf8mb4;"
    Dim rst As New ADODB.Recordset
    rst.Open "SELECT CAST(emoji AS BINARY) FROM emoji_tbl WHERE id=1", conn
    Dim s As Variant
    s = decode_utf8(rst.Fields(0).Value)
    '
    ' do stuff with the string value

End Sub

Private Function decode_utf8(field_value As Variant) As Variant
    If IsNull(field_value) Then
        decode_utf8 = Null
    Else
        Dim strm As New ADODB.Stream
        With strm
            .Type = adTypeBinary
            .Open
            .Write field_value
            .Flush
            .Position = 0
            .Type = adTypeText
            .Charset = "UTF-8"
            decode_utf8 = .ReadText
            .Close
        End With
        Set strm = Nothing
    End If
End Function
Gord Thompson
  • 116,920
  • 32
  • 215
  • 418
  • Is that a MySQL provider limitation with ADODB? – user692942 Nov 25 '19 at 09:11
  • 1
    @Lankymart - Yes. ADODB and the MySQL Connector/ODBC Unicode driver can "play nice" together when the Unicode characters are within the BMP, but they can't seem to get along when it comes to supplementary characters like emoji. – Gord Thompson Nov 25 '19 at 13:13
  • Note that the latest version of the MySQL ODBC 8 Driver (8.0.27) now correctly converts such characters (see https://dev.mysql.com/doc/relnotes/connector-odbc/en/news-8-0-27.html). I've tested it and it works correctly. – Toothbrush Dec 15 '21 at 14:46
0

The latest version of the MySQL ODBC connector has fixed this bug!

The MySQL Connector/ODBC Driver 8.0.27 now correctly converts such characters (see https://dev.mysql.com/doc/relnotes/connector-odbc/en/news-8-0-27.html).

You will need to specify MySQL ODBC 8.0 Unicode Driver as the ODBC driver.

Toothbrush
  • 2,080
  • 24
  • 33