2

I am working with a SQL Server 2008 database on a Windows 2008 Server. Anytime I try to store an emoji in my table it converts it to a weird looking box. When I try to store the same emoji in SQL Server 2012 it stores the emoji fine. Is it not possible to store emojis correctly in SQL Server 2008? I really cannot update at this point so that would not be an option.

Solomon Rutzky
  • 46,688
  • 9
  • 128
  • 171
Dwight
  • 93
  • 1
  • 9
  • 3
    What is the datatype of the field? – Matthew Whited Dec 19 '16 at 21:58
  • 2
    Some code with what you're doing now would be nice. The emoji will 1) need Unicode storage and 2) will possibly use a surrogate pair, and SQL Server's support for surrogates is spotty and collation-dependent (see also [this](https://connect.microsoft.com/SQLServer/feedback/details/456267/sql-server-cant-handle-surrogate-pair)). With some care, you should be able to store & retrieve, but your present approach may just not be careful enough. Or maybe everything is actually *already* working fine and you're just getting confused by your font not having support. – Jeroen Mostert Dec 19 '16 at 22:03
  • 2
    Many open questions: How are you storing/reading/displaying your emojis (please show code!)? Where and in which encoding/format do you store this (please show table structure(s)!)? – Shnugo Dec 19 '16 at 22:04
  • @MatthewWhitedI am strong it as nvarchar. – Dwight Dec 20 '16 at 14:10
  • 1
    @JeroenMostert Thanks for the response. I am simply calling a Stored Procedure in my vb.net code and inserting the record. It si definitely not the font I do not think. I am able to store the emojis successfully using my test system which is using SQL Server 2012 and the same code. – Dwight Dec 20 '16 at 14:14
  • https://stackoverflow.com/questions/33938445/add-emoji-emoticon-to-mssql-table Solved already. – Neneil Oct 29 '17 at 20:27
  • @JeroenMostert I think you were on the right track with the font idea :) Please see my answer for details. – Solomon Rutzky May 12 '19 at 21:46

1 Answers1

3

What we know based on details from the question and comments on the question:

  1. Column is NVARCHAR
  2. Value is inserted from VB.NET app via stored procedure
  3. App hitting SQL Server 2008 (running on Windows 2008 Server) stores emoji character but "converts it to a weird looking box"
  4. Same app code hitting SQL Server 2012 stores the same emoji character just fine

What we do not know:

How is the character being retrieved in order to determine whether or not it was stored correctly?

  • Are you viewing it in the app or in SSMS?
  • If in SSMS, are you connecting to SQL Server 2008 and 2012 using the same SSMS running on the same machine? Or are you using the version of SSMS that came with each version of SQL Server (hence they are not the same program, even if on the same machine)?

Based on the above:

Most likely this is a font issue. I say this due to:

  1. If it were an issue of not supporting Unicode, then you would be seeing two question marks ?? (one for each surrogate character) instead of a single square box.
  2. Emojis are nothing special. They are merely supplementary characters. And there are currently (as of Unicode v 12.0) 72,457 supplementary characters defined (and slots for another 976,119).
  3. Supplementary Characters (emojis or otherwise) can be stored in NCHAR, NVARCHAR, and NTEXT columns without a problem, and without regard to the collation of the column or the current database.

To test this, I executed the following in a database having a default collation of SQL_Latin1_General_CP1_CI_AS, so there is definitely no "supplementary character support" there.

SELECT NCHAR(0xD83D) + NCHAR(0xDE31) AS [ScreamingFace],
       NCHAR(0xD83D) + NCHAR(0xDDFA) AS [WorldMap],
       NCHAR(0xD83D) + NCHAR(0xDF08) AS [Alchemical Symbol for Aqua Vitae];

It returns:

ScreamingFace    WorldMap    Alchemical Symbol for Aqua Vitae
                        

I see different things in different areas, all due to font differences. The chart below indicates what I am seeing:

LOCATION        FONT            Screaming     World     Alchemical Symbol
                                Face          Map       for Aqua Vitae
------------    ------------    ----------    ------    ----------------------------
Text Editor     Consolas        Yes           Yes       Square box w/ question mark
Grid Results    Code2003        Yes           Yes       Yes
Text Results    Courier New     Yes           Yes       Empty square box

Most likely you were using two different versions of SSMS, or at least SSMS on two different computers. In either case, you probably had different fonts mapped to the Grid Results, or were even using Grid Results on one and Text Results on the other.

In the end, if you want to know if data was stored correctly, you need to check the bytes that were stored. To do this, simply convert the string column to VARBINARY(MAX):

SELECT CONVERT(VARBINARY(MAX), string_column)
FROM schema.table;

And compare those results between the 2008 and 2012 systems. More than likely they are (or "were" given that this was almost 2.5 years ago) the same.


For more info on what characters can actually be stored in the various string datatypes in SQL Server (from SQL Server 7.0 through at least SQL Server 2019), please read the following post of mine:

How Many Bytes Per Character in SQL Server: a Completely Complete Guide

Solomon Rutzky
  • 46,688
  • 9
  • 128
  • 171