22

THE SITUATION:

Sorry in advance if this question has already been asked, but the solutions aren't working for me.

No matter what I try, I cannot store emoji in my database. They are saved as ????.
The only emojis that are properly saved are the ones that require only 3 bytes, like the shy face or the sun.

The actual utf8mb4 is not working.

Database screenshot

It has been tested on both Android and Ios. With same results.

VERSIONS:

Mysql: 5.5.49
CodeIgniter: 3.0.0

THE STEPS:

  1. I have modified database character set and collation properties.

    ALTER DATABASE my_database CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci

  2. I have modified table character set and collation properties.

    ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci

  3. I have set each field of the table, where possible, as Encoding: UTF-8(ut8mb4) and Collation: utf8mb4_unicode_ci

  4. I have modified the database connection in the CodeIgniter app.

  5. I have run the following: SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci

  6. Lastly I have also tried this: REPAIR TABLE table_name; OPTIMIZE TABLE table_name;

Everything should have been setup properly but yet it doesn't work.

DATABASE SETTINGS:

This is the outcome running the following command:

`SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';`

Database settings

TABLE SETTINGS:

A screeshot of the table structure:

Table settings

DATABASE CONNECTION:

These are the database connection settings inside database.php (note this is not the only database, there are also others that connect using utf8)

$db['my_database'] = array(
        'dsn'           => '',
        'hostname'      => PROJECT_DATABASE_HOSTNAME,
        'username'      => PROJECT_DATABASE_USERNAME,
        'password'      => PROJECT_DATABASE_PASSWORD,
        'database'      => PROJECT_DATABASE_NAME,
        'dbdriver'      => 'mysqli',
        'dbprefix'      => '',
        'pconnect'      => FALSE,
        'db_debug'      => TRUE,
        'cache_on'      => FALSE,
        'cachedir'      => '',
        'char_set'      => 'utf8mb4',
        'dbcollat'      => 'utf8mb4_unicode_ci',
        'swap_pre'      => '',
        'encrypt'       => FALSE,
        'compress'      => FALSE,
        'stricton'      => FALSE,
        'failover'      => array(),
        'save_queries'  => TRUE
    );

MY.CNF SETTINGS:

This is the whole content of the file my.cnf:

[mysqld]
default-storage-engine=MyISAM
innodb_file_per_table=1
max_allowed_packet=268435456
open_files_limit=10000
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci

[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

THE QUESTION:

Do you know why is not working? Am I missing something?

HYPHOTESIS 1:

I am not sure, but the cause of the problem may be this:

As you can see in my.cnf character-set-server is clearly set as utf8mb4:

But after running the query in the database:

SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';

The outcome is that character-set-server = latin1

Do you know why is that? Why is not actually updating?

HYPHOTESIS 2:

The application use several different databases. This one is set to utf8mb4 but all the others are set to utf8. It may be a problem even if they are separated databases?

Thank you!

EDIT:

This is the outcome of SHOW CREATE TABLE app_messages;

CREATE TABLE `app_messages` (
  `message_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `project_id` bigint(20) NOT NULL,
  `sender_id` bigint(20) NOT NULL,
  `receiver_id` bigint(20) NOT NULL,
  `message` text COLLATE utf8mb4_unicode_ci,
  `timestamp` bigint(20) DEFAULT NULL,
  `is_read` enum('x','') COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`message_id`)
) ENGINE=InnoDB AUTO_INCREMENT=496 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

EDIT 2:

I have run the following command:

INSERT INTO app_messages (message_id, project_id, sender_id, receiver_id, message, timestamp, is_read)
VALUES ('496','322','77','188', '' ,'1473413606','x');

And other two similar with and

They were inserted in the table without problems:

enter image description here

But in the actual app what i really see is: ? (this time only one ? and not 4)

FrancescoMussi
  • 20,760
  • 39
  • 126
  • 178
  • 1
    Looks like you may have already read this, if not here: https://mathiasbynens.be/notes/mysql-utf8mb4. Qs: From where are you running the `SHOW VARIABLES ...` query? Command line? PHPMyAdmin? For example, running the the command via PHPMyAdmin may produce some values that you don't expect because you are using the phpmyadmin connection so the values depend on how phpmyadmin is connecting. – Gerard Roche Sep 09 '16 at 04:54
  • Hello @johnnyfittizio what Collation you are using for message field ? – Gorakh Yadav Sep 09 '16 at 05:17
  • @GerardRoche Yes I am running that in PHPMyAdmin, so should I not worry about that outcome? Yes I have followed that tutorial, but unfortunately is not working. – FrancescoMussi Sep 09 '16 at 09:14
  • 1
    Looks like the server is not configured properly. I checked locally and even from PHPMyAdmin my configurations were all correct. Double check the configurations because yours are clearly not correct. They should be exactly like in the article. Debugging: ensure the order of the sections is `[client]`, `[mysql]`, `[mysqld]` like in the article (not that it probably matters), ensure mysql is restarted (not just reloaded). Show the database structure, does it have the correct charset and collation. Try the `SHOW VARIABLES ...` query via the CLI. – Gerard Roche Sep 09 '16 at 09:45
  • Yes there must be something wrong. In my.cnf i set character-set-server as utf8mb4 but when I check it is latin1. So there is something overriding it. I will try to contact the server. Btw do you know how can I run that query SHOW VARIABLES via CLI? – FrancescoMussi Sep 09 '16 at 10:04
  • If you didn't restart the server after your changes then that's the issue. CLI usage: https://dev.mysql.com/doc/refman/5.7/en/mysql.html – Gerard Roche Sep 09 '16 at 10:13
  • What OS? Linux? I've just noticed that you're editing `my.cnf`, on linux you should edit/create `conf.d/90-my.cnf`. The settings in that other file are loaded first, then the `conf.d/*` ones. – Gerard Roche Sep 09 '16 at 10:18
  • Ok thank you. I will try this. – FrancescoMussi Sep 09 '16 at 10:46
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/123041/discussion-between-gerard-roche-and-johnnyfittizio). – Gerard Roche Sep 10 '16 at 08:29

7 Answers7

3

Okay I finally managed to make it working! Thanks to everybody that tried to help me, especially @Rick James and @Gerard Roche.

SUGGESTION:

If you need to work with emoji first of all make simple tests on localhost. Create a new database and make a fresh app for testing purpose.

If you follow the steps I wrote in the question or if you follow this tutorial: https://mathiasbynens.be/notes/mysql-utf8mb4#utf8-to-utf8mb4 it must work.

Working locally on a fresh basic app you will have more control and more room to make all the test you need.

SOLUTION:

In my case the problem was in the configuration of the database in CodeIgniter. It was not properly setting up the char_set and the collation for a stupid overlooking: I was overriding the database settings in the function that save messages to be sure it was working with the mobile database.

BEFORE:

function message_save ( $data = FALSE )
{   
    $project_db_config                  = array();
    $project_db_config['hostname']      = 'MY_HOST';
    $project_db_config['username']      = 'MY_USERNAME';
    $project_db_config['password']      = 'MY_PASSWORD';
    $project_db_config['database']      = 'MY_DATABASE';

    $mobile_db                          = $this->load->database( $project_db_config, TRUE );

    // other code to save message       
}

AFTER:

function message_save ( $data = FALSE )
{
    $mobile_db_connection = $this->load->database('admin_mobile_mh', TRUE);

    // other code to save message
}

CONCLUSION:

The app must set the connection to the database properly. If you have the database properly setup but you don't make the proper connection with your app, it won't work.

So if you encounter similar problems make sure the api properly setup the char_set as utf8mb4 and db_collat as utf8mb4_unicode_ci.

FrancescoMussi
  • 20,760
  • 39
  • 126
  • 178
2

The only way I know of to get ???? for an Emoji is to not have the column declared utf8mb4. I understand that you have apparently determined that the column is declared that way, but please run SHOW CREATE TABLE table_name; to further confirm it.

The system default, the database default, and the table default are irrelevant if the column overrides the CHARACTER SET.

A note to all the other attempted answers: The COLLATION is irrelevant, only the CHARACTER SET is relevant for this question.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Thanks for reply. I have edited the question including the outcome of SHOW CREATE TABLE. Is that correct? – FrancescoMussi Sep 10 '16 at 10:12
  • `CREATE TABLE` looks correct. Next thing to check: `SELECT message, HEX(message) ...` to look at the hex for the question marks. I expect to see `3F3F3F3F` for the `????`, implying that the problem occurred during the _insert_. – Rick James Sep 10 '16 at 17:16
  • What is creating the Emoji? Could it be `????` _before_ the insert? Is there any way to rerun the insert and get the hex _before_ the insert? – Rick James Sep 10 '16 at 17:18
  • Thanks I have edited the question, adding the part where i insert in the table some emoji that were properly inserted in the table, but now shown (taken from the table) properly. But sorry I don't know how to properly run the `HEX`. Is not working. – FrancescoMussi Sep 11 '16 at 13:25
1

my.cnf is loaded first, then conf.d/*.cnf.

Instead of modifying my.cnf *(which may be overridden by configurations in conf.d/*.cnf), create a custom override configuration e.g. conf.d/90-my.cnf.

Prefixing 90 ensures the custom settings are loaded last which means they overwrite any earlier set settings.

To ensure the new configuration is reloaded, see Reload Without Restarting the MySQL service.

Example Configuration Structure (Linux)

.
├── conf.d
│   ├── 90-my.cnf
│   ├── conn.cnf
│   ├── my5.6.cnf
│   └── mysqld_safe_syslog.cnf
├── debian.cnf
├── debian-start
└── my.cnf

conf.d/90-my.cnf

# https://mathiasbynens.be/notes/mysql-utf8mb4
# http://stackoverflow.com/q/3513773/934739

[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

[mysqld]

character-set-client-handshake = FALSE

# The server character set and collation are used as default values if the
# database character set and collation are not specified in CREATE DATABASE
# statements. They have no other purpose.
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
Community
  • 1
  • 1
Gerard Roche
  • 6,162
  • 4
  • 43
  • 69
0

Instead of the varchar you can change the Table filed value as follows to utf8mb4

Make sure all your tables' default character sets and text fields are converted to utf8mb4, in addition to setting the client & server character sets, e.g. ALTER TABLE mytable charset=utf8mb4, MODIFY COLUMN textfield1 VARCHAR(255) CHARACTER SET utf8mb4,MODIFY COLUMN textfield2 VARCHAR(255) CHARACTER SET utf8mb4; and so on.

Naresh Kumar P
  • 4,127
  • 2
  • 16
  • 33
0

hi i have used EMOJI in android and i stored it to orm database using EMOJI_INDEX.I saved in db in normal message in string form but when i get that time i check if there is any emoji then convert it into there processemoji.

textMessage.setText(getItem(pos).file != null ? "":EmojiUtil.getInstance(context).processEmoji(getItem(pos).message, textMessage.getTextSize()));

Take a look from here how i changed Emoji_Index to process

if (emojiImages == null || emojiImages.isRecycled()) {
        InputStream localInputStream;
        try {
            localInputStream = context.getAssets().open("emoji/emoji_2x.png");
            Options opts = new Options();
            opts.inPurgeable = true;
            opts.inInputShareable = true;
            emojiImages = BitmapFactory.decodeStream(localInputStream, null, opts);
        } catch (IOException e) {
            return Html.fromHtml(paramString);
        }
    }

For more information take a look from here. Thanks hope this will help you.

Saveen
  • 4,120
  • 14
  • 38
  • 41
0

I had a problem with the server version, on linux. I had to change the file database_interface.lib.php manually and around this

if (!PMA_DRIZZLE) { if (! empty($GLOBALS['collation_connection'])) {

change it so that, is becomes this: ( note the utf8mb4_unicode_ci references )

    // Skip charsets for Drizzle
if (!PMA_DRIZZLE) {
    if (! empty($GLOBALS['collation_connection'])) {
        PMA_DBI_query("SET CHARACTER SET 'utf8mb4';", $link, PMA_DBI_QUERY_STORE);
        $set_collation_con_query = "SET collation_connection = '"
            . PMA_Util::sqlAddSlashes($GLOBALS['collation_connection']) . "';";
        PMA_DBI_query(
            $set_collation_con_query,
            $link,
            PMA_DBI_QUERY_STORE
        );
    } else {
        PMA_DBI_query(
            "SET NAMES 'utf8mb4' COLLATE 'utf8mb4_unicode_ci';",
            $link,
            PMA_DBI_QUERY_STORE
        );
    }
}
Miguel
  • 3,349
  • 2
  • 32
  • 28
-2

Updated answer

You can try charset utf8 collation utf8_unicode_ci instead of utf8mb4_unicode_ci.

run this query

ALTER TABLE table_name CHANGE `column_name` `column_name` TEXT CHARSET utf8 COLLATE utf8_unicode_ci;

old answer You should use collation utf8mb4_bin instead of utf8mb4_unicode_ci.

run this query

 ALTER TABLE table_name CHANGE `column_name` `column_name` TEXT CHARSET utf8mb4 COLLATE utf8mb4_bin;

Emojis will be stored as code and converted into emojis again in Android and iOS apps. I have used this code in my projects as well.

Gorakh Yadav
  • 304
  • 5
  • 19
  • Thanks for reply. I have tried to put collation utf8mb4_bin for the message field but unfortunately they are still saved as ???? – FrancescoMussi Sep 09 '16 at 09:20
  • Collation is used for character **comparison** in character **set**. Your answer is wrong, and I'm downvoting it because, well, it's wrong. Collation has literally as much to do with this problem as the glass of water I spilled 2 minutes ago. – N.B. Sep 09 '16 at 18:14