9

We have a MySQL InnoDB table, with a text field COLLATE utf8mb4_unicode_ci. I need to search for rows that contain any emoji characters. I've searched through quite a few SO questions, but people seem to have a list of emojis they are searching for. I'm actually looking for a solution that will find ANY emoji. Here are some posts that are not helping.

This one seems to come closest to actually providing me with what I'm looking for, but the OP hasn't actually posted his search code.

Thanks!

Stefan
  • 700
  • 10
  • 22

4 Answers4

13

I've had situation where db migration from one server to another caused emoji to disappear. So I had to find all rows in original table which contained high utf8 (emoji) characters.

This query worked as expected:

SELECT field FROM `table` WHERE HEX(field) RLIKE "^(..)*F.";

before doing anything check if you are using utf8mb4 on your db, tables AND connection:

SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';

+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8mb4            |
| character_set_connection | utf8mb4            |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    | utf8mb4            |
| character_set_server     | utf8mb4            |
| character_set_system     | utf8               |
| collation_connection     | utf8mb4_unicode_ci |
| collation_database       | utf8mb4_unicode_ci |
| collation_server         | utf8mb4_unicode_ci |
+--------------------------+--------------------+
seven
  • 2,388
  • 26
  • 28
  • Good, but not perfect. For instance, it doesn't catch ✌️ which appears to HEX's as `E29C 8CEF B88F` (spaces added for clarity) – Codemonkey Jul 18 '22 at 08:58
0

Could this work maybe ?

Use the lib_mysqludf_preg library from the mysql UDF repository for PCRE regular expressions directly in mysql

[\x{23}\x{2A}\x{30}-\x{39}\x{A9}\x{AE}\x{203C}\x{2049}\x{2122}\x{2139}\x{2194}-\x{2199}\x{21A9}-\x{21AA}\x{231A}-\x{231B}\x{2328}\x{23CF}\x{23E9}-\x{23F3}\x{23F8}-\x{23FA}\x{24C2}\x{25AA}-\x{25AB}\x{25B6}\x{25C0}\x{25FB}-\x{25FE}\x{2600}-\x{2604}\x{260E}\x{2611}\x{2614}-\x{2615}\x{2618}\x{261D}\x{2620}\x{2622}-\x{2623}\x{2626}\x{262A}\x{262E}-\x{262F}\x{2638}-\x{263A}\x{2640}\x{2642}\x{2648}-\x{2653}\x{2660}\x{2663}\x{2665}-\x{2666}\x{2668}\x{267B}\x{267F}\x{2692}-\x{2697}\x{2699}\x{269B}-\x{269C}\x{26A0}-\x{26A1}\x{26AA}-\x{26AB}\x{26B0}-\x{26B1}\x{26BD}-\x{26BE}\x{26C4}-\x{26C5}\x{26C8}\x{26CE}-\x{26CF}\x{26D1}\x{26D3}-\x{26D4}\x{26E9}-\x{26EA}\x{26F0}-\x{26F5}\x{26F7}-\x{26FA}\x{26FD}\x{2702}\x{2705}\x{2708}-\x{270D}\x{270F}\x{2712}\x{2714}\x{2716}\x{271D}\x{2721}\x{2728}\x{2733}-\x{2734}\x{2744}\x{2747}\x{274C}\x{274E}\x{2753}-\x{2755}\x{2757}\x{2763}-\x{2764}\x{2795}-\x{2797}\x{27A1}\x{27B0}\x{27BF}\x{2934}-\x{2935}\x{2B05}-\x{2B07}\x{2B1B}-\x{2B1C}\x{2B50}\x{2B55}\x{3030}\x{303D}\x{3297}\x{3299}\x{1F004}\x{1F0CF}\x{1F170}-\x{1F171}\x{1F17E}-\x{1F17F}\x{1F18E}\x{1F191}-\x{1F19A}\x{1F1E6}-\x{1F1FF}\x{1F201}-\x{1F202}\x{1F21A}\x{1F22F}\x{1F232}-\x{1F23A}\x{1F250}-\x{1F251}\x{1F300}-\x{1F321}\x{1F324}-\x{1F393}\x{1F396}-\x{1F397}\x{1F399}-\x{1F39B}\x{1F39E}-\x{1F3F0}\x{1F3F3}-\x{1F3F5}\x{1F3F7}-\x{1F4FD}\x{1F4FF}-\x{1F53D}\x{1F549}-\x{1F54E}\x{1F550}-\x{1F567}\x{1F56F}-\x{1F570}\x{1F573}-\x{1F57A}\x{1F587}\x{1F58A}-\x{1F58D}\x{1F590}\x{1F595}-\x{1F596}\x{1F5A4}-\x{1F5A5}\x{1F5A8}\x{1F5B1}-\x{1F5B2}\x{1F5BC}\x{1F5C2}-\x{1F5C4}\x{1F5D1}-\x{1F5D3}\x{1F5DC}-\x{1F5DE}\x{1F5E1}\x{1F5E3}\x{1F5E8}\x{1F5EF}\x{1F5F3}\x{1F5FA}-\x{1F64F}\x{1F680}-\x{1F6C5}\x{1F6CB}-\x{1F6D2}\x{1F6E0}-\x{1F6E5}\x{1F6E9}\x{1F6EB}-\x{1F6EC}\x{1F6F0}\x{1F6F3}-\x{1F6F6}\x{1F910}-\x{1F91E}\x{1F920}-\x{1F927}\x{1F930}\x{1F933}-\x{1F93A}\x{1F93C}-\x{1F93E}\x{1F940}-\x{1F945}\x{1F947}-\x{1F94B}\x{1F950}-\x{1F95E}\x{1F980}-\x{1F991}\x{1F9C0}]

0

I found specific regular expression for it

WHERE your_column REGEXP '[\\x{1F000}-\\x{1FFFF}]';
White Shot
  • 115
  • 4
-1

In my opinion the easy way is to create a table with all emoji codes and then make a join through like condition to your table.

I share here how to insert emotis on mysql:

create table emojis (
  e varchar(100) COLLATE utf8mb4_unicode_ci
 );

 insert into emojis values
 ( _utf8mb4 0xF09F9881 COLLATE utf8mb4_unicode_ci),
 ( _utf8mb4 '' );

enter image description here

enter image description here

The final query should look like:

select distinct yt.id
from your_table yt
inner join emojis e 
  on yt.some_column like '%' + e.e + '%'
dani herrera
  • 48,760
  • 8
  • 117
  • 177