I’d love to stand corrected, but I assume you’ll be out of luck here.
Emojis are a wild beast, character-wise. There are several groups, all of which you’d need to cover:
- single-character emojis, e.g. Wrapped Gift, U+1F381. Those are emojis that are defined as having the Emoji Unicode property.
- single-character emojis that need the U+FE0F variation selector added, e.g. Red Heart. U+2764 alone (the heart) is what Unicode calls an “unqualified emoji”. It needs U+FE0F appended for most platforms to render it as an emoji.
- multi-codepoint emojis. These come in a lot of different shapes. Persons + gender + hair color, glued together with or without U+200D, families, regional flags, ...
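To make the three groups concrete, here is a small Python sketch that prints the code points behind one example of each. The helper name `describe` and the chosen sample emojis (other than Wrapped Gift and Red Heart) are only illustrative assumptions, not part of any official list handling.

```python
def describe(s: str) -> list[str]:
    """Return the code points of s as U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in s]

gift = "\U0001F381"                        # Wrapped Gift: a single code point
heart = "\u2764\uFE0F"                     # Red Heart: U+2764 plus variation selector U+FE0F
scientist = "\U0001F9D1\u200D\U0001F52C"   # person + U+200D (zero width joiner) + microscope
flag = "\U0001F1E9\U0001F1EA"              # a flag: two regional indicator symbols

# One visible emoji each, but between one and three code points per string.
for emoji in (gift, heart, scientist, flag):
    print(len(emoji), describe(emoji))
```

Any check that looks at a single character at a time will therefore miss the second and third group.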
And to top it all off, there is both an official, ever-growing list of emojis and a set of emojis that only some platforms support (example: Man Zombie: Light Skin Tone). Unicode calls the latter non-RGI, i.e. not Recommended for General Interchange.
You need to decide whether you want to support only Unicode-approved emojis or non-RGI ones, too.
If you want only Unicode emojis, you could model your requirements with a second table that you periodically recreate from the official Unicode data. E.g., take this file:
https://github.com/unicode-org/unicodetools/blob/main/unicodetools/data/emoji/14.0/emoji-test.txt
(Note the “14.0” in the URL that you need to update with new Unicode Emoji versions!)
Take all lines containing the text “fully-qualified”, take everything up to the first semicolon in each of them, convert that from hex code points into a string, and feed your helper table with that.
Example:
curl -sS https://raw.githubusercontent.com/unicode-org/unicodetools/main/unicodetools/data/emoji/14.0/emoji-test.txt | \
sed '/^\(#.*\)\?$/d' | \
sed -n '/fully-qualified/p' | \
sed 's/ *;.*//'
This gives you a long list of code points in hex format (e.g. 1F9D1 200D 1F52C). You can feed them into a script, convert them from hex to strings, and put them in a small helper table:
CREATE TABLE unicode_emojis (
emoji TEXT PRIMARY KEY
);
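The hex-to-string step could look roughly like the following Python sketch. It parses lines in the emoji-test.txt format directly instead of relying on the sed pipeline. The function names `parse_fully_qualified` and `to_insert` and the abridged `SAMPLE` lines are assumptions made for illustration; only the table and column names come from the statement above.

```python
# Abridged sample lines in the emoji-test.txt format (assumed for illustration).
SAMPLE = """\
# group: Smileys & Emotion
1F600                  ; fully-qualified     # grinning face
2764 FE0F              ; fully-qualified     # red heart
2764                   ; unqualified         # red heart
1F9D1 200D 1F52C       ; fully-qualified     # scientist
"""

def parse_fully_qualified(lines):
    """Yield the emoji string for every 'fully-qualified' line."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        code_points, _, rest = line.partition(";")
        status = rest.split("#", 1)[0].strip()
        if status != "fully-qualified":
            continue
        # "1F9D1 200D 1F52C" -> three characters joined into one string
        yield "".join(chr(int(cp, 16)) for cp in code_points.split())

def to_insert(emoji: str) -> str:
    """Build an INSERT statement for the unicode_emojis helper table."""
    escaped = emoji.replace("'", "''")  # defensive SQL quoting
    return f"INSERT INTO unicode_emojis (emoji) VALUES ('{escaped}');"

for emoji in parse_fully_qualified(SAMPLE.splitlines()):
    print(to_insert(emoji))
```

For the real data you would pass the lines of the downloaded file instead of `SAMPLE`, e.g. `open("emoji-test.txt", encoding="utf-8")`. In production code a parameterized query through your database driver would be preferable to building SQL strings by hand.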
Then, in your other queries, make sure that values that go into your original table are in the unicode_emojis table, too.