
I have this function to convert emoji to Unicode code point notation, but it also converts the regular text to hex.

How can I convert only the emoji and keep the rest as a plain text string?

function emoji_to_unicode($emoji) {
   $emoji = mb_convert_encoding($emoji, 'UTF-32', 'UTF-8');
   $unicode = strtoupper(preg_replace("/^[0]{3}/","U+",bin2hex($emoji)));
   return $unicode;
}

$var = "😀x😀text here";
$out = '';
for ($i = 0; $i < mb_strlen($var); $i++) {
    $out .= emoji_to_unicode(mb_substr($var, $i, 1));
}
echo "$out\n";

So

$var = "😀x😀text here";

returns:

U+1F600U+00078U+1F600U+00074U+00065U+00078U+00074U+00020U+00068U+00065U+00072U+00065

But I need it to return this:

U+1F600xU+1F600text here

I need to keep the text as plain text and convert only the emoji to Unicode notation.

Otávio Barreto

1 Answer


The Intl extension provides the IntlChar class, whose methods work with Unicode code points and blocks. You can use it to check whether the current character belongs to the Emoticons block and convert it only in that case.

function emoji_to_unicode($emoji) {
   $emoji = mb_convert_encoding($emoji, 'UTF-32', 'UTF-8');
   $unicode = strtoupper(preg_replace("/^[0]{3}/","U+",bin2hex($emoji)));
   return $unicode;
}

$var = "😀x😀text here";
$out = '';
for ($i = 0; $i < mb_strlen($var); $i++) {
    $char = mb_substr($var, $i, 1);
    $isEmoji = IntlChar::getBlockCode(IntlChar::ord($char)) == IntlChar::BLOCK_CODE_EMOTICONS;
    $out .= $isEmoji ? emoji_to_unicode($char) : $char;
}

echo $out;

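The IntlChar predefined constants in the PHP manual list all available BLOCK_CODE_* values. Note that the Emoticons block only covers U+1F600 to U+1F64F, so emoji that live in other blocks need their own checks.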
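
If you need more than the Emoticons block, the same check can be run against a list of block codes. This is only a rough sketch: the helper names is_emoji_block() and emoji_text_to_unicode() are made up for illustration, the choice of three blocks is an assumption and not a complete definition of "emoji", and it assumes PHP 7.4+ with the intl and mbstring extensions loaded.

```php
<?php
// Hypothetical helper: true if the character's Unicode block is in our list.
// The block list is an illustrative assumption, not a full emoji definition.
function is_emoji_block(string $char): bool {
    $emojiBlocks = [
        IntlChar::BLOCK_CODE_EMOTICONS,                              // U+1F600..U+1F64F
        IntlChar::BLOCK_CODE_MISCELLANEOUS_SYMBOLS_AND_PICTOGRAPHS,  // U+1F300..U+1F5FF
        IntlChar::BLOCK_CODE_TRANSPORT_AND_MAP_SYMBOLS,              // U+1F680..U+1F6FF
    ];
    $block = IntlChar::getBlockCode(IntlChar::ord($char));
    return in_array($block, $emojiBlocks, true);
}

// Hypothetical helper: replace matching characters with U+XXXX, keep the rest.
function emoji_text_to_unicode(string $text): string {
    $out = '';
    foreach (mb_str_split($text) as $char) {
        $out .= is_emoji_block($char)
            ? sprintf('U+%X', IntlChar::ord($char)) // uppercase hex code point
            : $char;
    }
    return $out;
}

echo emoji_text_to_unicode("\u{1F600}x\u{1F680}text here"), "\n";
```

Here sprintf('U+%X', ...) stands in for the emoji_to_unicode() function above; it formats the code point returned by IntlChar::ord() directly instead of going through UTF-32 and bin2hex().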

msg