So I have this anonymous function that converts quotes and non-ASCII characters in my string into numeric entities.

var myStr = myStr.replace(/[\u0022\u0027\u0080-\uFFFF]/g, function(a) {
   return '&#' + a.charCodeAt(0) + ';';
});  

I need to do the same with PHP.
I'll have a normal string and need to transform it into its equivalent entity code.
e.g:

Have --> Want: Képzeld el PDF -------> K&#233;pzeld el PDF

I was reading about preg_replace_callback

Perform a regular expression search and replace using a callback

But I don't know how to apply the same thing in PHP.
I could also use the anonymous function within preg_replace_callback, like so:

 $line = preg_replace_callback(
        '/[\u0022\u0027\u0080-\FFFF]/g',
        function ($matches) {
            return '&#' + a.charCodeAt(0) + ';';
        },
    );

I couldn't make it work or find an equivalent of charCodeAt. Even the regex character range is not supported by the preg_replace function.

PlayHardGoPro
  • https://stackoverflow.com/questions/10333098/utf-8-safe-equivelant-of-ord-or-charcodeat-in-php – Felippe Duarte May 23 '18 at 17:59
  • https://stackoverflow.com/questions/2934563/how-to-decode-unicode-escape-sequences-like-u00ed-to-proper-utf-8-encoded-cha – Sammitch May 23 '18 at 18:00
  • There's also [`mb_ord()`](http://php.net/manual/en/function.mb-ord.php) and [`mb_chr()`](http://php.net/manual/en/function.mb-chr.php) and their associated [polyfill](https://packagist.org/packages/symfony/polyfill-mbstring) for PHP<7.2. – Sammitch May 23 '18 at 18:10
  • `return '' . a.charCodeAt($matches[0]) + ';';` –  May 23 '18 at 18:23
  • Possible duplicate of [How to decode Unicode escape sequences like "\u00ed" to proper UTF-8 encoded characters?](https://stackoverflow.com/questions/2934563/how-to-decode-unicode-escape-sequences-like-u00ed-to-proper-utf-8-encoded-cha) – wp78de May 23 '18 at 18:25
  • php does NOT have `charCodeAt`. Also, the question is not about this function equivalent only. – PlayHardGoPro May 23 '18 at 19:49
  • @Sammitch Thanks for the suggested topic, but it does not apply here. The output there is the characters themselves; I need the entity code. – PlayHardGoPro May 23 '18 at 19:52

1 Answer

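You could use IntlChar::ord() (from the intl extension) to find the code point of a character. PCRE writes code points as \x{...} rather than \u..., and the pattern needs the u modifier so it is treated as UTF-8; there is no g flag because preg_replace_callback replaces every match by default. Below is a PHP port of your function: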

$myStr = preg_replace_callback('~[\x{0022}\x{0027}\x{0080}-\x{ffff}]~u', function ($c) {
    return '&#' . IntlChar::ord($c[0]) . ';';
}, $myStr);
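
If the intl extension is not available, the same idea should also work with mb_ord(), which a comment on the question mentions (mbstring, PHP 7.2+). The following is only a sketch under those assumptions: to_numeric_entities() is a hypothetical helper name, not a built-in, and falling back to the unchanged input when the regex fails (for example on invalid UTF-8) is my own choice.

```php
<?php
// Hypothetical helper: replaces ", ' and U+0080..U+FFFF with decimal
// numeric entities. Assumes the input and this source file are UTF-8.
function to_numeric_entities(string $s): string
{
    $out = preg_replace_callback(
        '~[\x{0022}\x{0027}\x{0080}-\x{ffff}]~u',
        function (array $m): string {
            // $m[0] is the single matched character; mb_ord() returns its code point.
            return '&#' . mb_ord($m[0], 'UTF-8') . ';';
        },
        $s
    );

    // preg_replace_callback() returns null on error, e.g. invalid UTF-8 input.
    // Assumption: returning the input untouched is acceptable in that case.
    return $out ?? $s;
}

echo to_numeric_entities('Képzeld el PDF'), "\n"; // K&#233;pzeld el PDF
```

Like the original JavaScript range, this stops at U+FFFF, so characters outside the Basic Multilingual Plane are left as they are; widening the upper bound to \x{10ffff} would cover them too.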

revo