
Say I wanted to print a ÿ (Latin small letter y with diaeresis) from its Unicode code point, U+00FF, or from its UTF-8 byte sequence in hex, c3 bf. How can I do that in PHP?

The reason is that I need to create certain UTF-8 characters for testing my regex and string functions. However, since I have fewer than 200 keys on my keyboard I can't type them, and since I am often stuck in an ASCII-only world, I need to be able to create them based solely on their ASCII-safe character codes.

Note: In order for it to show correctly in a browser, I know that the first step is

header('Content-Type: text/html; charset=utf-8');
Pillsy
Xeoncross
  • For the record, the related question with the accepted answer, actually also answers (better) this one. http://stackoverflow.com/questions/2748956/how-would-you-create-a-string-of-all-utf-8-characters-php – leonbloy May 02 '10 at 11:37
  • Do you mean, the unicode codepoint of U+00FF which is represented by the UTF-8 byte sequence of `c3 bf`? Sorry, but I was a bit confused. – jgivoni Feb 23 '12 at 22:47

2 Answers


Well, you already have everything you need.
Hex escapes are recognized in double-quoted strings as well:

echo "\xc3\xbf";
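
For the code point form (U+00FF) rather than the raw bytes, a minimal sketch, assuming PHP 7.0 or later for the `\u{...}` escape and PHP 7.2+ with the mbstring extension for `mb_chr()`. The variable names are only illustrative:

```php
<?php
// PHP 7.0+: Unicode code point escape inside a double-quoted string.
$fromEscape = "\u{00FF}";

// The same character written as raw UTF-8 bytes.
$fromBytes = "\xc3\xbf";

var_dump($fromEscape === $fromBytes); // bool(true)

// PHP 7.2+ with mbstring: build the character from an integer code point.
// Guarded, because mbstring is an optional extension.
if (function_exists('mb_chr')) {
    $fromInt = mb_chr(0x00FF, 'UTF-8');
    var_dump($fromInt === $fromBytes); // bool(true)
}
```

On older PHP versions neither of these is available, so the `\x` byte escapes above remain the portable option.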
Your Common Sense
  • There's half the problem solved. I was not aware of the "\x..." trick. But what about the `U+00FF` number - how can you represent that in PHP - *or can you?* – Xeoncross May 01 '10 at 04:17
  • I wonder if you can compose the hex value from a decimal value like `print "\x". 191;`... – Xeoncross May 01 '10 at 04:20
  • @Xeon base conversion is a very simple task that any beginner programmer can do by hand. There is also a function for it in PHP, I believe, as in any other language. Recoding U+00FF is also possible, and you already have the function. Or this one http://stackoverflow.com/questions/1140660/how-to-get-uxxxx-to-display-correctly-using-php5 Anyway, asking only half of your problem isn't good practice. – Your Common Sense May 01 '10 at 04:35
  • I didn't ask half - I asked for both parts. However, I'm not sure there is a valid answer for the first type of Unicode number conversion, in which case your answer is 100% correct. – Xeoncross May 01 '10 at 04:38
  • @Xeon, yeah, my bad, you asked both. sorry – Your Common Sense May 01 '10 at 04:48
  • How about just writing the codepoint as an html entity and use this: `html_entity_decode('&#x00FF;', ENT_IGNORE, 'utf-8');`? – jgivoni Feb 23 '12 at 23:13
  • Note that the UTF-8 sequence is very different from the Unicode codepoint and it's worth knowing the difference. – Evert Mar 04 '17 at 17:44
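
On the comment above about composing the byte from a decimal value: `"\x" . 191` does not work, because escape sequences are resolved when the string literal is parsed, not at run time. A short sketch of alternatives that do work, using only core PHP functions (the variable names are illustrative, and `hex2bin()` assumes PHP 5.4+):

```php
<?php
// chr() turns a byte value (0-255) into a one-byte string.
$fromDecimal = chr(195) . chr(191);      // 195 = 0xC3, 191 = 0xBF

// hex2bin() converts a hex string into the bytes it describes.
$fromHex = hex2bin('c3bf');

// pack() with 'C*' takes any number of unsigned byte values.
$fromPack = pack('C*', 0xC3, 0xBF);

// A numeric HTML entity names the code point itself, not the bytes.
$fromEntity = html_entity_decode('&#xFF;', ENT_QUOTES, 'UTF-8');

var_dump($fromDecimal === "\xc3\xbf");   // bool(true)
var_dump($fromHex === $fromPack);        // bool(true)
var_dump($fromEntity === $fromPack);     // bool(true)
```

The first three build the UTF-8 bytes directly; only the entity route starts from the code point U+00FF.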

Solution 1 with a small pack function

<?php

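// Encode code point $n as a UTF-8 string of 1 to 4 bytes; returns '' above U+10FFFF.
function chr_utf8($n, $f = 'C*') {
    if ($n < 0x80)     return chr($n);
    if ($n < 0x800)    return pack($f, 0xC0 | ($n >> 6), 0x80 | ($n & 0x3F));
    if ($n < 0x10000)  return pack($f, 0xE0 | ($n >> 12), 0x80 | (($n >> 6) & 0x3F), 0x80 | ($n & 0x3F));
    if ($n < 0x110000) return pack($f, 0xF0 | ($n >> 18), 0x80 | (($n >> 12) & 0x3F), 0x80 | (($n >> 6) & 0x3F), 0x80 | ($n & 0x3F));
    return '';
}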

echo chr_utf8(9405).chr_utf8(9402).chr_utf8(9409).chr_utf8(9409).chr_utf8(9412);

//Output ⒽⒺⓁⓁⓄ

Check it in https://eval.in/748062
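
To tie this back to the ÿ in the question, here is the two-byte case worked by hand. It is only a sketch of the bit arithmetic for U+00FF, using the standard UTF-8 pattern 110xxxxx 10xxxxxx; the variable names are illustrative:

```php
<?php
$codePoint = 0x00FF;                    // binary 000 1111 1111 (11 bits)

// Lead byte: marker 110 followed by the top 5 bits.
$lead = 0xC0 | ($codePoint >> 6);       // 0xC0 | 0x03 = 0xC3

// Continuation byte: marker 10 followed by the low 6 bits.
$cont = 0x80 | ($codePoint & 0x3F);     // 0x80 | 0x3F = 0xBF

$bytes = chr($lead) . chr($cont);

echo bin2hex($bytes), "\n";             // c3bf
echo $bytes, "\n";                      // ÿ
```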

Solution 2 with json_decode

<?php

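// Build one JSON array holding every code point below U+D800 as a \uXXXX escape, then decode it once.
$utf8_char = '["';
for ($number = 0; $number < 55296; $number++) {   // 55296 = 0xD800, where the surrogate range starts
    $utf8_char .= '\u' . substr('000' . strtoupper(dechex($number)), -4) . '","';
}
$utf8_char = json_decode(substr($utf8_char, 0, -2) . ']');   // drop the trailing ," and close the array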

echo $utf8_char[9405].$utf8_char[9402].$utf8_char[9409].$utf8_char[9409].$utf8_char[9412];

//Output ⒽⒺⓁⓁⓄ
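
If only a few characters are needed, the same `json_decode` idea can be applied one code point at a time instead of building the whole table. This is a sketch, not part of the answer above: `chr_json` is a hypothetical helper name, and it assumes the code point lies in the Basic Multilingual Plane outside the surrogate range.

```php
<?php
// Hypothetical helper: decode a single BMP code point via a JSON \uXXXX escape.
function chr_json($n) {
    // A lone \uXXXX cannot express code points above U+FFFF or surrogates.
    if ($n < 0 || $n > 0xFFFF || ($n >= 0xD800 && $n <= 0xDFFF)) {
        return null;
    }
    // '\\u' in a single-quoted string is a literal backslash followed by u.
    return json_decode(sprintf('"\\u%04x"', $n));
}

echo bin2hex(chr_json(0x00FF)), "\n";   // c3bf
echo chr_json(9405), "\n";              // Ⓗ
```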
Php'Regex