The marked duplicate answer is also wrong for the same reason that @bubblebobble's comment is wrong. You cannot simply reverse the order of individual code points and expect a sane string to come out the other side, because combining marks end up detached from the base characters they belong to.
The intl extension provides a sane way around this via IntlBreakIterator::createCharacterInstance(), which iterates over character boundaries and so treats each coherent sequence of code points as a single unit:
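function utf8_strrev($input) {
    $it = IntlBreakIterator::createCharacterInstance('he_IL.utf8');
    $it->setText($input);
    $ret = '';
    $prev = 0;
    // Each $pos is a boundary offset in bytes; the first one is 0.
    foreach ($it as $pos) {
        // Prepend the current character so the result is built in reverse.
        $ret = substr($input, $prev, $pos - $prev) . $ret;
        $prev = $pos;
    }
    return $ret;
}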
function naieve_utf8_strrev($input) {
    return implode("", array_reverse(preg_split('//u', $input)));
}

$tests = [
    "test",
    "סבטלנה ואסילבנה",
    "nai\xcc\x88ve fail"
];

foreach ($tests as $test) {
    var_dump(
        $test,
        naieve_utf8_strrev($test),
        utf8_strrev($test)
    );
    echo PHP_EOL;
}
Output:
string(4) "test"
string(4) "tset"
string(4) "tset"
string(29) "סבטלנה ואסילבנה"
string(29) "הנבליסאו הנלטבס"
string(29) "הנבליסאו הנלטבס"
string(12) "naïve fail"
string(12) "liaf ev̈ian"
string(12) "liaf evïan"
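If the intl extension is not available, a similar result can probably be had from PCRE alone, since `\X` matches one extended grapheme cluster in UTF-8 mode. The following is only a minimal sketch under that assumption; the helper name `pcre_grapheme_strrev` is hypothetical and not part of any library, and the approach depends on the PCRE build having Unicode support.

```php
<?php
// Hypothetical helper (an assumption for illustration, not a library function):
// reverse a UTF-8 string by grapheme cluster using PCRE's \X instead of intl.
function pcre_grapheme_strrev(string $input): string {
    // \X matches one extended grapheme cluster; the /u flag enables UTF-8 mode.
    // preg_match_all() returns false on failure, e.g. for malformed UTF-8.
    if (preg_match_all('/\X/u', $input, $matches) === false) {
        throw new InvalidArgumentException('preg_match_all failed (invalid UTF-8?)');
    }
    // $matches[0] holds the clusters in order; reverse and glue them back together.
    return implode('', array_reverse($matches[0]));
}

// The combining diaeresis (U+0308) stays attached to the "i".
var_dump(pcre_grapheme_strrev("nai\xcc\x88ve fail") === "liaf evi\xcc\x88an");
```

The design is the same as with the break iterator: split on cluster boundaries, then reverse the clusters rather than the code points.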
I still think that reversing a Hebrew string like this is the wrong way to go if all you want is a left-to-right display of Hebrew text. You should use the Unicode LRO/RLO and PDF control characters to switch the direction instead.
Edit: Finally tracked down the correct codepoints.
function utf8_force_ltr($input) {
    $LRO = "\xe2\x80\xad"; // U+202D left-to-right override
    $PDF = "\xe2\x80\xac"; // U+202C pop directional formatting
    return $LRO . $input . $PDF;
}
// Use the Hebrew test string; after the loop above, $test holds the last element.
var_dump($tests[1], utf8_force_ltr($tests[1]));
Output:
string(29) "סבטלנה ואסילבנה"
string(35) "סבטלנה ואסילבנה"
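The override characters are invisible, so the two dumps differ only in their reported length. A small sketch to check the bytes involved, assuming PHP 7 or later for the `\u{...}` escape syntax (the variable names here are just for illustration):

```php
<?php
// U+202D LEFT-TO-RIGHT OVERRIDE and U+202C POP DIRECTIONAL FORMATTING,
// written as code point escapes (PHP 7+) rather than raw UTF-8 bytes.
$LRO = "\u{202D}";
$PDF = "\u{202C}";

$text = "סבטלנה ואסילבנה";
$wrapped = $LRO . $text . $PDF;

var_dump(bin2hex($LRO));                    // string(6) "e280ad"
var_dump(bin2hex($PDF));                    // string(6) "e280ac"
var_dump(strlen($wrapped) - strlen($text)); // int(6)
```

Each control character encodes to three bytes in UTF-8, which accounts for the jump from 29 to 35 bytes.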