Approaches that rely on the standard built-in string functions, or on bracket access by character position, will run into trouble if your string contains multibyte characters. There is a simple one-line way to do this swap for multibyte (or any) strings, without multiple function calls or temporary variables: use preg_replace with the u (Unicode) flag:
function swap_first_last(string $str): string
{
    return preg_replace('~^(.)(.*)(.)$~u', '\3\2\1', $str);
}
echo swap_first_last('foobar'); // roobaf (foobar > roobaf)
echo swap_first_last('फोबर'); // रोबफ (phobara > robapha)
In the regular expression, ^ stands for start of subject, then (.) captures the first character, (.*) captures 0 or more characters in between, and (.) captures the final character, with $ for end of subject. Because of the u flag, each . matches one whole UTF-8 character (code point) rather than a single byte. Finally, with '\3\2\1', we simply swap the first and last capture groups.
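Two edge cases are worth keeping in mind with this pattern: by default . does not match a newline, and preg_replace() returns null when the subject is not valid UTF-8 under the u flag, which would conflict with the string return type. Below is a minimal sketch of a more defensive variant. The function name swap_first_last_any and the "return the input unchanged" fallback are my own assumptions for illustration, not part of the original function.

```php
<?php
// Hypothetical variant; the name and the fallback behaviour are illustrative assumptions.
function swap_first_last_any(string $str): string
{
    // \A and \z anchor to the absolute start and end of the subject,
    // s lets . match newlines, and u enables UTF-8 mode.
    $result = preg_replace('~\A(.)(.*)(.)\z~us', '$3$2$1', $str);

    // preg_replace() returns null on error (for example malformed UTF-8
    // with the u flag); fall back to the unchanged input in that case.
    return $result ?? $str;
}

echo swap_first_last_any("foo\nbar"), PHP_EOL; // "roo", then "baf" on the next line
echo swap_first_last_any('a'), PHP_EOL;        // a (fewer than two characters: no match, unchanged)
```

Strings with fewer than two characters simply do not match, so they come back as they went in.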
On using bracket string access $str[0] and $str[-1], from the manual: "Internally, PHP strings are byte arrays. As a result, accessing or modifying a string using array brackets is not multi-byte safe, and should only be done with strings that are in a single-byte encoding such as ISO-8859-1."
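To make the byte-array point concrete, here is a small demonstration using the same sample string as above. It assumes the source file is saved as UTF-8 and that the mbstring extension is loaded; the variable names are just for illustration.

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is available.
$str = 'फोबर';

echo strlen($str), PHP_EOL;             // 12: strlen() counts bytes
echo mb_strlen($str, 'UTF-8'), PHP_EOL; // 4: mb_strlen() counts code points
echo bin2hex($str[0]), PHP_EOL;         // e0: only the first byte of a three-byte character
echo bin2hex(mb_substr($str, 0, 1, 'UTF-8')), PHP_EOL; // e0a4ab: the complete first character

// Swapping single bytes leaves a string that is no longer valid UTF-8.
$broken = $str[-1] . substr($str, 1, -1) . $str[0];
var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)
```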
On using built-in standard string functions, from the manual: "If you apply a non-multibyte-aware string function to the string, it probably fails to detect the beginning or ending of the multibyte character and ends up with a corrupted garbage string that most likely loses its original meaning."
Therefore, for a potential multibyte string, you should use multibyte string functions such as mb_substr(), mb_strlen() and mb_str_split(), depending on your approach. I personally find the preg_replace() solution much simpler and more readable.
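For comparison, the multibyte-function route might look roughly like the sketch below. The function name swap_first_last_mb is hypothetical, and the code assumes UTF-8 input and the mbstring extension.

```php
<?php
// Hypothetical mb_* counterpart, shown only for comparison; assumes UTF-8 input.
function swap_first_last_mb(string $str): string
{
    // Nothing to swap in an empty or single-character string.
    if (mb_strlen($str, 'UTF-8') < 2) {
        return $str;
    }

    $first  = mb_substr($str, 0, 1, 'UTF-8');  // first character
    $last   = mb_substr($str, -1, 1, 'UTF-8'); // last character
    $middle = mb_substr($str, 1, -1, 'UTF-8'); // everything in between

    return $last . $middle . $first;
}

echo swap_first_last_mb('foobar'), PHP_EOL; // roobaf
echo swap_first_last_mb('फोबर'), PHP_EOL;   // रोबफ
```

It gives the same results as the regular expression for these inputs, at the cost of four function calls and three temporary variables.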