I have the byte position of a character in a UTF-8 string (obtained via preg_match with PREG_OFFSET_CAPTURE), but I need the character position. How can I get it?
Here is an example:
$x = 'öüä nice world';
preg_match('/nice/u', $x, $m, PREG_OFFSET_CAPTURE);
var_dump($m);
which results in:
array(1) {
  [0]=>
  array(2) {
    [0]=>
    string(4) "nice"
    [1]=>
    int(7)
  }
}
So I have the byte position, which is 7.
But I need the character position, which is 4. Is there a way to convert the byte position to the character position?
This example is highly simplified. It's not an option for me to just use mb_strpos or something similar to find the position of the word "nice". I need the regular expression, and I actually need preg_match_all instead of preg_match. So I think converting the position would be the best approach for me.
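One possible way to do the conversion, as a minimal sketch: take the bytes that precede the match with substr and count the characters in that prefix with mb_strlen. This assumes the mbstring extension is available, the subject is valid UTF-8, and PHP 7.1 or later (for the array destructuring in foreach). The helper name byteOffsetToCharOffset and the pattern /nice|world/u are made up for illustration.

```php
<?php
// Hypothetical helper (not part of PHP): converts a byte offset, as returned
// by PREG_OFFSET_CAPTURE, into a character offset.
// Assumes $subject is valid UTF-8 and ext-mbstring is loaded.
function byteOffsetToCharOffset(string $subject, int $byteOffset): int
{
    // substr() works on bytes, so this is the raw prefix before the match;
    // mb_strlen() then counts how many UTF-8 characters that prefix holds.
    return mb_strlen(substr($subject, 0, $byteOffset), 'UTF-8');
}

$x = 'öüä nice world';

// With the default PREG_PATTERN_ORDER, $matches[0] is a list of
// [matched text, byte offset] pairs for the full pattern.
preg_match_all('/nice|world/u', $x, $matches, PREG_OFFSET_CAPTURE);

$charOffsets = [];
foreach ($matches[0] as [$text, $byteOffset]) {
    $charOffsets[$text] = byteOffsetToCharOffset($x, $byteOffset);
}

var_dump($charOffsets);
```

For the string above, "nice" has byte offset 7 and character offset 4. Note that each conversion rescans the prefix, so for very long subjects with many matches it may be worth converting incrementally from the previous match instead.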