167

Is there a nice way to iterate on the characters of a string? I'd like to be able to do foreach, array_map, array_walk, array_filter etc. on the characters of a string.

Type casting/juggling didn't get me anywhere (it puts the whole string into a single array element), and the best solution I've found is simply using a for loop to construct the array. It feels like there should be something better. I mean, if you can index into a string, shouldn't you be able to iterate over it as well?

This is the best I've got

function stringToArray($s)
{
    $r = array();
    for($i=0; $i<strlen($s); $i++) 
         $r[$i] = $s[$i];
    return $r;
}

$s1 = "textasstringwoohoo";
$arr = stringToArray($s1); //$arr now has character array

$ascval = array_map('ord', $arr);  // so I can do stuff like this
foreach ($arr as $curChar) {....}
$evenAsciiOnly = array_filter($arr, function($x) {return ord($x) % 2 === 0;});

Is there either:

A) A way to make the string iterable
B) A better way to build the character array from the string (and if so, how about the other direction?)

I feel like I'm missing something obvious here.

jon_darkstar
  • Maybe you should say more about what you're trying to accomplish... it seems like there might be a better way to do it using normal string operations. – Vinay Pai Jan 05 '11 at 05:20
  • 1
    I don't have a real objective here, just a curiosity I was playing with. It seemed weird that even though you can index into strings, you can't iterate over them. I was at a loss to even think up meaningful example uses, but I would still like to know if there is some way to iterate over a string's characters without constructing a character array explicitly – jon_darkstar Jan 05 '11 at 05:31
  • That's a good point though, obviously my examples are pretty shallow, i.e. mostly anything you'd do with `array_filter` in this sense could be better done with string or regex functions – jon_darkstar Jan 05 '11 at 05:32
  • Solving https://projecteuler.net/problem=20 might be an example (though somewhat contrived) use case. – Nick Edwards Feb 06 '16 at 00:44
  • one note, regarding for($i=0; $i – Amin Jul 29 '17 at 11:24
  • String sanitation is a good example of when to use this. If you want to replace all occurrences of '%' with '[%]' you would just use str_replace. But if you want to replace all occurrences of '[' with '[[]' and all occurrences of ']' with '[]]' you would need to iterate through the string and test each character to prevent the replacements from clobbering each other. – danielson317 Jan 15 '20 at 19:12
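
A minimal sketch of the per-character approach described in the last comment. The helper name `escapeBrackets` is hypothetical. Byte-wise iteration with str_split should be enough here, since '[' and ']' are single bytes in both ASCII and UTF-8 and never occur inside a multibyte UTF-8 sequence.

```php
<?php
// Hypothetical helper: turn each '[' into '[[]' and each ']' into '[]]' in one pass,
// so text that was already inserted is never examined again.
function escapeBrackets(string $input): string
{
    $out = '';
    foreach (str_split($input) as $char) {
        if ($char === '[') {
            $out .= '[[]';
        } elseif ($char === ']') {
            $out .= '[]]';
        } else {
            $out .= $char;
        }
    }
    return $out;
}

echo escapeBrackets('a[1]'), PHP_EOL; // a[[]1[]]
```

For this particular case, `strtr($input, ['[' => '[[]', ']' => '[]]'])` also replaces in a single pass without re-scanning its own output, so an explicit loop is one option rather than a requirement.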

9 Answers

239

Use str_split to iterate ASCII strings (since PHP 5.0)

If your string contains only single-byte (ASCII) characters, use str_split, which splits the string byte by byte.

$str = 'some text';
foreach (str_split($str) as $char) {
    var_dump($char);
}

Use mb_str_split to iterate Unicode strings (since PHP 7.4)

If your string might contain multibyte characters (for example, non-ASCII text encoded as UTF-8), use mb_str_split, which splits by character instead of by byte.

$str = 'μυρτιὲς δὲν θὰ βρῶ';
foreach (mb_str_split($str) as $char) {
    var_dump($char);
}
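
To make the difference concrete, here is a small sketch comparing the two functions on the same input. It assumes PHP 7.4+ with the mbstring extension and a UTF-8 string; the variable names are only illustrative.

```php
<?php
// "\u{03B2}" is the Greek letter beta, which takes 2 bytes in UTF-8.
$str = "a\u{03B2}";

$bytes = str_split($str);                // one element per byte
$chars = mb_str_split($str, 1, 'UTF-8'); // one element per character

var_dump(count($bytes)); // int(3)
var_dump(count($chars)); // int(2)

// Either result is a plain array, so array_map, array_filter etc. apply.
$lengths = array_map('strlen', $chars);  // byte length of each character
var_dump($lengths);
```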
emkey08
SeaBrightSystems
  • 4
  • @jon_darkstar I don't know your application, but do take note that each entry in an array has a significant overhead (4bytes IIRC). Skip that, it is 'quite' way more: http://nikic.github.com/2011/12/12/How-big-are-PHP-arrays-really-Hint-BIG.html – Daan Timmer Nov 15 '12 at 08:27
  • 2
    `str_split() will split into bytes, rather than characters when dealing with a multi-byte encoded string.` - So `str_split` cannot work with Unicode – Happy May 23 '20 at 22:47
  • 2
    `mb_str_split` would be the multi-byte equivalent. `$array = mb_str_split($your_string);` – LStarky Nov 10 '20 at 22:30
  • Any reason why the loop isn't simplified to `foreach (str_split($your_string) as $char)`? – emkey08 Jul 01 '22 at 12:21
  • Note that str_split() returns at least one element even for an empty string, which in your context means at least one loop iteration in that case. This can be a source of tricky bugs. – Demis Palma ツ Jul 29 '22 at 09:00
  • @DemisPalmaツ True for **PHP before 8.2**. Since **PHP 8.2** this bug is fixed. See [PHP 8.2 upgrade notes](https://github.com/php/php-src/blob/php-8.2.0RC1/UPGRADING#L51). – emkey08 Sep 04 '22 at 09:30
126

Iterate string:

for ($i = 0; $i < strlen($str); $i++){
    echo $str[$i];
}
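
A small variation on the loop above that reads the length once before looping instead of calling strlen() on every iteration. This is only a sketch; like the original loop, it walks bytes, so it suits single-byte strings.

```php
<?php
$str = 'some text';
$collected = '';

// $len is computed once in the loop initializer.
for ($i = 0, $len = strlen($str); $i < $len; $i++) {
    $collected .= $str[$i]; // $str[$i] is one byte, not necessarily one character
}

echo $collected, PHP_EOL; // some text
```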
Owen
  • 10
    This seems like a better answer because it answers the question - i.e. how to iterate over a string as opposed to 'convert to array'. – Robin Andrews Jan 11 '17 at 08:34
  • 3
    LOL!!!!! Everything @OmarTariq. This is much more efficient than the answer provided. –  Oct 15 '18 at 03:10
  • 12
    Just note that you're calling `strlen()` on each iteration. Not a terrible thing, since PHP has the length precalculated, but still a function call. If you have a need for speed, better save that in a variable before starting the loop. – Vilx- Dec 18 '18 at 10:48
  • 5
    This is not good for multibyte strings, because here we're getting a byte offset, not a symbol – alvery May 26 '19 at 09:56
  • 6
    @OmarTariq *"This is the answer. What is wrong with the world?"* .... The wrong with the world is that the world has other languages than English, this function as alvery said will iterate the bytes in the string, not the characters. – Accountant م Sep 01 '19 at 13:08
  • The fun thing is $string[-1] will return you 'g'. I thought it should have returned some index not found error. It's not just weird but a blunder in PHP (IMO) – Madhab452 Mar 30 '20 at 10:25
  • I tried it on a string containing, among others, a UTF-8 character and it did not work: it seems to iterate over the bytes of the string instead of the characters of the string. – Arnaud Jul 26 '22 at 14:57
21

If your strings are in Unicode you should use preg_split with /u modifier

From comments in php documentation:

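if (!function_exists('mb_str_split')) { // built in since PHP 7.4; redeclaring it is a fatal error
    function mb_str_split( $string ) {
        # Split at every position that is not right after the start: ^
        # and not right before the end: $
        return preg_split('/(?<!^)(?!$)/u', $string );
    }
}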
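
An alternative sketch of the same idea, using an empty pattern together with the PREG_SPLIT_NO_EMPTY flag instead of lookarounds. The helper name `utf8Chars` is hypothetical and was chosen so that it does not clash with the built-in mb_str_split; the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper: split a UTF-8 string into an array of characters.
function utf8Chars(string $string): array
{
    // The empty pattern matches between every character; /u makes PCRE treat
    // the subject as UTF-8. PREG_SPLIT_NO_EMPTY drops the empty pieces that
    // would otherwise appear at the start and end.
    $parts = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false on failure; fall back to an empty array.
    return $parts === false ? [] : $parts;
}

var_dump(utf8Chars("d\u{03AD}n")); // three elements: "d", "έ", "n"
```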
Dawid Ohia
  • 3
    For multibyte strings, `mb_split` is more reliable. – Lux Jul 16 '17 at 20:07
  • Citation required @Lux – mickmackusa Oct 16 '21 at 07:53
  • @mickmackusa It's been a couple years (and these days you should probably be using the stdlib `mb_str_split` if you're on PHP≥7.4 anyway), and I can't really recall what I meant there, but my guess would be that preg_split with `/.../u` is UTF-8 only (NOT 'Unicode', as OP says) while `mb_split` allows for arbitrary encoding (additionally, `mb_split` is explicitly designed for regex-splitting over multibyte strings so it might have some extra optimizations and such? and in general since it's purpose-built my default assumption is that it's more reliable and/or complete than a /u PCRE extension) – Lux Oct 18 '21 at 04:41
  • I am not personally aware of any differences between `mb_str_split()` and `preg_split('//u', $string)`. I am just saying that it is important that we not perpetuate potentially false claims based on assumptions. If one technique is provably inferior to another, we should be able to substantiate this truth. – mickmackusa Oct 18 '21 at 04:46
  • Ye! thanks for calling me out on that. Unfortunately it's a bit too late for me to edit the original comment but hopefully the follow up clears up what I meant; info from [here](https://www.php.net/manual/en/reference.pcre.pattern.modifiers.php) and [here](https://www.php.net/manual/en/function.mb-split.php) btw since I hit charlimit on the previous comment. – Lux Oct 18 '21 at 04:50
14

You can also just access $s1 like an array, if you only need to access it:

$s1 = "hello world";
echo $s1[0]; // -> h
Moritur
8

For those who are looking for the fastest way to iterate over strings in PHP, I've prepared a benchmark.
In the first method, you access string characters directly by specifying the position in brackets, treating the string like an array:

$string = "a sample string for testing";
$char = $string[4]; // equals m

I thought this would be the fastest method, but I was wrong.
The second method (which is used in the accepted answer):

$string = "a sample string for testing";
$string = str_split($string);
$char = $string[4]; // equals m

This method is faster because we are indexing a real array rather than treating a string as one.

Calling the last line of each of the above methods 1,000,000 times led to these benchmark results:

Using string[i]
0.24960017204285 Seconds

Using str_split
0.18720006942749 Seconds

Which means the second method took about 25% less time in this test.
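
The answer does not show the benchmark script itself, so the following is only a guess at how such a measurement could be set up, using microtime(true) around each loop. The cost of the str_split call is left outside the timed section, matching the description of timing only the last line. Absolute numbers and even the ranking may differ between PHP versions and machines.

```php
<?php
// Rough timing sketch; not the original author's script.
$string = "a sample string for testing";
$iterations = 1000000;

// Method 1: index directly into the string.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $char = $string[4];
}
$stringTime = microtime(true) - $start;

// Method 2: index into an array built once with str_split().
$array = str_split($string);
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $char = $array[4];
}
$arrayTime = microtime(true) - $start;

printf("Using string[i]: %.4f seconds\n", $stringTime);
printf("Using str_split: %.4f seconds\n", $arrayTime);
```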

8

Most of the answers forget about non-English characters!

strlen counts bytes, not characters. That is why it and its sibling functions work fine with English characters: each one is stored in 1 byte in both UTF-8 and ASCII. For anything else, you need to use the multibyte string functions mb_*

This will work with any character encoded in UTF-8

// 8 characters in 12 bytes
$string = "abcdأبتث";

$charsCount = mb_strlen($string, 'UTF-8');
for($i = 0; $i < $charsCount; $i++){
    $char = mb_substr($string, $i, 1, 'UTF-8');
    var_dump($char);
}

This outputs

string(1) "a"
string(1) "b"
string(1) "c"
string(1) "d"
string(2) "أ"
string(2) "ب"
string(2) "ت"
string(2) "ث"
Accountant م
6

Expanded from @SeaBrightSystems answer, you could try this:

$s1 = "textasstringwoohoo";
$arr = str_split($s1); //$arr now has character array
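
For the other direction asked about in the question (array back to string), implode with an empty separator should do the job. The sketch below reuses the question's even-byte filter purely as an illustration and assumes single-byte input.

```php
<?php
$s1 = "textasstringwoohoo";
$arr = str_split($s1); // string -> array of one-byte strings

// Keep only characters whose byte value is even, as in the question's example.
$evenAsciiOnly = array_filter($arr, function ($x) {
    return ord($x) % 2 === 0;
});

$roundTrip = implode('', $arr);           // array -> original string
$filtered  = implode('', $evenAsciiOnly); // array -> filtered string

var_dump($roundTrip === $s1); // bool(true)
echo $filtered, PHP_EOL;
```

array_filter preserves the original keys, which does not matter here because implode only looks at the values.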
Dairy Window
  • 9
  • I disagree, this answer does add value, it gives a working example of how str_split might work in a PHP application. @SeaBrightSystems just links to the documentation, which is sometimes not that helpful when a person is trying to see how a function may work, given an example. Otherwise most SO answers would just be links to php.net – kurdtpage Aug 16 '16 at 21:50
5

Hmm... There's no need to complicate things. The basics always work well.

    $string = 'abcdef';
    $len = strlen( $string );
    $x = 0;

Forward Direction:

while ( $len > $x ) echo $string[ $x++ ];

Outputs: abcdef

Reverse Direction:

while ( $len ) echo $string[ --$len ];

Outputs: fedcba

Ali
Ash
3
// Unicode Codepoint Escape Syntax in PHP 7.0
$str = "cat!\u{1F431}";

// IIFE (Immediately Invoked Function Expression) in PHP 7.0
$gen = (function(string $str) {
    for ($i = 0, $len = mb_strlen($str); $i < $len; ++$i) {
        yield mb_substr($str, $i, 1);
    }
})($str);

var_dump(
    true === $gen instanceof Traversable,
    // PHP 7.1
    true === is_iterable($gen)
);

foreach ($gen as $char) {
    echo $char, PHP_EOL;
}
masakielastic