6

I'm trying to split a binary string into an array of runs of repeated characters.

For example, the string 10001101 split with this function would be:

    $arr[0] = '1';
    $arr[1] = '000';
    $arr[2] = '11';
    $arr[3] = '0';
    $arr[4] = '1';

(I tried to make myself clear, but if you still don't understand, my question is the same as this one but for PHP, not Python)

R__
    Try using https://github.com/CHH/itertools/blob/master/lib/itertools.php; it's the same tool you referenced, ported from Python to PHP. – sodhancha Oct 18 '15 at 10:46

3 Answers

2
<?php
$s = '10001101';
preg_match_all('/((.)\2*)/',$s,$m);
print_r($m[0]);
/*
Array
(
    [0] => 1
    [1] => 000
    [2] => 11
    [3] => 0
    [4] => 1
)
*/
?>

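Matches runs of one or more repeated characters. The inner group (.) is the second capture group: it captures a single character (collected in $m[2]), and \2* extends the match over its repeats. The outer group ((.)\2*) is the first capture group and wraps the whole run (collected in $m[1]); because it spans the entire pattern, it is identical to the full match in $m[0], which is what the snippet prints. With preg_match_all, this is applied globally over the entire string. It works for any string, e.g. 'aabbccddee'. To limit it to just 0 and 1, use [01] instead of . in the second capture group.

Keep in mind that the result may be empty, for example when the input string is empty. With the default flags, $m[0] is still set in that case but holds an empty array, so check !empty($m[0]) or the return value of preg_match_all before you use it.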
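As a minimal sketch of the same idea, the pattern can be wrapped in a small function. The name splitRuns is hypothetical and not part of the original answer, and the pattern is simplified to a single capture group on the assumption that only the full matches are needed.

```php
<?php
// splitRuns is a hypothetical helper name, used here for illustration only.
function splitRuns(string $s): array
{
    // (.) captures one character; \1* extends the match over its repeats.
    if (preg_match_all('/(.)\1*/', $s, $m) === false) {
        return []; // the regex engine reported an error
    }
    // $m[0] holds the full matches; it is an empty array when nothing matched.
    return $m[0];
}

print_r(splitRuns('10001101')); // 1, 000, 11, 0, 1
var_dump(splitRuns(''));        // empty array
```

Returning $m[0] directly means callers always get an array and do not need a separate isset check.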

zamnuts
0

I'm thinking of something like this. The code is not tested; I wrote it directly in the answer box, so it might have some errors that you can adjust.

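$arr = str_split('10001101'); // input string as an array of single characters
$chunks = array();
$index = 0;
$chunks[$index] = $arr[0];
// Compare each character, up to and including the last one, with the one before it.
for ($i = 1; $i < count($arr); $i++) {
  if ($arr[$i] == $arr[$i-1]) {
    $chunks[$index] .= $arr[$i]; // same character: extend the current run
  } else {
    $index++;
    $chunks[$index] = $arr[$i]; // different character: start a new run
  }
}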
sticksu
0

I wouldn't bother looking for the end-of-string in the pattern.

Most succinctly, capture the first character, allow zero or more repetitions of that captured character, then restart the full-string match with \K. The match that preg_split sees is then zero-length and sits just after each run, so no characters are lost when the string is split.

Code: (Demo)

var_export(
    preg_split('~(.)\1*\K~', '10001101', 0, PREG_SPLIT_NO_EMPTY)
);

Output:

array (
  0 => '1',
  1 => '000',
  2 => '11',
  3 => '0',
  4 => '1',
)
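To make the role of \K more concrete, here is a small comparison of the same split with and without it. The variable names are illustrative only, and this is a sketch rather than part of the original demo.

```php
<?php
$s = '10001101';

// Without \K, each whole run is treated as a delimiter and discarded,
// so only empty pieces remain and PREG_SPLIT_NO_EMPTY drops them all.
$withoutK = preg_split('~(.)\1*~', $s, 0, PREG_SPLIT_NO_EMPTY);

// With \K, the reported match is zero-length and sits just after each run,
// so the run itself is kept as a piece.
$withK = preg_split('~(.)\1*\K~', $s, 0, PREG_SPLIT_NO_EMPTY);

var_export($withoutK); // empty array
var_export($withK);    // '1', '000', '11', '0', '1'
```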

If you don't care for regular expressions, here is a way of iterating through each character, comparing it to the previous one and conditionally concatenating repeated characters to a reference variable.

Code: (Demo) ...same result as first snippet

$array = [];
$lastChar = null;
foreach (str_split('10001101') as $char) {
    if ($char !== $lastChar) {
        unset($ref);
        $array[] = &$ref;
        $ref = $char;
        $lastChar = $char;
    } else {
        $ref .= $char;
    }
}
var_export($array);
mickmackusa
  • @R__ I see that you've been online since I posted my answer. Is there any chance of you accepting my answer so that researchers can more easily find a refined solution? – mickmackusa Jul 22 '21 at 00:46