26

There must be a fast and efficient way to split a (text) string at the "nth" occurrence of a needle, but I cannot find it. There is a fairly full set of functions in the strpos comments in the PHP manual, but that seems a bit much for what I need.

I have plain text as $string and want to split it at the nth occurrence of $needle. In my case, the needle is simply a space. (I can do the sanity checks!)

How can I do it?

Peter Mortensen
Dɑvïd
  • What does "needle" mean in this context? – Peter Mortensen Jul 04 '20 at 08:23
  • @PeterMortensen https://www.php.net/manual/en/function.strpos.php#102336 What you are looking for. Coming from the expression: [Searching for a needle in a haystack](https://www.languagecouncils.sg/goodenglish/resources/idioms/a-needle-in-a-haystack#:~:text=Meaning%3A,a%20needle%20in%20a%20haystack!) – online Thomas Apr 04 '22 at 14:38

13 Answers

23

It could be:

function split2($string, $needle, $nth) {
    $max = strlen($string);
    $n = 0;
    for ($i=0; $i<$max; $i++) {
        if ($string[$i] == $needle) {
            $n++;
            if ($n >= $nth) {
                break;
            }
        }
    }
    $arr[] = substr($string, 0, $i);
    $arr[] = substr($string, $i+1, $max);

    return $arr;
}
Peter Mortensen
Galled
10

Personally I'd just split it into an array with explode, then implode the first n parts as the first half and implode the remaining parts as the second half.
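A minimal sketch of that explode/implode idea follows. The function name split_at_nth_explode and the choice to return false when there are fewer than $nth occurrences are assumptions made for illustration, and the needle is assumed to be non-empty.

```php
<?php
// Sketch: split $string at the $nth occurrence of $needle via explode/implode.
// Returns false when fewer than $nth occurrences exist (an assumed convention).
function split_at_nth_explode(string $string, string $needle, int $nth)
{
    $parts = explode($needle, $string);

    // explode() yields (occurrences + 1) parts, so we need more than $nth parts.
    if ($nth < 1 || count($parts) <= $nth) {
        return false;
    }

    return [
        implode($needle, array_slice($parts, 0, $nth)), // everything before the nth needle
        implode($needle, array_slice($parts, $nth)),    // everything after it
    ];
}

var_dump(split_at_nth_explode('1 2 3 4 5 6', ' ', 2)); // "1 2" and "3 4 5 6"
```

This builds a full array of parts, so for very long strings a position-based approach may be cheaper.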

Chris Eberle
10

If your needle will always be one character, use Galled's answer. It's going to be faster by quite a bit. If your $needle is a string, try this. It seems to work fine.

function splitn($string, $needle, $offset)
{
    $newString = $string;
    $totalPos = 0;
    $length = strlen($needle);
    for($i = 0; $i < $offset; $i++)
    {
        $pos = strpos($newString, $needle);

        // If you run out of string before you find all your needles
        if($pos === false)
            return false;
        $newString = substr($newString, $pos + $length);
        $totalPos += $pos + $length;
    }
    return array(substr($string, 0, $totalPos-$length), substr($string, $totalPos));
}
Peter Mortensen
John
  • as you note, my situation entails a single-character "needle", but it's good to have this solution in the thread. Thanks! – Dɑvïd May 10 '11 at 21:19
7

Here's an approach that I would prefer over a regexp solution (see my other answer):

function split_nth($str, $delim, $n)
{
  return array_map(function($p) use ($delim) {
      return implode($delim, $p);
  }, array_chunk(explode($delim, $str), $n));
}

Just call it by:

split_nth("1 2 3 4 5 6", " ", 2);

Output:

array(3) {
  [0]=>
  string(3) "1 2"
  [1]=>
  string(3) "3 4"
  [2]=>
  string(3) "5 6"
}
Matthew
  • 3
    This solves a slightly different problem, of course -- not "split AT *nth* character" but "split at EVERY *nth* character". Not quite my scenario! Might be useful for someone else, though. Thanks! – Dɑvïd May 10 '11 at 21:21
  • I've edited the answer to handle your point. But whether the edit is accepted or not I've combined it into a full answer [here](https://stackoverflow.com/a/60619720/4829915). – LWC Mar 10 '20 at 14:31
1

I've edited Galled's function to make it split at every nth occurrence instead of just the first one.

function split2($string, $needle, $nth) {
  $max = strlen($string);
  $n = 0;
  $start = 0; // Start index of the current chunk
  $arr = array();

  // Loop through each character
  for ($i = 0; $i < $max; $i++) {

    // If the character is the needle
    if ($string[$i] == $needle) {
      $n++;
      // Emit a chunk at every n-th needle
      if ($n == $nth) {
        // substr() takes a length, not an end index
        $arr[] = substr($string, $start, $i - $start);
        $start = $i + 1;
        $n = 0; // Reset the counter for the next chunk
      }
    }
  }
  // Include the last part of the string, if any
  if ($start < $max) {
    $arr[] = substr($string, $start);
  }
  return $arr;
}
Peter Mortensen
Chris Visser
1

It's easy. Just do this:

$i = $pos = 0;
do {
    $pos = strpos($string, $needle, $pos+1);
} while(++$i < $nth);
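
The loop above starts searching at offset 1, so a needle at index 0 is skipped, and it never checks for a failed search (in PHP, false + 1 evaluates to 1, so the search would silently restart near the beginning). A more defensive variant might look like the sketch below. The function name nth_strpos and the false return value are assumptions for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: 0-based index of the $nth occurrence of $needle,
// or false if there are fewer than $nth occurrences.
function nth_strpos(string $haystack, string $needle, int $nth)
{
    if ($nth < 1 || $needle === '') {
        return false;
    }

    $pos = -1; // so the first search starts at offset 0
    for ($i = 0; $i < $nth; $i++) {
        $pos = strpos($haystack, $needle, $pos + 1);
        if ($pos === false) {
            return false; // ran out of occurrences
        }
    }
    return $pos;
}

$string = '44 E conway ave west horse';
$pos = nth_strpos($string, ' ', 2);
if ($pos !== false) {
    echo substr($string, 0, $pos), "\n";  // 44 E
    echo substr($string, $pos + 1), "\n"; // conway ave west horse
}
```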
Peter Mortensen
  • Thanks to @Vardkin for finding my bug! I fixed it. – Hamze GhaemPanah Nov 05 '19 at 17:15
  • `do ... while` begs the question whether there is a situation it does not handle. Is there? – Peter Mortensen Jul 23 '20 at 20:50
  • [Vardkin's answer](https://stackoverflow.com/questions/5956066/how-can-i-split-a-string-in-php-at-the-nth-occurrence-of-a-needle/57328977#57328977) claims this does not work. You can [edit your answer](https://stackoverflow.com/posts/57239723/edit) (***without*** "Edit:", "Update:", or similar). – Peter Mortensen Jul 23 '20 at 20:53
1

I really like Hamze GhaemPanah's answer for its brevity. However, there's a small bug in it.

In the original code:

$i = $pos = 0;
do {
    $pos = strpos($string, $needle, $pos+1);
} while( $i++ < $nth);

$nth in the do while loop should be replaced with ($nth-1), because the loop otherwise iterates one extra time and sets $pos to the position of the ($nth+1)th instance of the needle. Here's an example playground to demonstrate. If this link fails, here is the code:

$nth = 2;
$string = "44 E conway ave west horse";
$needle = " ";

echo"======= ORIGINAL =======\n";

$i = $pos = 0;
do {
    $pos = strpos($string, $needle, $pos + 1);
} while( $i++ < $nth);

echo "position: $pos \n";
echo substr($string, 0, $pos) . "\n\n";

/*
    Outputs:

    ======= ORIGINAL =======
    position: 11
    44 E conway
*/

echo"======= FIXED =======\n";

$i = $pos = 0;
do {
    $pos = strpos($string, $needle, $pos + 1);
} while( $i++ < ($nth-1) );

echo "position: $pos \n";
echo substr($string, 0, $pos);

/*
    Outputs:

    ======= FIXED =======
    position: 4
    44 E

*/

That is, when searching for the position of the second instance of our needle, our loop iterates one extra time setting $pos to the position of our third instance of our needle. So, when we split the string on the second instance of our needle - as the OP asked - we get the incorrect substring.

Peter Mortensen
Vardkin
1

You can use something like the following:

/* Function copied from the PHP manual comment you referenced */
function strnripos_generic( $haystack, $needle, $nth, $offset, $insensitive, $reverse )
{
    //  If needle is not a string, it is converted to an integer and applied as the ordinal value of a character.
    if(! is_string($needle)) {
        $needle = chr((int)$needle);
    }

    //  Are the supplied values valid / reasonable?
    $len = strlen($needle);
    if(1 > $nth || 0 === $len) {
        return false;
    }

    if($insensitive) {
        $haystack = strtolower($haystack);
        $needle   = strtolower($needle  );
    }

    if($reverse) {
        $haystack = strrev($haystack);
        $needle   = strrev($needle  );
    }

    //  $offset is incremented in the call to strpos, so make sure that the first
    //  call starts at the right position by initially decreasing $offset by $len.
    $offset -= $len;
    do
    {
        $offset = strpos($haystack, $needle, $offset + $len);
    } while(--$nth && false !== $offset);

    return false === $offset || ! $reverse ? $offset : strlen($haystack) - $offset;
}

// Our split function
function mysplit ($haystack, $needle, $nth) {
    $position = strnripos_generic($haystack, $needle, $nth, 0, false, false);
    $retval = array();

    if ($position !== false) {
        // $position is the index of the needle itself, so it is also the length of the part before it
        $retval[0] = substr($haystack, 0, $position);
        $retval[1] = substr($haystack, $position);
        return $retval;
    }

    return false;
}

Then just call the mysplit function, and you'll get an array with two substrings: the first contains all characters up to the nth occurrence of the needle (not included), and the second runs from the nth occurrence of the needle (included) to the end.

Peter Mortensen
roirodriguez
  • that certainly makes using these functions for splitting much more manageable. Galled's shorter solution works for me, but this could be useful for more complex situations than mine. Thanks! – Dɑvïd May 10 '11 at 21:18
1

This is ugly, but it seems to work:

$foo = '1 2 3 4 5 6 7 8 9 10 11 12 13 14';

$parts = preg_split('!([^ ]* [^ ]* [^ ]*) !', $foo, -1,
            PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

var_dump($parts);

Output:

array(5) {
  [0]=>
  string(5) "1 2 3"
  [1]=>
  string(5) "4 5 6"
  [2]=>
  string(5) "7 8 9"
  [3]=>
  string(8) "10 11 12"
  [4]=>
  string(5) "13 14"
}

Replace the single spaces in the pattern with the single character you wish to split on. This expression won't work as-is with a multi-character delimiter.

This is hard-coded for every third space. With a little tweaking, it could probably be adjusted easily, and using str_repeat to build the expression dynamically would work as well.
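
A rough sketch of the str_repeat idea, generating the same pattern for an arbitrary count. The function name split_every_nth_regex is made up for this example, and it assumes a single-character delimiter and $n of at least 1.

```php
<?php
// Sketch: split at every $n-th single-character delimiter using a generated pattern.
function split_every_nth_regex(string $str, string $delim, int $n): array
{
    $d = preg_quote($delim, '!');                      // escape the delimiter for the pattern
    $token = '[^' . $d . ']*';                         // a run of non-delimiter characters
    $group = $token . str_repeat($d . $token, $n - 1); // $n tokens joined by the delimiter

    return preg_split('!(' . $group . ')' . $d . '!', $str, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
}

var_dump(split_every_nth_regex('1 2 3 4 5 6 7 8 9 10 11 12 13 14', ' ', 3));
```

With $n = 3 and a space delimiter this builds exactly the hard-coded pattern shown above.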

Peter Mortensen
Matthew
1

Taking Matthew's answer and adding a solution for Dɑvïd's comment:

function split_nth($str, $delim, $n) {
  $result = array_map(function($p) use ($delim) {
      return implode($delim, $p);
  }, array_chunk(explode($delim, $str), $n));
  $result_before_split = array_shift($result);
  $result_after_split = implode($delim, $result); // rejoin with the caller's delimiter, not a hard-coded space
  return array($result_before_split, $result_after_split);
}

Just call it by:

list($split_before, $split_after) = split_nth("1 2 3 4 5 6", " ", 2);

Output:

1 2
3 4 5 6
LWC
1

Use a pattern of zero or more non-delimiter characters followed by the delimiting character, reset the full-string match just before the delimiter with \K, and set the quantifier of the group to the desired n.

Code: (Demo)

$str = 'There must be a fast and efficient way to split a (text) string at the "nth" occurrence of a needle.';

var_export(
     preg_split('/([^ ]*\K ){2}/', $str)
);

To set a hard limit on the maximum number of elements generated, declare the 3rd parameter of preg_split. A limit of 2 will perform only one split and produce 2 elements.
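
As an illustration of the limit parameter, here is a small sketch that wraps the pattern in a function. The name split_at_nth_regex is hypothetical, and the sketch assumes a single-character delimiter.

```php
<?php
// Sketch: split once, at the $n-th occurrence of a single-character delimiter.
function split_at_nth_regex(string $str, string $delim, int $n): array
{
    $d = preg_quote($delim, '/');

    // (?:[^d]*\Kd){n}: \K resets the match start on each repetition, so only
    // the n-th delimiter itself is consumed by the split.
    $pattern = sprintf('/(?:[^%1$s]*\K%1$s){%2$d}/', $d, $n);

    return preg_split($pattern, $str, 2); // limit 2: at most one split
}

var_export(split_at_nth_regex('1 2 3 4 5 6', ' ', 2)); // '1 2' and '3 4 5 6'
```

If the string contains fewer than $n delimiters, the pattern never matches and the result is a one-element array holding the whole string.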

mickmackusa
0
function strposnth($haystack,$needle,$n){
  $offset = 0;
  for($i=1;$i<=$n;$i++){
    $indx = strpos($haystack, $needle, $offset);
    if($i == $n || $indx === false)
        return $indx;
    else {
        $offset = $indx+1;
    }
  }
  return false;
}
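
The function above only returns a position. One way the result could be turned into the two halves is sketched below. The helper name split_at_index is an assumption, and plain strpos stands in for the position finder so that the snippet runs on its own.

```php
<?php
// Hypothetical helper: split $haystack around a needle found at index $pos.
// $pos is expected to be an int index or false, as returned by strpos()-style finders.
function split_at_index(string $haystack, string $needle, $pos)
{
    if ($pos === false) {
        return false; // the needle was not found often enough
    }
    return [
        substr($haystack, 0, $pos),                // part before the needle
        substr($haystack, $pos + strlen($needle)), // part after the needle
    ];
}

// strpos() with an offset stands in for any finder, such as strposnth() above.
$pos = strpos('a b c', ' ', 2); // index of the second space: 3
var_dump(split_at_index('a b c', ' ', $pos)); // "a b" and "c"
```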
tinkerr
0
function split_nth($haystack, $needle, $nth){
    $result = array();
    if(substr_count($haystack,$needle) > ($nth-1)){
        $haystack = explode($needle, $haystack);
        // implode() takes the separator first; the reversed order was removed in PHP 8.0
        $result[] = implode($needle, array_splice($haystack, 0, $nth));
        $result[] = implode($needle, $haystack);
    }
    return $result;
}
Peter Mortensen
Ignacio