
I use the following function to find the position of the nth occurrence of a character in a string, and it works well. There is one exception, though. Say the character is a comma: I need to change the function so that a comma inside ( and ) is not counted.

function strposnth($haystack, $needle, $nth=1, $insenstive=0)
{
   //if the search is case-insensitive, convert both strings to lower case
   if ($insenstive) {
       $haystack=strtolower($haystack);
       $needle=strtolower($needle);
   }
   //count the number of occurrences
   $count=substr_count($haystack,$needle);

   //return false if the needle is not in the haystack
   //or if the requested nth occurrence is beyond the count
   if ($count<1 || $nth > $count) return false;

   //loop up to the nth occurrence
   //$pos and $len both start at 0, so the first search begins at offset 0
   for($i=0,$pos=0,$len=0;$i<$nth;$i++)
   {
       //get the position of the needle in the haystack
       //first iteration: search from offset 0 ($pos=0, $len=0)
       //later iterations: search from previous position + needle length
       $pos=strpos($haystack,$needle,$pos+$len);

       //store the needle length after the first search only
       if ($i==0) $len=strlen($needle);
   }

   //return the position
   return $pos;
}

I've got a regex working that matches a comma only when it is outside of (): '/,(?=[^)]*(?:[(]|$))/'

You can see a live example here: http://regex101.com/r/xE4jP8
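
One possible way to get a position out of that regex, rather than wiring it into the strpos loop, is to let preg_match_all report offsets via PREG_OFFSET_CAPTURE. This is only a sketch: the function name strposnthOutsideParens is made up for illustration, and it assumes parentheses are balanced and never nested.

```php
<?php
// Hypothetical helper (name invented for this example): returns the byte
// offset of the nth comma that is outside parentheses, or false if there
// is no such comma. Assumes balanced, non-nested parentheses.
function strposnthOutsideParens(string $haystack, int $nth = 1)
{
    // The regex from the question: a comma that is not inside ( ... )
    $pattern = '/,(?=[^)]*(?:[(]|$))/';

    // PREG_OFFSET_CAPTURE makes each match an array of [text, offset]
    if ($nth < 1 || !preg_match_all($pattern, $haystack, $matches, PREG_OFFSET_CAPTURE)) {
        return false;
    }

    // $matches[0] lists the full matches in order of appearance
    return $matches[0][$nth - 1][1] ?? false;
}

var_dump(strposnthOutsideParens('a,b(c,d),e', 2)); // int(8)
```

The comma inside (c,d) at offset 5 is skipped, so the second counted comma is the one at offset 8.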

But I'm not sure how to make it work within the strpos loop. I know what I need to do (tell it that the needle has this regex exception), but I don't know how to implement it. Maybe I should ditch the function and use another method?

The end result I want is to split the string after every 6 commas, just before the next record starts. Example:

rttr,ertrret,ertret(yes,no),eteert,ert ert,rtrter,0 rttr,ert(yes,no)rret,ert ret,eteert,ertert,rtrter,1 rttr,ertrret,ert ret,eteert,ertert,rtrter,0 rttr,ertrret,ert ret,eteert,ertert,rtrter,2 rttr,ert(white,black)rret,ert ret,eteert,ertert,rtrter,0 rttr,ertrret,ert ret,eteert,ertert,rtrter,0 rttr,ertrret,ert ret,et(blue,green)eert,ertert,rtrter,1

Note that there is always a one-digit number (1-3) and a space after the 6th comma, before the next part of the string begins. I can't rely on that pattern alone, because it could also occur earlier in the string. What I can rely on is that the split point is after the first digit and space that follow the 6th comma, so I want to split the string directly after that.

For example the above string would be split like this:

rttr,ertrret,ertret(yes,no),eteert,ert ert,rtrter,0
rttr,ert(yes,no)rret,ert ret,eteert,ertert,rtrter,1
rttr,ertrret,ert ret,eteert,ertert,rtrter,0
rttr,ertrret,ert ret,eteert,ertert,rtrter,2 
rttr,ert(white,black)rret,ert ret,eteert,ertert,rtrter,0
rttr,ertrret,ert ret,eteert,ertert,rtrter,0
rttr,ertrret,ert ret,et(blue,green)eert,ertert,rtrter,1

I can do the splitting myself fairly easily with substr once I know how to get the position of the character. An easier way might be preg_split, but I'm not sure how that would work until I figure this part out.

I hope I wasn't too confusing in explaining this. I bet I was :)

user1547410
  • What do you want to do when the `nth` character *is* enclosed in brackets? – Amal Murali Jan 11 '14 at 14:34
  • Ignore it. Basically, there are always 7 values, each separated by a comma. However, sometimes there is text submitted by a user, which is stored inside the ( ). That text may contain a comma since I have no control over it, so when I'm splitting I need to ignore anything inside the () so I don't end up splitting in the wrong place. The rest of the data never contains a comma, so I can be confident using this approach. Hope that clears it up. As you can see, (yes,no), (black,white) etc. are user input, and they could mess up the split if I counted those commas. – user1547410 Jan 11 '14 at 14:39

1 Answer


For this kind of nesting problem, regex is usually not the right tool. However, when the problem is not actually that complicated, as yours seems to be, regex will do just fine.

Try this:

(?:^|,)((?:[^,(]*(?:\([^)]*\))?)*)
^ start the search with a comma or the start of the string
        ^ start non capture group
           ^ search until comma or open parenthesis
                 ^ if parenthesis found then capture until 
                           ^ end of parenthesis  
                                ^ end of capture group repeat if necessary

See it in action: http://regex101.com/r/eS0cX4

As you can see, this will capture everything between the commas that are outside of the parentheses. If you get all these matches into an array using preg_match_all, you can split it any which way you like.
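
As a rough sketch of that last step, the captured fields can be regrouped into records of 7 values. The helper name splitRecords is hypothetical, and the code assumes what the question describes: each record ends with a single digit, and a space separates it from the next record, so the 7th captured field looks like "0 rttr".

```php
<?php
// Hypothetical helper (not from the question): collects the top-level
// fields with the regex above, then regroups them into 7-value records.
function splitRecords(string $input): array
{
    // Group 1 holds one field; commas inside ( ... ) stay within the field
    $pattern = '/(?:^|,)((?:[^,(]*(?:\([^)]*\))?)*)/';
    preg_match_all($pattern, $input, $matches);

    $records = [];
    $current = [];
    foreach ($matches[1] as $field) {
        if (count($current) < 6) {
            $current[] = $field;
            continue;
        }
        // 7th field, e.g. "0 rttr": the part before the first space ends
        // this record, the remainder (if any) starts the next record
        $parts = explode(' ', $field, 2);
        $current[] = $parts[0];
        $records[] = implode(',', $current);
        $current = isset($parts[1]) ? [$parts[1]] : [];
    }
    if ($current !== []) {
        // keep an incomplete trailing record instead of dropping it
        $records[] = implode(',', $current);
    }
    return $records;
}

$input = 'rttr,ertrret,ertret(yes,no),eteert,ert ert,rtrter,0 '
       . 'rttr,ert(yes,no)rret,ert ret,eteert,ertert,rtrter,1';
print_r(splitRecords($input));
```

With the two-record input above, this should give one array element per record, with the (yes,no) commas left untouched.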

Lodewijk Bogaards