
Hi, I am trying to write a regular expression and I am running into problems.

Basically this is what I have:

(.+?)(and|or)(.+?)

Here is an example of the input I want to parse:

(user email ends with "@email.com" and user name is "John") or (user email ends with "@domain.com" and user name is "Bob")

The result I expect is:

(user email ends with "@email.com" and user name is "John")
(user email ends with "@domain.com" and user name is "Bob")

Basically, the "or" splits the input into the parenthesized groups. The parentheses are optional, so I can also have something like this:

user email ends with "@email.com" and user name is "John"

which I expect to be split like this:

user email ends with "@email.com"
user name is "John"

If I need more than one regex, I'm fine with that. The end goal for the first example above is something like this:

Array
(
    [0] => Array
        (
            [0] => user email ends with "@email.com"
            [1] => user name is "John"
        )

    [1] => Array
        (
            [0] => user email ends with "@domain.com"
            [1] => user name is "Bob"
        )
)

If the input is something like this:

(user email ends with "@email.com" and user name is "John") or ((user email ends with "@domain.com" and user name is "Bob") or (user id is 5))

then I'd expect something like this:

Array
(
    [0] => Array
        (
            [0] => user email ends with "@email.com"
            [1] => user name is "John"
        )

    [1] => Array
        (
            [0] => user email ends with "@domain.com"
            [1] => user name is "Bob"
            [2] => Array
                (
                    [0] => user id is 5
                )
        )
)

Any help will be much appreciated!

Benjamin

2 Answers


Here's a recursive function which also uses a recursive regex, let's recurse!!!

$text = '(user email ends with "@email.com" and user name is "John") or ((user email ends with "@domain.com" and user name is "Bob") or (user id is 5))';

print_r(buildtree($text));

function buildtree($input){
    $regex = <<<'regex'
    ~(                      # Group 1
        \(                  # Opening bracket
            (               # Group 2
                (?:         # Non-capturing group
                    [^()]   # Match anything except opening or closing parentheses
                |           # Or
                    (?1)    # Repeat Group 1
                )*          # Repeat the non-capturing group zero or more times
            )               # End of group 2
        \)                  # Closing bracket
    )~x
regex;
// The x modifier is for this nice and fancy spacing/formatting
    $output = [];

    if(preg_match_all($regex, $input, $m)){ // If there is a match
        foreach($m[2] as $expression){  // We loop through it
            $output[] = buildtree($expression); // And build the tree, recursive !!!
        }
        return $output;
    }else{  // If there is no match, we just split
        return preg_split('~\s*(?:or|and)\s*~i', $input);
    }
}

Output:

Array
(
    [0] => Array
        (
            [0] => user email ends with "@email.com"
            [1] => user name is "John"
        )

    [1] => Array
        (
            [0] => Array
                (
                    [0] => user email ends with "@domain.com"
                    [1] => user name is "Bob"
                )

            [1] => Array
                (
                    [0] => user id is 5
                )

        )

)
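
One caveat about the fallback split: `\s*(?:or|and)\s*` also matches "or" or "and" inside longer words, so a value such as "Gordon" would be cut in two. A minimal sketch of a stricter split, assuming the operators always appear as whole words; the helper name `splitOnOperators` is hypothetical and not part of the code above:

```php
<?php
// Hypothetical helper (not from the answer): split a flat expression
// on the whole words "and" / "or" only.
function splitOnOperators(string $input): array
{
    // \b is a word boundary, so "or" inside "Gordon" is not a separator.
    return preg_split('~\s*\b(?:or|and)\b\s*~i', $input);
}

print_r(splitOnOperators('user name is "Gordon" and user id is 5'));
// [0] => user name is "Gordon"
// [1] => user id is 5
```

This still splits on a standalone "and" or "or" inside a quoted value (for example "Bob and Alice"); handling that would need a pattern or tokenizer that is aware of the quotes.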

Online php demo Online regex demo

HamZa
  • Wow, this is more like it! Thank you. I tried it both with and without parentheses, and it works flawlessly. Now, is it possible to determine whether a group was actually an "OR" or not? – Benjamin Apr 29 '14 at 23:16
  • I think I just realized that each group is an 'and' group, so if there are multiple arrays they are already in their 'or' groups. I.e. Array 0 and Array 1 are joined by OR, and Array 1 contains another OR, so that is two ORs. I can figure out what they are from this: Group 0 = AND, Group 1 = OR, so items in Array 0 are ANDs, and Array 1 means an OR group. The only problem would be if I have no parentheses. – Benjamin Apr 29 '14 at 23:26
  • @Benjamin Try with `return preg_split('~\s*(or|and)\s*~i', $input, -1, PREG_SPLIT_DELIM_CAPTURE);` in the `else` section. Not sure if that's the output you want. But please, next time provide your actual and desired expected output. It's quite a pain to change my answer several times. [See this meta thread about “chameleon questions”](http://meta.stackexchange.com/questions/43478/exit-strategies-for-chameleon-questions) – HamZa Apr 29 '14 at 23:27
  • Thanks, I'll state the desired output more clearly next time. I appreciate it, HamZa. That will work; I can use this to determine what is an "and" and what is an "or". Thanks! – Benjamin Apr 29 '14 at 23:35
  • I couldn't use the `$regex = <<<'regex'` heredoc from this answer; I'm getting a syntax error. – US-1234 May 22 '14 at 11:06
  • @Manadh Could you paste the "non-working code" on [3v4l.org](http://3v4l.org)? Maybe you're using a pre-historic php? – HamZa Jul 12 '14 at 18:52
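
The `PREG_SPLIT_DELIM_CAPTURE` suggestion from the comments above can be tried on its own. A minimal sketch, assuming a flat input without parentheses; the wrapper name `splitWithOperators` is hypothetical, while the `preg_split` call is the one quoted in the comment:

```php
<?php
// Hypothetical wrapper around the preg_split call from the comment.
function splitWithOperators(string $input): array
{
    // Because (or|and) is a capturing group, PREG_SPLIT_DELIM_CAPTURE
    // returns each matched operator between the pieces it separates.
    return preg_split('~\s*(or|and)\s*~i', $input, -1, PREG_SPLIT_DELIM_CAPTURE);
}

print_r(splitWithOperators('user email ends with "@email.com" and user name is "John"'));
// [0] => user email ends with "@email.com"
// [1] => and
// [2] => user name is "John"
```

The operators end up at the odd indexes, so the caller can tell an "and" from an "or" even when the input has no parentheses.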

Try this. Note that I used PHP's explode() rather than a regex.
The input should have one user per line and no parentheses (you can modify the code to strip them), but this at least shows the concept.

<?php

$string = 'user email ends with "@domain.com" and user name is "Bob" or user id is 5
user email ends with "@domain.com" and user name is "Bob" or user id is 5';

echo"<pre>";

//final array
$result = array();

//separate each line as element in an array
$line = explode("\n",$string);

//iterate through each line
for($k=0; $k<count($line);$k++){

    //find the "and"s first and split on them
    $and = explode("and",$line[$k]);
    $and_size = count($and);

    //find the "or"s in each separated "and" part
    for($i=0;$i<$and_size;$i++){
        $or = explode("or",$and[$i]);
        //place found ors in a new subarray
        if(count($or) > 1){
            $and[] = array();
            $lastId = count($and)-1;
            for($j=1; $j<count($or);$j++){
                $and[$lastId][] = $or[$j];
            }
        }
    }
    $result[] = $and;
}



print_r($result);

Output:

Array
(
    [0] => Array
        (
            [0] => user email ends with "@domain.com" 
            [1] =>  user name is "Bob" or user id is 5
            [2] => Array
                (
                    [0] =>  user id is 5
                )

        )

    [1] => Array
        (
            [0] => user email ends with "@domain.com" 
            [1] =>  user name is "Bob" or user id is 5
            [2] => Array
                (
                    [0] =>  user id is 5
                )

        )

)
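
For comparison, the same flat-input idea can be sketched with the splits reversed: "or" first, then "and" inside each part. This assumes that "and" binds more tightly than "or" and that both appear as whole words surrounded by whitespace; the function name `splitFlat` is hypothetical and this is not the code that produced the output above:

```php
<?php
// Hypothetical helper: turn one flat line into "or" groups of "and" terms.
function splitFlat(string $line): array
{
    $groups = [];
    // Split on " or " first, so each part is a run of "and"-joined terms.
    foreach (preg_split('~\s+or\s+~i', trim($line)) as $orPart) {
        $groups[] = preg_split('~\s+and\s+~i', $orPart);
    }
    return $groups;
}

print_r(splitFlat('user email ends with "@domain.com" and user name is "Bob" or user id is 5'));
// [0] => [ user email ends with "@domain.com", user name is "Bob" ]
// [1] => [ user id is 5 ]
```

Requiring whitespace around the operators avoids cutting words that merely contain "or" or "and", which `explode()` on the bare substring would do.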
CMPS
  • This is close, but the problem is that I need to handle input both with and without parentheses. It also needs to be recursive; in other words, if I have a lot of nested parenthesized groups, I need it to break them up into each group accordingly. I was thinking of splitting the groups up and then running the function above, but I tried that and it doesn't give me what I need. – Benjamin Apr 29 '14 at 22:48