I spent a while on this task, trying to devise a single pattern that combines your full-string validation with an indefinite number of captured groups. After trying many combinations of \G and lookarounds, I am afraid it cannot be done in one pass. If PHP allowed variable-width lookbehinds, I think I could manage it, but alas they are not available.
What I can offer is a process with the unnecessary "stuff" removed.
Code: (Demo)
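$strings = ["(AAA/)(BBB/)(cc)", "(AAA/)xxxx(BBB/)(cc)"];
foreach ($strings as $string) {
    // Validate the whole string first. Note the four backslashes: inside a single-quoted
    // PHP string, \\\\ reaches PCRE as \\, i.e. one literal backslash in the character class.
    if (!preg_match('~^(?:\([\w\\\\/-]+\))+$~', $string)) {
        echo "The simple pattern $string is not valid!";
        // throw new \Exception("The simple pattern $string is not valid!");
    } else {
        // Split on the zero-width position after each closing parenthesis.
        var_export(preg_split('~\)\K~', $string, 0, PREG_SPLIT_NO_EMPTY));
    }
    echo "\n";
}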
Output:
array (
  0 => '(AAA/)',
  1 => '(BBB/)',
  2 => '(cc)',
)
The simple pattern (AAA/)xxxx(BBB/)(cc) is not valid!
Pattern #1 Breakdown:
~ #pattern delimiter
^ #start of string anchor
(?: #start of non-capturing group
\( #match one opening parenthesis
[\w\\/-]+ #greedily match one or more of the following characters: a-z, A-Z, 0-9, underscores, backslashes, slashes, and hyphens
\) #match one closing parenthesis
) #end of non-capturing group
+ #allow one or more occurrences of the non-capturing group
$ #end of string anchor
~ #pattern delimiter
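To sanity-check Pattern #1, here is a small sketch that runs it against a handful of inputs. Apart from the two strings from the demo, the inputs are my own made-up examples. The pattern is written with four backslashes because, in a single-quoted PHP string, that is what it takes to get a literal backslash into the character class.

```php
<?php
// In single quotes, \\\\ reaches PCRE as \\ (one literal backslash in the class).
$pattern = '~^(?:\([\w\\\\/-]+\))+$~';

// Hypothetical inputs mapped to the result expected from preg_match().
$cases = [
    '(AAA/)(BBB/)(cc)'     => 1, // only parenthesized groups
    '(AAA/)xxxx(BBB/)(cc)' => 0, // stray text between groups
    '()'                   => 0, // a group needs at least one character
    '(a\\b)'               => 1, // a backslash is an allowed character
    '(AAA/)('              => 0, // unterminated trailing group
];

$results = [];
foreach ($cases as $input => $expected) {
    $results[$input] = preg_match($pattern, $input);
    printf("%-22s %s\n", $input, $results[$input] ? 'valid' : 'invalid');
}
```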
Pattern #2 Breakdown:
~ #pattern delimiter
\) #match one closing parenthesis
\K #restart the fullstring match (forget/release previously matched character(s))
~ #pattern delimiter
Pattern #2's effect is to locate every closing parenthesis and "explode" the string on the zero-width position that follows it. \K ensures that no characters become casualties in the explosions.
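To see what \K is buying you, here is a minimal side-by-side sketch. The variable names are just for illustration; the only difference between the two calls is the \K.

```php
<?php
$string = '(AAA/)(BBB/)(cc)';

// Without \K, each ")" is consumed as the delimiter and disappears from the output.
$withoutK = preg_split('~\)~', $string, 0, PREG_SPLIT_NO_EMPTY);

// With \K, the match is restarted after ")", so the delimiter is zero-width
// and every character of the input survives.
$withK = preg_split('~\)\K~', $string, 0, PREG_SPLIT_NO_EMPTY);

var_export($withoutK); // elements: '(AAA/', '(BBB/', '(cc'
echo "\n";
var_export($withK);    // elements: '(AAA/)', '(BBB/)', '(cc)'
```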
The if condition does not need to call preg_match_all() since there can only ever be one match when you are validating from ^ to $. Declaring a variable to contain the "match" is pointless (as is PREG_OFFSET_CAPTURE): if there is a match, it will be the entire input string, so just use that value if you want it.
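If you want to convince yourself, this short sketch captures the match anyway and compares it to the input. The $m variable is only there for the demonstration; it is exactly the variable I am suggesting you do not need.

```php
<?php
$string = '(AAA/)(BBB/)(cc)';

// Four backslashes in single quotes give PCRE one literal backslash in the class.
$matched = preg_match('~^(?:\([\w\\\\/-]+\))+$~', $string, $m);

// The pattern is anchored at both ends and has no capturing groups,
// so the only element is the full-string match: the input itself.
var_dump($matched === 1);       // bool(true)
var_dump($m === [$string]);     // bool(true)
```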
preg_split() is a suitable substitute for a preg_match_all() call because it returns exactly the output that you are seeking, in a lean single-dimensional array, AND it uses a very small, readable pattern.
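For comparison, here is a rough sketch of the preg_match_all() route next to the preg_split() one. The ~\([^)]+\)~ pattern is my own assumption of what such a call might look like (it is not from your code). Once the string has been validated, both give the same elements, but preg_match_all() hands them back nested one level deeper.

```php
<?php
$string = '(AAA/)(BBB/)(cc)'; // assumed to have passed the validation already

// preg_match_all() fills a two-dimensional array; the full matches sit in $matches[0].
preg_match_all('~\([^)]+\)~', $string, $matches);
$viaMatchAll = $matches[0];

// preg_split() returns the flat array directly.
$viaSplit = preg_split('~\)\K~', $string, 0, PREG_SPLIT_NO_EMPTY);

var_dump($viaMatchAll === $viaSplit); // bool(true)
```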
*The 3rd and 4th parameters, 0 and PREG_SPLIT_NO_EMPTY, tell the function respectively that there is "no limit" to the number of explosions, and that any empty elements should be discarded (don't make an empty element from the zero-width position after the ) that trails cc).
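A quick sketch of what those two parameters change, using the same input as the demo. The limit of 2 in the last call is an arbitrary value I picked just to show the effect.

```php
<?php
$string = '(AAA/)(BBB/)(cc)';

// Defaults (no limit, no flags): the split point after the final ")" leaves an empty tail.
$withEmpty = preg_split('~\)\K~', $string);

// Limit 0 means "no limit"; PREG_SPLIT_NO_EMPTY drops that empty tail.
$noEmpty = preg_split('~\)\K~', $string, 0, PREG_SPLIT_NO_EMPTY);

// A positive limit caps the number of elements; the remainder stays in the last one.
$limited = preg_split('~\)\K~', $string, 2, PREG_SPLIT_NO_EMPTY);

var_export($withEmpty); // elements: '(AAA/)', '(BBB/)', '(cc)', ''
echo "\n";
var_export($noEmpty);   // elements: '(AAA/)', '(BBB/)', '(cc)'
echo "\n";
var_export($limited);   // elements: '(AAA/)', '(BBB/)(cc)'
```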