Two regex scans of the string with preg_split(): (Demo)
$digits = preg_split("/\D+/", $str, 0, PREG_SPLIT_NO_EMPTY);
$symbols = preg_split("/\d+/", $str, 0, PREG_SPLIT_NO_EMPTY);
One regex scan of the string with preg_split(): (Demo)
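// Initialise both result arrays so they exist even if nothing matches.
$digits = $symbols = [];
// The capture group plus PREG_SPLIT_DELIM_CAPTURE keeps the digit runs in the output.
foreach (preg_split("/(\d+)/", $str, 0, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE) as $string) {
    if (ctype_digit($string)) {
        $digits[] = $string;
    } else {
        $symbols[] = $string;
    }
}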
One regex scan of the string with preg_match_all(): (Demo)
preg_match_all("/(\d+)|(\D+)/", $str, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
    // Group 1 is an empty string whenever the non-digit branch (group 2) matched.
    if ($match[1] !== '') {
        $digits[] = $match[1];
    } else {
        $symbols[] = $match[2];
    }
}
In most cases, the data volume is so small that efficiency is not worth benchmarking. If this is true for your project, then choose the technique that you find easiest to read and maintain.
Explanations
A regex-based solution is the only sensibly efficient/direct way to extract the desired substrings from the input string.
We can only assume that your input string is composed solely of digits and symbols (no other characters). If there are other characters, then the patterns will need adjustment.
For capturing the digits, there are four basic patterns to choose from: \d+, [^\D]+, [0-9]+, and [^-+*/]+
All four patterns take the same number of "steps", so the deciding factor is pattern brevity, and \d+ wins.
For capturing the symbols, there are four basic patterns: \D, [^\d], [^0-9], and [-+/*]
All four patterns take the same number of "steps", so again the deciding factor is pattern brevity, and \D wins.
For best testing, I will be using $str = "50+16-0*387/2+49"; as my input, because it includes a zero and multi-digit integers.
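As a quick sanity check, here is a minimal sketch of what the two-scan technique should return for that input. The last line is only an illustration of why a zero is worth having in the test string: array_filter() without a callback drops the string "0", whereas PREG_SPLIT_NO_EMPTY keeps it. The variable name $loose is my own and is not part of any technique above.

```php
<?php
// Sample input from the answer: contains a zero and multi-digit integers.
$str = "50+16-0*387/2+49";

// Split on runs of non-digits to keep the numbers.
$digits = preg_split('/\D+/', $str, 0, PREG_SPLIT_NO_EMPTY);
// Split on runs of digits to keep the symbols.
$symbols = preg_split('/\d+/', $str, 0, PREG_SPLIT_NO_EMPTY);

// Illustration only: a loose truthiness filter silently loses the "0".
$loose = array_values(array_filter($digits));

var_export($digits);  // 50, 16, 0, 387, 2, 49 (all strings)
var_export($symbols); // +, -, *, /, +
var_export($loose);   // 50, 16, 387, 2, 49 (the "0" is gone)
```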
A critique of Sahil's answer for the benefit of researchers who may be biased by the vote count:
[+-\/\*] is teaching bad character class construction. This pattern says to match any of the following 6 characters (but only 4 symbols are required for this task):
- any character from the plus sign to the forward slash (five characters: + , - . /)
- a literal asterisk (which does not need to be escaped)
In other words, the hyphen in the character class is not treated literally; it creates a range of acceptable characters.
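The range behaviour is easy to confirm with a small sketch. The flawed pattern is the one quoted above; the corrected pattern (hyphen moved to the front of the class so it is literal) and the variable names are my own, for illustration.

```php
<?php
// Pattern from the critiqued answer: "+-\/" is read as a range from + to /.
$flawed = '/[+-\/\*]/';
// Hyphen placed first so it is literal; the slash is escaped only because it is the delimiter.
$fixed = '/[-+\/*]/';

$candidates = ['+', ',', '-', '.', '/', '*'];

// Collect the characters that each pattern accepts.
$matchedByFlawed = array_values(array_filter(
    $candidates,
    fn($char) => preg_match($flawed, $char) === 1
));
$matchedByFixed = array_values(array_filter(
    $candidates,
    fn($char) => preg_match($fixed, $char) === 1
));

var_export($matchedByFlawed); // all six characters, including the comma and the dot
var_export($matchedByFixed);  // only +, -, /, *
```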
Using array_map() but ignoring its return value indicates an inappropriate use of the function. It would be more sensible to use array_walk(), but the necessary syntax is more verbose and harder to read. array_filter() or array_reduce() could also be used to return the filtered arrays to their appropriate result variables, but this would mean performing two loops over the same data. A foreach() (as a single pass) is going to be the cleanest way to populate the two result arrays.
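To make the comparison concrete, here is a rough sketch of the two-pass array_filter() approach next to the single-pass foreach(). The $parts array is a hypothetical pre-split token list that I made up for the illustration; it is not taken from the critiqued answer.

```php
<?php
// Hypothetical token list, assumed to have been produced by an earlier split.
$parts = ['50', '+', '16', '-', '0'];

// Two passes: each array_filter() call walks the whole list.
$digitsTwoPass = array_values(array_filter($parts, 'ctype_digit'));
$symbolsTwoPass = array_values(array_filter($parts, fn($part) => !ctype_digit($part)));

// One pass: a single foreach() sorts every token into the right array.
$digits = $symbols = [];
foreach ($parts as $part) {
    if (ctype_digit($part)) {
        $digits[] = $part;
    } else {
        $symbols[] = $part;
    }
}

var_export($digits);  // 50, 16, 0
var_export($symbols); // +, -
```

Both approaches give the same arrays here; the foreach() simply gets there without iterating the data twice.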
The preg_split() calls are vulnerable to generating empty elements because no PREG_SPLIT_NO_EMPTY flag is used.
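A small sketch of the problem, using the \d+ pattern from this answer rather than the exact code being critiqued: because the sample string starts and ends with digits, splitting on digit runs leaves an empty string at each end unless the flag is supplied.

```php
<?php
$str = "50+16-0*387/2+49";

// No flag: the leading and trailing digit runs produce empty strings.
$withoutFlag = preg_split('/\d+/', $str);
// With the flag: empty pieces are discarded.
$withFlag = preg_split('/\d+/', $str, 0, PREG_SPLIT_NO_EMPTY);

var_export($withoutFlag); // '', +, -, *, /, +, ''
var_export($withFlag);    // +, -, *, /, +
```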