Although regex is not famous for being fast, I'd like to know how well preg_grep()
can perform if the pattern is boiled down to its leanest form and only called once (not in a loop).
By removing longer masks which are covered by shorter masks, the pattern will be greatly reduced. How much will the reduction be? of course, I cannot say for sure, but with 17,000 masks, there are sure to be a fair amount of redundancy.
Code: (Demo)
$masks = ['1224*', '543*', '321*', '12245*', '5*', '122488*'];
sort($masks);
$needle = rtrim(array_shift($masks), '*');
$keep[] = $needle;
foreach ($masks as $mask) {
if (strpos($mask, $needle) !== 0) {
$needle = rtrim($mask, '*');
$keep[] = $needle;
}
}
// now $keep only contains: ['1224', '321', '5']
$numbers = ['122456789', '123456788', '321876543234567', '55555555555555555', '987654321'];
var_export(
preg_grep('~^(?:' . implode('|', $keep) . ')~', $numbers)
);
Output:
array (
0 => '122456789',
2 => '321876543234567',
3 => '55555555555555555',
)