You can speed up matching by reducing the number of regular expressions, the complexity of each expression, and the size of the input.
For instance, for your example: '#<div id="above-related".*?</div>#s'
You can reduce the size of the input by using strpos and substr:
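$input = "<html>..</html>";
$offset = 0;
// strpos takes (haystack, needle, offset); compare against false because a match at position 0 is valid
while (($start = strpos($input, '<div id="above-related"', $offset)) !== false) {
    $end = strpos($input, "</div>", $start);
    if ($end === false) break; // no closing tag, nothing left to process
    $end += strlen("</div>"); // include the closing tag in the slice
    $substr = substr($input, $start, $end - $start); // take the small slice (third argument is a length)
    $result = preg_replace('#<div id="above-related".*?</div>#s', '', $substr);
    // stitch the input back together:
    $input = substr($input, 0, $start) . $result . substr($input, $end);
    $offset = $start + strlen($result); // continue looking for more matches after the replaced slice
}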
In the case of your example the replacement doesn't actually use a match so it can be a straight up cut:
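$input = "<html>..</html>";
$offset = 0;
$match_start = '<div id="above-related"';
$match_end = '</div>';
// strpos takes (haystack, needle, offset); compare against false because a match at position 0 is valid
while (($start = strpos($input, $match_start, $offset)) !== false) {
    $end = strpos($input, $match_end, $start);
    if ($end === false) break; // unterminated block, leave it alone
    // keep everything before the opening tag and everything after the closing tag
    $input = substr($input, 0, $start) . substr($input, $end + strlen($match_end));
    $offset = $start; // the text after the cut now begins at $start
}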
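If you have several rules of this shape, the cut can be wrapped in a small function. This is only a sketch: remove_between is a hypothetical name, not a PHP built-in, and it assumes both markers are non-empty literal strings and that blocks are not nested.

```php
<?php
// Hypothetical helper (not a PHP built-in): removes every span that starts
// with $match_start and ends at the next $match_end, using only strpos/substr.
// Assumes both markers are non-empty and that spans are not nested.
function remove_between(string $input, string $match_start, string $match_end): string
{
    $offset = 0;
    while (($start = strpos($input, $match_start, $offset)) !== false) {
        // look for the end marker only after the start marker
        $end = strpos($input, $match_end, $start + strlen($match_start));
        if ($end === false) {
            break; // unterminated span: leave the rest untouched
        }
        // keep the text before the span and the text after the end marker
        $input = substr($input, 0, $start) . substr($input, $end + strlen($match_end));
        $offset = $start; // resume at the cut point
    }
    return $input;
}

$html = '<p>a</p><div id="above-related" class="x">ads</div><p>b</p>';
echo remove_between($html, '<div id="above-related"', '</div>'), "\n";
// prints <p>a</p><p>b</p>
```

Like the regular expression it replaces, this stops at the first closing tag, so a div nested inside the block would end the cut early.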
The trick here is that strpos and substr are much faster than preg_replace (easily 100x).
If you can find a non-regex match, or maybe even a non-regex replacement strategy, for each rule, then you're going to see a significant speedup.
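The actual ratio depends on your PHP version, whether the PCRE JIT is enabled, and the shape of your input, so it is worth measuring on your own data. Here is a rough timing sketch; the sample input and the iteration count are made up for illustration, and it assumes the block occurs exactly once.

```php
<?php
// Rough timing sketch. The input and iteration count are invented for
// illustration; substitute a real page from your own data.
$block = '<div id="above-related" class="x">' . str_repeat('ad ', 50) . '</div>';
$input = str_repeat('<p>filler text</p>', 2000) . $block . str_repeat('<p>more</p>', 2000);
$iterations = 1000;

// Variant 1: the regular expression from the question
$t = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $viaRegex = preg_replace('#<div id="above-related".*?</div>#s', '', $input);
}
$regexTime = microtime(true) - $t;

// Variant 2: plain string search and cut (assumes the block is present)
$t = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $start = strpos($input, '<div id="above-related"');
    $end = strpos($input, '</div>', $start) + strlen('</div>');
    $viaCut = substr($input, 0, $start) . substr($input, $end);
}
$cutTime = microtime(true) - $t;

printf("preg_replace: %.4fs, strpos/substr: %.4fs\n", $regexTime, $cutTime);
```

Both variants should produce the same string, so you can compare $viaRegex and $viaCut as a sanity check before trusting the timings.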