I have a regular expression that looks for email addresses (taken from another SO post that I can't find, and tested on all kinds of email configurations; changing it is not exactly my question, but I understand if it turns out to be the root cause):
/[a-z0-9_\-\+]+@[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i
I'm using preg_match_all() in PHP.
This works great for 99.99...% of the files I scan and takes around 5 ms, but occasionally it takes a couple of minutes. The slow files are larger than the average webpage, at around 300 KB, but much larger files generally process fine. The only thing that stands out in the contents of the slow files is strings of thousands of consecutive "random" alphanumeric characters like this:
wEPDwUKMTk0ODI3Nzk5MQ9kFgICAw9kFgYCAQ8WAh4H...
Here are two pages causing the problem. View source to see the long strings.
Any thoughts on what is causing this?
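One plausible explanation, offered as a sketch rather than a confirmed diagnosis: the unbounded greedy `[a-z0-9_\-\+]+` can consume an entire alphanumeric run, fail to find `@`, and then give characters back one at a time. The engine repeats this at every starting offset inside the run, so the work grows roughly with the square of the run length. The script below tries to reproduce that on synthetic input. The helpers `build_subject()` and `time_scan()`, the markup around the run, and `user@example.com` are made up for illustration; only the pattern and the sample chunk come from the question.

```php
<?php
// Original pattern from the question (unchanged).
$pattern = '/[a-z0-9_\-\+]+@[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';

/**
 * Hypothetical helper: build a page-like string containing one long run
 * of alphanumeric characters followed by a single real address.
 */
function build_subject(int $runLength): string
{
    $chunk = 'wEPDwUKMTk0ODI3Nzk5MQ9kFgICAw9kFgYCAQ8WAh4H';
    $copies = intdiv($runLength, strlen($chunk)) + 1;
    $run = substr(str_repeat($chunk, $copies), 0, $runLength);
    return '<input value="' . $run . '"> contact: user@example.com';
}

/**
 * Hypothetical helper: run the pattern once and report the match count,
 * the elapsed wall-clock seconds, and the PCRE error code.
 */
function time_scan(string $pattern, string $subject): array
{
    $start = microtime(true);
    // preg_match_all() returns false on failure (for example, a PCRE limit).
    $count = preg_match_all($pattern, $subject, $matches);
    $elapsed = microtime(true) - $start;
    return ['count' => $count, 'seconds' => $elapsed, 'error' => preg_last_error()];
}

// Doubling the run length should show whether the time grows faster than linearly.
foreach ([1000, 2000, 4000] as $n) {
    $r = time_scan($pattern, build_subject($n));
    printf(
        "run=%d matches=%s seconds=%.4f error=%d\n",
        $n,
        var_export($r['count'], true),
        $r['seconds'],
        $r['error']
    );
}
```

Timings depend heavily on the PHP version and on whether PCRE JIT is enabled, so treat the numbers as relative. If `matches` prints `false`, the `error` value from `preg_last_error()` indicates which PCRE limit was hit.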
--FINAL SOLUTION--
I tested various regexes suggested in the answers. @FailedDev's answer helped and dropped processing time from a few minutes to a few seconds. @hakre's answer solved the problem and reduced processing time to a few hundred milliseconds. Below is the final regex I used. It's @hakre's second suggestion.
/[a-z0-9_\-\+]{1,256}+@[a-z0-9\-]{1,256}+\.([a-z]{2,3})(?:\.[a-z]{2})?/i
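For anyone who wants to try it, here is a minimal sketch of how the final regex can be used with preg_match_all(). The `{1,256}+` quantifiers are possessive, so once the local part or the domain label has been consumed the engine does not give characters back, and each attempt is capped at 256 characters. The function name `extract_emails()` and the sample addresses are hypothetical; the pattern is the one above.

```php
<?php
/**
 * Hypothetical helper: return every substring of $text that the final
 * pattern matches, or throw if PCRE reports a failure.
 */
function extract_emails(string $text): array
{
    // Final pattern from the post: bounded, possessive quantifiers.
    $pattern = '/[a-z0-9_\-\+]{1,256}+@[a-z0-9\-]{1,256}+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';

    $count = preg_match_all($pattern, $text, $matches);
    if ($count === false) {
        // preg_last_error() returns one of the PREG_*_ERROR constants.
        throw new RuntimeException('PCRE error code: ' . preg_last_error());
    }

    // $matches[0] holds the full matches; $matches[1] holds the captured TLD.
    return $matches[0];
}

// Sample input: two addresses separated by a long alphanumeric run.
$noise = str_repeat('wEPDwUKMTk0ODI3Nzk5MQ9kFgICAw9kFgYCAQ8WAh4H', 500);
$text = 'a@b.com, x' . $noise . ' c_d+e@foo-bar.co.uk';

print_r(extract_emails($text));
```

The 256-character bound is a trade-off: a local part or domain label longer than that would only be matched from its last 256 characters onward, which seems acceptable for real-world addresses.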