I'm trying to parse a file and analyze it. To do this, I've used preg_split() to break the document into an array. I only want words in the array (that is, runs of alphabetic characters only). The regular expression I used is:
$noAlpha = "/[\s]+|[^A-z]+|\W|\r/";
However, I'm getting instances of blanks in the array. I believe it has to do with a line with a return only (\r\n) and nothing else on it.
I'm only using .txt files. What would I need to add to the regex to account for this?
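
For reference, here is a minimal sketch that reproduces the blanks, assuming a made-up sample string with \r\n line endings. The $text value and the names $withBlanks and $withoutBlanks are illustrative and not taken from the actual file. It also tries preg_split()'s PREG_SPLIT_NO_EMPTY flag, which drops empty pieces, as one possible direction rather than a confirmed fix:

```php
<?php
// Pattern copied unchanged from the question.
$noAlpha = "/[\s]+|[^A-z]+|\W|\r/";

// Hypothetical sample: two lines separated by an empty line, CRLF endings.
$text = "first line\r\n\r\n(second) line\r\n";

// Default behaviour: empty strings are kept in the result.
$withBlanks = preg_split($noAlpha, $text);

// Same pattern, but empty pieces are discarded by the flag.
$withoutBlanks = preg_split($noAlpha, $text, -1, PREG_SPLIT_NO_EMPTY);

print_r($withBlanks);
print_r($withoutBlanks);
```

In this sample the empty entries appear where two separate delimiter matches sit next to each other (whitespace immediately followed by punctuation) and after the delimiter at the very end of the text, so the empty line may not be the only cause.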