Disclaimer: I know from this answer that regex isn't great for U.S. addresses since they're not regular. However, this is a fairly small project and I want to see if I can reduce the number of false positives.
My challenge is to write a pattern that distinguishes addresses like "123 SOUTH ST" from addresses like "123 SOUTH MAIN ST". The best solution I can come up with is to check whether more than one word comes after the directional word.
My Python regex is of the form:
^(NORTH|SOUTH|EAST|WEST)(\s\S*\s\S*)+$
Explanation:
^(NORTH|SOUTH|EAST|WEST)
matches a direction at the start of the string
(\s\S*\s\S*)+$
attempts to match a space, a word of any length, another space, and another word of any length, one or more times
But my expression doesn't seem to distinguish between the two types of address. Where's my error (besides using regex for U.S. addresses)?
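A minimal sketch to reproduce this, assuming the pattern is applied with `re.search` to the whole address string, house number included. The helper name `has_extra_word` is only for illustration.

```python
import re

# The pattern exactly as written above.
PATTERN = re.compile(r"^(NORTH|SOUTH|EAST|WEST)(\s\S*\s\S*)+$")


def has_extra_word(address: str) -> bool:
    """Return True if the pattern matches the address (hypothetical helper)."""
    return PATTERN.search(address) is not None


# Assumption: the full address, including the leading number, is tested.
for address in ("123 SOUTH ST", "123 SOUTH MAIN ST"):
    print(address, "->", has_extra_word(address))
# 123 SOUTH ST -> False
# 123 SOUTH MAIN ST -> False
```

Under that assumption both addresses give the same result (no match), so the pattern doesn't separate the two cases.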
Thanks for your help.