How would one efficiently match one input string against any number of regular expressions?
One scenario where this might be useful is with REST web services. Let's assume that I have come up with a number of URL patterns for a REST web service's public interface:
/user/with-id/
{userId}
/user/with-id/
{userId}
/profile
/user/with-id/
{userId}
/preferences
/users
/users/who-signed-up-on/
{date}
/users/who-signed-up-between/
{fromDate}
/and/
{toDate}
- …
where {…}
are named placeholders (like regular expression capturing groups).
Note: This question is not about whether the above REST interface is well-designed or not. (It probably isn't, but that shouldn't matter in the context of this question.)
It may be assumed that placeholders usually do not appear at the very beginning of a pattern (but they could). It can also be safely assumed that it is impossible for any string to match more than one pattern.
Now the web service receives a request. Of course, one could sequentially match the requested URI against one URL pattern, then against the next one, and so on; but that probably won't scale well for a larger number of patterns that must be checked.
Are there any efficient algorithms for this?
Inputs:
- An input string
- A set of "mutually exclusive" regular expressions (ie. no input string may match more than one expression)
Output:
- The regular expression (if any) that the input string matched against.