2

I am working on a request handler to route every page call through my index page and have SEO-friendly URLs.

domain.com/account/settings

This would be easy to map to the correct page, but some URLs are more complex once an ID number or a page number appears in the URI.

I see that some people use things like preg_match and cycle through an array of pattern → URI mappings to find a match, which is nice once paging and IDs come into play. But from my experience, running preg_match against an array of, say, 20 patterns on every page load seems bad for performance.

What are your thoughts on this?
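For reference, here is a minimal sketch of the pattern-array approach I mean. The patterns and page names are hypothetical examples, not my actual routes:

```php
<?php
// Hypothetical route table: regex pattern => page to include.
$routes = array(
    '#^/account/settings$#'    => 'account_settings.php',
    '#^/article/(\d+)$#'       => 'article.php',      // $matches[1] = article ID
    '#^/articles/page/(\d+)$#' => 'article_list.php', // $matches[1] = page number
);

$uri = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);

foreach ($routes as $pattern => $page) {
    if (preg_match($pattern, $uri, $matches)) {
        require $page; // $matches carries any captured ID / page number
        exit;
    }
}

header('HTTP/1.0 404 Not Found');
```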

JasonDavis
  • I only use the first /directory/ after the domain in the index, which then shoots it off to the section page that handles the rest of the URL. You do want to keep your index optimised if all pages go through it. –  Jan 23 '11 at 03:27
  • Just a note that a lot of web frameworks use regular expressions for route matching. This seems like a reasonable endorsement. Many also use some type of memoizing/caching mechanism so the string doesn't actually need to be parsed each time. – Michael Mior Jan 23 '11 at 04:53
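The memoizing/caching idea from the comment above could look something like this sketch (the function name and route table are hypothetical; a `static` cache in PHP only lives for the current request, so a persistent cache would need something like APC or memcached):

```php
<?php
// Hypothetical memoized router: cache resolved URI => page lookups
// so repeated hits for the same URI skip the regex loop entirely.
function route($uri, array $routes) {
    static $cache = array();
    if (array_key_exists($uri, $cache)) {
        return $cache[$uri];
    }
    foreach ($routes as $pattern => $page) {
        if (preg_match($pattern, $uri)) {
            return $cache[$uri] = $page;
        }
    }
    return $cache[$uri] = null; // no match
}
```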

4 Answers

2

Here's the thing you should be looking at: what's the better alternative, not whether this one seems bad. If you have an alternative that seems better, then use it. If not, use the regex solution until you know you need to speed it up (I'll leave out my usual rant about premature optimization).

I would use the regex handlers personally. They are more flexible, easier to write, and easier to maintain than the other alternatives to this problem. But YMMV...

ircmaxell
2

A not-very-complex regex on a string as short as a URI will take very little time, even run 20 times over. If you have a) profiled or timed it to prove it's a performance problem, and b) a good alternative to use instead, then you could try changing it around, but otherwise I wouldn't worry too much about it. Tons of sites do something similar with mod_rewrite after all, checking the page URI against a series of regexes on every page load.

If need be, you could probably cut those 20 checks down to fewer per URI with a few simple strstr() checks to see what basic format the URI is in (whether it contains an ID or not, a page number or not, etc.). Optimizing your regexes, such as using the "start" ^ and "end" $ meta-characters wherever possible, will help too.
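As a rough sketch of that pre-filtering idea (the URI prefixes and pattern-array names here are made up for illustration; strpos() is used for the prefix check, which does the same job as strstr() more cheaply):

```php
<?php
// Hypothetical: narrow the pattern list with a cheap string check
// before running any regexes at all.
if (strpos($uri, '/article/') === 0) {
    $patterns = $articlePatterns;  // only ID-style routes
} elseif (strpos($uri, '/page/') !== false) {
    $patterns = $pagedPatterns;    // only paginated routes
} else {
    $patterns = $staticPatterns;   // simple one-to-one routes
}
// Anchored patterns like '#^/article/(\d+)$#' also fail fast on non-matches.
```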

Michael Low
0

On my site I just have a few different rewrite rules, which I think works well:

RewriteRule ^(main|home|daily_photo_mockup|games|sporktris(?:_web)?|nangooni|contact|source|admin|edit_photos?|edit_galleries|edit_gallery)(?:_(fr|sv))?$ index.pl?page=$1&lang=$2
RewriteRule ^(snow_flakes|photography)(\d+)?(?:_p(\d+))?(?:_(fr|sv))?$ index.pl?page=$1&subpage=$2&img=$3&lang=$4
crimson_penguin
0

So I see some people will use things like preg_match and cycle through an array of patterns -> uri

I think such an approach is OK. For example, the Drupal and Django frameworks do exactly that.

Another alternative is using a URL rewrite engine (see, for example, this question for details).

Kel