1

The simplified code:

SELECT 'ok' WHERE '/articles/new/' ~ '^/articles/(?!new)([\w-]+)/$';

Examples, what I want:

'/articles/new/' => ''
'/articles/new-york/' => 'ok'
'/articles/other-string/' => 'ok'

And, what's wrong:

'/articles/new/' => ''
'/articles/new-york/' => '' /* Wrong */
'/articles/other-string/' => 'ok'

So, how can I block ONLY the word?

Optimization

In PostgreSQL database, I have a table (page), and it has got columns path, title, file, regex_path etc.

The values in column path look like this:

/
/articles/
/articles/:category-code/
/articles/:category-code/:article-code/
/articles/:category-code/:article-code/edit/
/articles/new/
/members/
/members/:username/

: means it's a parameter (PHP gets the name and content based on regex_path – the first version)

From outside (PHP) the database gets a value (URL). Examples:

/ /* Main page */
/articles/ /* List all article */
/articles/it/ /* List articles in IT category */
/articles/it/ipad-mini-2/  /* Article */
/articles/it/ipad-mini-2/edit/ /* Edit article */
/articles/new/ /* New article */
/members/ /* Member list */
/members/someone/ /* Member datasheet */

How can I select the right row, where the value (URL) matches with the path (the fastest way, without regex)?

Example

In: /articles/it/ipad-mini-2/

Out: the row, where path is: /articles/:category-code/:article-code/

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
bsz
  • 61
  • 8
  • It's unclear whether you want to block only `'/articles/new/'` or all `'/articles/new/.*'`. It's also unclear whether a leading '/articles/' is required. – Erwin Brandstetter Jul 10 '13 at 14:54
  • The updated question is too broad now, and on the other hand does not provide necessary details. – Erwin Brandstetter Jul 10 '13 at 16:42
  • @ErwinBrandstetter: I modified it again, now? – bsz Jul 10 '13 at 17:09
  • @ErwinBrandstetter: it is complete, work with the first solution. `$ sql = "SELECT * FROM page WHERE :rurl ~ regex_path"; $params = [':rurl' => cc::$data['req_path']];` – bsz Jul 10 '13 at 17:21

2 Answers2

3

This is faster:

SELECT *
FROM   tbl 
WHERE  txt LIKE '/articles/%'
AND    txt <>   '/articles/new/';  -- only exclude this exact string

Take everything that starts with '/articles/', but exclude '/articles/new/'

Or, to exclude the whole branch:

...
AND    txt NOT LIKE  '/articles/new/%';

Either includes '/articles/new-york/' because, here, 'new' is not immediately followed by '/'.

Performance test

Regular expressions are typically more expensive than LIKE or = or <>.
Test with EXPLAIN ANALYZE. Or EXPLAIN (ANALYZE, TIMING OFF) to suppress noise. See:

I ran a test to confirm my claim. It's ~ 10 - 20 x faster than the regular expression.

'^/articles(?!/new/)/([\w-]+)/$' -- Daniel's solution

.. which should be simplified to (a bit faster):

'^/articles/(?!new/)[\w-]+/$'

db<>fiddle here
Old sqlfiddle

Explanation for the 2nd regexp:

^ ... start of string
/articles/ ... literal match
(?!new/) ... negative lookahead matches where no string 'new/' follows
[\w-] ... bracket expression containing the class shorthand \w plus the dash -
+ ... 1 or more times
/ ... literal match
$ ... end of string

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • The `path` row values: `/ /articles/ /articles/:category-code/ /articles/:category-code/:article-code/ /articles/:category-code/:article-code/edit/ /articles/new/ /members/ /members/:username/` Without regex, how can I select the right row? – bsz Jul 10 '13 at 15:13
  • @bsz: I added a bit to my answer. I suppose you edit your question to clarify the requirements. – Erwin Brandstetter Jul 10 '13 at 15:17
  • @ErwinBrandstetter `'^/articles(?!/new/)/([\w-]+)/$'` I don't understand the fourth /; I also don't understand [\w-], Can you explain these two? – jian Nov 25 '21 at 06:04
  • @JianHe: I clarified above. – Erwin Brandstetter Nov 25 '21 at 07:30
1

You were close, I changed the negated look ahead to be more specific. It fails to match only if the look ahead finds /new/.

^/articles(?!/new/)/([\w-]+)/$

The Results

/articles/new
/articles/new/
/articles/new-york/ => OK
/articles/other-string/ => OK
/articles/notnew/ => OK

Test is here:

REY

Daniel Gimenez
  • 18,530
  • 3
  • 50
  • 70