0

I need to implement a regular expression (as I understand) matching in PostgreSQL 8.4. It seems regular expression matching are only available in 9.0+.

My need is:

When I give an input 14.1 I need to get these results:

14.1.1
14.1.2
14.1.Z
...

But exclude:

14.1.1.1
14.1.1.K
14.1.Z.3.A
...

The pattern is not limited to a single character. There is always a possibility that a pattern like this will be presented: 14.1.1.2K, 14.1.Z.13.A2 etc., because the pattern is provided the user. The application has no control over the pattern (it's not a version number).

Any idea how to implement this in Postgres 8.4?

After one more question my issue was solved:

Escaping a LIKE pattern or regexp string in Postgres 8.4 inside a stored procedure

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Nandakumar V
  • 4,317
  • 4
  • 27
  • 47
  • 3
    8.4 (which is no longer maintained) has regular expressions as well: http://www.postgresql.org/docs/8.4/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP –  Apr 02 '15 at 07:31
  • As always: the exact data type would help to clarify. Better yet, the table definition (`\d tbl` in psql). – Erwin Brandstetter Apr 02 '15 at 08:44

3 Answers3

5

Regular expression matching has been in Postgres practically for ever, at least since version 7.1. Use the these operators:

~ !~ ~* !~*

For an overview, see:

The point in your case seems to be to disallow more dots:

SELECT *
FROM   tbl
WHERE  version LIKE '14.1.%'        -- for performance
AND    version ~ '^14\.1\.[^.]+$';  -- for correct result

db<>fiddle here
Old sqlfiddle

The LIKE expression is redundant, but it is going to improve performance dramatically, even without index. You should have an index, of course.

The LIKE expression can use a basic text_pattern_ops index, while the regular expression cannot, at least in Postgres 8.4.
Or with COLLATE "C" since Postgres 9.1. See:

[^.] in the regex pattern is a character class that excludes the dot (.). So more characters are allowed, just no more dots.

Performance

To squeeze out top performance for this particular query you could add a specialized index:

CREATE INDEX tbl_special_idx ON tbl
((length(version) - length(replace(version, '.', ''))), version text_pattern_ops);

And use a matching query, the same as above, just replace the last line with:

AND   length(version) - length(replace(version, '.', '')) = 2

db<>fiddle here
Old sqlfiddle

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • 1
    What about using arrays? `array_length(string_to_array(col, '.'), 1) = 3 and version like '14.1.%'` –  Apr 02 '15 at 08:19
  • @a_horse_with_no_name: Works just fine: http://sqlfiddle.com/#!15/9eecb/286 Slightly more expensive I'd expect. – Erwin Brandstetter Apr 02 '15 at 08:31
  • I wasn't sure if the array handling or the regex would be more expensive –  Apr 02 '15 at 08:37
  • @a_horse_with_no_name: Handling alone might beat a regex in pg 8.4, but if we need to convert the string first, I am not so sure. There is a cheaper way, similar to your idea. I added a chapter. – Erwin Brandstetter Apr 02 '15 at 09:02
  • Thanks for the input, that definitely solved the problem. But I was going to write it in a stored procedure, but dont know how to escape the string within the stored procedure `14.1.` => `14\.1\.` – Nandakumar V Apr 02 '15 at 09:47
  • @NandakumarV: Start a *new question* and show what you have (even if it's not working) You'll get solutions promptly. you can always link to this question for context. – Erwin Brandstetter Apr 02 '15 at 09:51
  • @ErwinBrandstetter Thanks for the advice. I have posted another question on that [Escaping a string in postgres 8.4 inside an stored procedure](http://stackoverflow.com/questions/29411869/escaping-a-string-in-postgres-8-4-inside-an-stored-procedure) – Nandakumar V Apr 02 '15 at 11:33
0

You can't do regex matching, but I believe you can do like operators so:

SELECT * FROM table WHERE version LIKE '14.1._';

Will match any row with a version of '14.1.' followed by a single character. This should match your examples. Note that this will not match just '14.1', if you needed this as well. You could do this with an OR.

SELECT * FROM table WHERE version LIKE '14.1._' OR version = '14.1';
Yule
  • 9,668
  • 3
  • 51
  • 72
  • The characters is not limited to a single character, code like this `14.1.23` can also come. – Nandakumar V Apr 02 '15 at 07:39
  • Also the underscore thing is it a patterm matching mechanism or something. I didnt understand how `LIKE '14.1._'` will match `14.1.2` – Nandakumar V Apr 02 '15 at 07:41
  • _ matches any single character. so would match all your examples in your question. It wont match 14.1.23. – Yule Apr 02 '15 at 07:50
0

Regex matching should be possible with Postgresql-8.4 like this:

SELECT * FROM table WHERE version ~ '^14\.1\..$';
plaes
  • 31,788
  • 11
  • 91
  • 89