Simple regex to filter out pre and suffix characters

Question

I have a field in my database that holds a long list of strings separated by commas. Here are a few example rows:

HAB
DHAB,RAB,DAB
HAB,RAB,DAB
RAB,HAB, 
RAB,HAB,DAB

My query has the following condition:

WHERE description LIKE '%HAB%'

But it also returns the second row, which contains 'DHAB'.
Can this be done with a regex in the WHERE clause, so that I only get rows that have 'HAB' as an item in the list, and not rows that only have 'DHAB'?

Thank you for the response. It is working on the database level. But for some reason, when I test it on regexr.com, it does not include the cases where HAB is at the beginning or at the end of the line. Is there anything I should be aware of? — ooo, Nov 13 '18 at 19:23
You are testing against a single string with line breaks there; enable the `m` modifier. — Wiktor Stribiżew, Nov 13 '18 at 19:28
Sorry, I was testing on a string that had multiple \n in it. My bad. — ooo, Nov 13 '18 at 19:29

score 1 · Accepted Answer · answered Nov 13 '18 at 19:26

You may use

WHERE description ~ '(^|,)HAB($|,)'

The regex matches

(^|,) - start of string or a comma
HAB - the literal substring HAB
($|,) - end of string or a comma

See the online regex demo.
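
To check the pattern against the sample rows from the question, a minimal sketch along these lines could be run in PostgreSQL. The inline VALUES list and the alias t(description) are stand-ins for the real table, which the question does not name.

```sql
-- Sample rows copied from the question; t(description) is a hypothetical
-- stand-in for the real table and column.
SELECT description,
       description ~ '(^|,)HAB($|,)' AS matches  -- true only when HAB is a whole list item
FROM  (VALUES ('HAB')
            , ('DHAB,RAB,DAB')
            , ('HAB,RAB,DAB')
            , ('RAB,HAB, ')
            , ('RAB,HAB,DAB')) AS t(description);
```

Only the 'DHAB,RAB,DAB' row should come back with matches = false, because every HAB in it is preceded by a D rather than by the start of the string or a comma.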

Wiktor Stribiżew

score 1 · Answer 2 · answered Nov 14 '18 at 02:23

Regular expressions are powerful and versatile, but also expensive. Consider a different approach: transform the list to an actual array with string_to_array() and then:

WHERE 'HAB' = ANY (string_to_array(description, ','))

Or:

WHERE string_to_array(description, ',') @> '{HAB}'

db<>fiddle here

The latter can be supported with a GIN index, which makes it faster by orders of magnitude for big tables.

CREATE INDEX ON tbl USING gin (string_to_array(description, ','));

Can PostgreSQL index array columns?
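
To see how the two array predicates behave on the sample rows from the question, a small sketch like the following may help. The inline VALUES list and the alias t(description) are assumptions standing in for the real table, and the explicit ::text[] cast is added here only to make the array type obvious.

```sql
-- Sample rows copied from the question; t(description) is a hypothetical
-- stand-in for the real table and column.
SELECT description,
       'HAB' = ANY (string_to_array(description, ','))      AS any_match,
       string_to_array(description, ',') @> '{HAB}'::text[] AS contains_match
FROM  (VALUES ('HAB')
            , ('DHAB,RAB,DAB')
            , ('HAB,RAB,DAB')
            , ('RAB,HAB, ')
            , ('RAB,HAB,DAB')) AS t(description);
```

Both columns should be false only for 'DHAB,RAB,DAB'. Note that the comparison is exact per element, so an item stored with surrounding whitespace (for example ' HAB') would not match unless it is trimmed first.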

Or consider a normalized DB design replacing the comma-separated values with a 1:n relationship.