3

I need to create a query using the SUBSTRING function in Postgresql 9.x to extract a substring between a string and first pipe occurrence.

My source string is something like this:

THIS IS MY EXAMPLE|OTHER EXAMPLE|HELLO: Kevin|OTHER EXAMPLE|OTHER EXAMPLE

So I created this query:

SELECT SUBSTRING(myField from 'HELLO: (.*)\|') AS test FROM myTable

to get the word Kevin between the string 'HELLO: ' and the first occurrence of character pipe, but it doesn't work as intended because it returns Kevin|OTHER EXAMPLE|OTHER EXAMPLE.

Where am I going wrong?

Matt
  • 121
  • 1
  • 5
Diaboliko
  • 231
  • 1
  • 5
  • 21
  • 1
    Try using `HELLO: ([^|]+)`, like `SELECT SUBSTRING(myField from 'HELLO: ([^|]+)') AS test FROM myTable` – Wiktor Stribiżew Jan 26 '16 at 14:21
  • Perfect! Thanks a lot!!! – Diaboliko Jan 26 '16 at 14:23
  • Your pattern is almost right, you just need to understand, that `*` is [greedy](http://stackoverflow.com/questions/2301285/what-do-lazy-and-greedy-mean-in-the-context-of-regular-expressions). You can use its non-greedy (lazy) counterpart: `*?`, like `'HELLO: (.*?)\|'` (which works in PostgreSQL too). (And of course the posted answer is right too, you can get around greediness with altering the pattern logic too.) – pozs Jan 26 '16 at 16:19
  • Did my suggestion work in the end? Please let know if you need more help with this problem. – Wiktor Stribiżew Jul 24 '23 at 09:00

1 Answers1

5

You need to use a negated character class in order not to "overflow" to the next |-separated parts:

SELECT SUBSTRING(myField from 'HELLO: ([^|]+)') AS test FROM myTable

Here is a demo of how this regex works. The [^|]+ pattern matches one or more characters other than a literal pipe.

Wiktor Stribiżew
  • 607,720
  • 39
  • 448
  • 563