1

I have values like this: 1ST, 2ND, FIRST, and I want to remove the 'ST' and 'ND' ONLY if what comes before is a digit.

I am running postgres 9.5 and I have a positive lookbehind working in SQL Fiddle but it only works on 9.6

SELECT ex,
regexp_replace(ex, '(?<=[0-9]+)(TH|ST|ND|RD)', '', 'gi') as test 
FROM t1

Is there any other way to do this besides using a CASE statement like this:

SELECT ex,
(CASE WHEN ex ~ '\d(TH|ST|ND|RD)' THEN regexp_replace (ex, 'TH|ST|ND|RD', '','gi') ELSE ex end) as test_case
FROM t1

Any suggestions would be appreciated. Thanks!

moonshot
  • 649
  • 1
  • 5
  • 13
  • Does my answer below help? Please provide feedback if you need any adjustments – Wiktor Stribiżew Jul 19 '18 at 11:40
  • This is great thank you! I'm new to both backreferences and word boundaries. So you basically use a positive lookahead **(?:TH|ST|ND|RD)** to match and capture and then replace the entire thing with the backreference to the digit ([0-9]). Thanks again! – moonshot Jul 19 '18 at 12:41
  • `(?:TH|ST|ND|RD)` is a [*non-capturing group*](https://stackoverflow.com/questions/3512471) that does not create a special memory buffer and that you cannot backreference to. Then, you are correct. – Wiktor Stribiżew Jul 19 '18 at 12:51

1 Answers1

1

You may match and capture the digit and replace with a backreference to the value. Also, I suggest adding a word boundary after the ordinal numeral suffixes to make sure we are matching them at the end of the word.

SELECT regexp_replace(ex, '([0-9])(?:TH|ST|ND|RD)\y', '\1', 'gi') as test_case FROM t1

See the updated SQLFiddle.

CREATE TABLE t1
    (ex varchar)
;

INSERT INTO t1
    (ex)
VALUES
    ('1ST'),
    ('2ND'),
    ('3RD'),
    ('4TH'),
    ('FIRST'),
    ('FOURTH')
 ;

SELECT regexp_replace(ex, '([0-9])(?:TH|ST|ND|RD)\y', '\1', 'gi') as test_case FROM t1

enter image description here

Wiktor Stribiżew
  • 607,720
  • 39
  • 448
  • 563