Remove Consecutive Repeated Characters with PostgreSQL regexp_replace function

Question

Remove all consecutive repeated characters using Regular Expression.

In Javascript this works well:

txt='aaa bbb 888 bbb ccc ddd'.replace(/(?!(?!(.)\1))./g,'');

Returns 'a b 8 b c d'

How can I do it with Posgresql regexp_replace function? This won't work:

SELECT regexp_replace('aaa bbb 888 bbb ccc ddd',E'(?!(?!(.)\\\\1)).','g');

$ psql -c "SELECT regexp_replace('aaa bbb 888 bbb ccc ddd',E'(?!(?!(.)\\1)).','g');"
     regexp_replace      
-------------------------
 aaa bbb 888 bbb ccc ddd
(1 row)

$ psql -c "SELECT regexp_replace('aaa bbb 888 bbb ccc ddd','(?!(?!(.)\1)).','g');"   
ERROR:  invalid regular expression: invalid backreference number

What am I doing wrong?

`select string_agg(x[1],'') from regexp_matches('aaa bbb 888 111122222 1234 bbb ccc ddd','(.)\1*','g') t(x);` — Abelisto, Aug 16 '16 at 07:46
@Abelisto This returns `a b 8 12 1234 b c d`, which isn't exactly what the OP wanted, but I upvoted your comment because it is close. — Tim Biegeleisen, Aug 16 '16 at 07:49
@TimBiegeleisen It is because different input to test some more use cases. — Abelisto, Aug 16 '16 at 07:52
The OP wants only the first character of every word (contain it numbers or letters), at least that is how I read it. — Tim Biegeleisen, Aug 16 '16 at 07:55
@TimBiegeleisen To get the first character of every word (not just delete repeated characters): `select string_agg(x,'') from unnest(array(select regexp_matches('aaa bbb 888 111122222 1234, bbb ccc ddd','\m(\w)\w*(\W*)','g'))) t(x);` — Abelisto, Aug 16 '16 at 08:23
@TimBiegeleisen I am confused. Just after copy/paste to the psql it does not work. But after delete and reenter `(\W*)‌` part it works fine. — Abelisto, Aug 16 '16 at 08:36
@Abelisto - You should put this down as an answer. Your first comment perfectly solves the problem. — Addison, Nov 28 '16 at 05:35

score 5 · Answer 1 · edited May 23 '17 at 10:29

5

There's a similar SO question that can help you to get the answer:

SELECT regexp_replace('aaa bbb 888 bbb ccc ddd', '(.)\1{1,}', '\1', 'g');
 regexp_replace 
----------------
 a b 8 b c d
(1 row)

It uses a backreference to capture the groups of repeating characters.

edited May 23 '17 at 10:29

Community

1
1

answered Jan 22 '17 at 00:38

mgamba

1,189
8
10

Remove Consecutive Repeated Characters with PostgreSQL regexp_replace function

1 Answers1

Linked