3

I have a table that has the column hour that is not in the right timestamp format:

        hour         
---------------------
 2014-12-06-01:44:35
 2014-12-06-01:44:35
 2014-12-06-01:44:35
 2014-12-06-01:44:35
 2014-12-06-01:44:35
 2014-12-06-01:44:35

I want to replace - with ' 'in the middle, so it would look like this:

        hour         
---------------------
 2014-12-06 01:44:35
 2014-12-06 01:44:35
 2014-12-06 01:44:35
 2014-12-06 01:44:35
 2014-12-06 01:44:35
 2014-12-06 01:44:35

I've tried this:

select regexp_replace(hour, '\d{4}-\d{2}-\d{2}(-)\d{2}:\d{2}:\d{2}', ' ')
from my_table;

But unfortunately it doesn't make any difference.
I'm using Redshift, not PostgreSQL.

Zeki
  • 5,107
  • 1
  • 20
  • 27
Vor
  • 33,215
  • 43
  • 135
  • 193

3 Answers3

4

There are a few things to think about when creating Regex expressions in postgresql. First, you need to escape backslashes. So instead of \, you need \\. Second, you can reference capturing groups by using \n to capture the nth group in parentheses. You can use the capturing group in your replace value to reference parts of the original string.

Putting all of this together, this is what you need:

regexp_replace('hour', '(\\d{4}-\\d{2}-\\d{2})-(\\d{2}:\\d{2}:\\d{2})', '\\1 \\2') 
Zeki
  • 5,107
  • 1
  • 20
  • 27
3

Limited options in Redshift

regexp_replace(hour, '(^\\d{4}-\\d{2}-\\d{2})-(\\d{2}:\\d{2}:\\d{2}$)', '\\1 \\2') AS a
regexp_replace(hour, '(^\\d{4}-\\d\\d-\\d\\d)-(\\d\\d:\\d\\d:\\d\\d)$', '\\1 \\2') AS b
regexp_replace(hour, '(^[\\d-]{10})-([\\d:]+)$', '\\1 \\2')                        AS c
left(hour,10) || ' ' ||  substring(hour FROM 12)                           AS e

More options in modern Postgres (9.1+)

regexp_replace(hour, '(^\d{4}-\d{2}-\d{2})-(\d{2}:\d{2}:\d{2}$)', '\1 \2') AS a
regexp_replace(hour, '(^\d{4}-\d\d-\d\d)-(\d\d:\d\d:\d\d)$', '\1 \2')      AS b
regexp_replace(hour, '(^[\d-]{10})-([\d:]+)$', '\1 \2')                    AS c
reverse(regexp_replace(reverse(hour), '-', ' '))                           AS d
left(hour,10) || ' ' ||  right(hour, -11)                                  AS e
overlay(hour placing ' ' from 11)                                          AS f
to_timestamp(hour, 'YYYY-MM-DD-HH24:MI:SS')                                AS ts

SQL Fiddle.

From "restrictive" to "cheap" in order of appearance. ts is special.

a

That's like the currently accepted answer by @Zeki, completed with anchors at start and end with ^ and $ to make it even less ambiguous and potentially faster.

You want the special meaning of \d as class shorthand for digits.
In Postgres, do not escape backslashes \ with \\. That would be incorrect unless you are running with the long outdated, non-default setting standard_conforming_strings = off.
Redshift is stuck at an old stage of development and does just that. Backslashes are interpreted unless escaped with another backslash.

b

\d\d is shorter and cheaper than \d{2}.

c

Simplify with a character classes: digits + hyphen: [\d-] and digits + colon: [\d:].

d

Since regexp_replace() without 4th parameter 'g' only replaces the first match, you can reverse() the string, replace the first hyphen and reverse() back.
Doesn't work in Redshift, since it uses a simpler version of regexp_replace() that always replaces all occurrences.

e

If the format is fixed as shown, just take the first 10 character, a blank and the rest of the string.
Redshift uses simpler versions of left() and right() that don't accept negative parameters, so I substituted with substring().

f

Or, simpler yet, just overlay() the 11th character with a blank.
Not implemented in Redshift.

ts

Unlike the rest, to_timestamp() returns a proper timestamp with time zone type, not text. You can assign the result to timestamp without time zone just as well. Details.. By far the best option if you want to convert your string. Not implemented in Redshift.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Unfortunately non of the queries gave me the right answer. First 3 (`a`, `b`, `c` ) simply didn't change the result. 4'th (`d`) - removed all the `-` so the timestamp looked like this `2014 12 06 01:44:35`. 5-th (`e`) returned an error `context: Negative value given!`. And the last one (`f`) returned, that it doesn't exist: `function pg_catalog.overlay(character varying, "unknown", integer) does not exist`. – Vor Dec 10 '14 at 14:29
  • @Vor: Ah well, sorry, more limitations of Redshift. As you can see in the fiddle it all works in modern Postgres. I added some working versions for Redshift and explanation. – Erwin Brandstetter Dec 10 '14 at 15:35
2
(\d{4}-\d{2}-\d{2})-(\d{2}:\d{2}:\d{2})

Try this.Replace by $1 $2.See demo.

https://regex101.com/r/iY3eK8/13

vks
  • 67,027
  • 10
  • 91
  • 124
  • They output is exactly what I'm looking for. But can you guide me through what should I change in my sql command ( beside the regex part ) – Vor Dec 09 '14 at 15:23
  • @Vor beside regex change the replacement string.`$1 $2` or `\1 \2` – vks Dec 09 '14 at 15:24
  • I've tryed `regexp_replace(hour, '(\d{4}-\d{2}-\d{2})-(\d{2}:\d{2}:\d{2})', '\1 \2')` and `regexp_replace(hour, '(\d{4}-\d{2}-\d{2})-(\d{2}:\d{2}:\d{2})', '$1 $2')` but it still doesn't give me the right result. – Vor Dec 09 '14 at 15:29
  • Thank you very much for your help, your answer was correct, but like @Zeki suggested, I had to escape it backslashes. – Vor Dec 09 '14 at 15:32