0

how can I transform a string column that has date and epoch format values into a date column? I'm working on a postgresql / Tableau prep environment.

Please find attached the example here

OK this query works:

SELECT case 
WHEN len(d.properties__renewal_date__value) > 10 
THEN (timestamp 'epoch' + CAST(NULLIF(d.properties__renewal_date__value, '') AS BIGINT)/1000 * interval '1 second')
ELSE to_date(d.properties__renewal_date__value, 'DD/MM/YYYY') end as renewal_date
  • Please copy and paste the values into your question instead of using a linked image. I would say the way to go about this is to create a new `date` column and transform the values in the existing column into it. The date formats could be done as `DD/MM/YYYY::date`. The epoch as `to_timestamp ( epoch_value)::date`. Distinguishing them would probably need to be done with regex, which I'm not up to. – Adrian Klaver Nov 01 '20 at 17:14

1 Answers1

0

Here is an encoding of @AdrianKlaver's comment. First determine which is the type of the current value with a regexp match and then transform accordingly.

select case 
   when "renewal date" ~ '^\d+$' then to_timestamp("renewal date"::double precision)::date
   else to_date("renewal date", 'dd/mm/yyyy')
 end as renewal_date
 from _table;

Redshift flavour, epoch value in milliseconds:

select case 
  when "renewal date" ~ '^\d+$' then 
    ('1970-01-01'::date + ("renewal date"::double precision/1000) * '1 second'::interval)::date
  else 
    to_date("renewal date", 'dd/mm/yyyy')
  end as renewal_date
  from _table;
Stefanov.sm
  • 11,215
  • 2
  • 21
  • 21
  • That is cool. The `to_date` for the date formatted values is unnecessary though. You can just do `"renewal date"::date;`. – Adrian Klaver Nov 01 '20 at 17:52
  • @AdrianKlaver Yes, you are right. I prefer to_date with format though because it is a bit safer as it does not depend on `SET DATESTYLE` configuration. – Stefanov.sm Nov 01 '20 at 17:58
  • That is true enough. – Adrian Klaver Nov 01 '20 at 18:01
  • I tried the above code and I get this error: Invalid operation: function to_timestamp(numeric) does not exist; Do you know why? – Giuseppe Giobbe Nov 01 '20 at 19:33
  • My omission. The function definition is `to_timestamp(double precision)` or no cast at all, not numeric. Pls. use `to_timestamp("renewal date"::double precision)::date`. I have noticed though that 1553817600000 is a *very large* value for epoch and yields 51208-07-09. `to_timestamp("renewal date"::double precision/1000)::date` yields 21-century dates. Maybe epoch values are in milliseconds? – Stefanov.sm Nov 01 '20 at 19:51
  • Should be *to_timestamp("renewal date"::bigint)::date*, The problem is the precision (seconds, microseconds, ...) and the base date for the epoch (std unix 1970-01-01, but also common (that I have seen anyway) 1900-01-01. You must get both of these. – Belayer Nov 01 '20 at 20:02
  • I get the same error -- Invalid operation: function to_timestamp (bigint) does not exist; -- with both bigint and double precision. Could it be because "renewal date" column is a string column? – Giuseppe Giobbe Nov 01 '20 at 20:12
  • No, do not use bigint. The [documentation](https://www.postgresql.org/docs/10/functions-datetime.html#FUNCTIONS-DATETIME-TABLE) is clear - `double precision`, seconds since 1970-01-01 00:00:00+00. Could you paste your code? – Stefanov.sm Nov 01 '20 at 20:28
  • my code: `SELECT case when d.properties__renewal_date__value ~ '^\d+$' then to_timestamp(d.properties__renewal_date__value::double precision)::date else to_date(d.properties__renewal_date__value, 'dd/mm/yyyy') end as renewal_date FROM hubspot.deals d`. And I get this error "SQL Error [500310] [42883]: [Amazon](500310) Invalid operation: function to_timestamp(double precision) does not exist;" – Giuseppe Giobbe Nov 01 '20 at 20:38
  • This query is perfectly fine for recent versions of Postgresql. What database/version are you using? – Stefanov.sm Nov 01 '20 at 20:45
  • I have postgresql version 12 and I am querying Redshift from DBeaver – Giuseppe Giobbe Nov 01 '20 at 20:50
  • Well, that should be it. [Redshift](https://docs.aws.amazon.com/redshift/latest/dg/c_redshift-and-postgres-sql.html) says 'Amazon Redshift and PostgreSQL have a number of very important differences that you must be aware of'. See [this](https://stackoverflow.com/questions/39815425/how-to-convert-epoch-to-datetime-redshift) – Stefanov.sm Nov 01 '20 at 20:56
  • I have edited my answer to include Redshift language flavour. You must tag your question 'Redshift' though. – Stefanov.sm Nov 01 '20 at 21:08
  • ok, almost there. I tried your query `select case when d.properties__renewal_date__value ~ '^\d+$' then ('1970-01-01'::date + (d.properties__renewal_date__value::double precision/1000) * '1 second'::interval)::date else to_date(d.properties__renewal_date__value, 'dd/mm/yyyy') end as renewal_date FROM hubspot.deals d` But I get [Amazon](500310) Invalid operation: value for "dd" in source string is out of range; What do you think is the problem? – Giuseppe Giobbe Nov 01 '20 at 21:16
  • Invalid date string in `properties__renewal_date__value` field probably – Stefanov.sm Nov 01 '20 at 21:25
  • Use a `where` clause in your query. – Stefanov.sm Nov 01 '20 at 21:40
  • Hi Stefanov, I haven't fully understood what the problem is. How should I use `where` in your query? – Giuseppe Giobbe Nov 01 '20 at 22:01