9

My Hive table has a date column with UTC date strings. I want to get all rows for a specific EST date.

I am trying to do something like the below:

Select * 
from TableName T
where TO_DATE(ConvertToESTTimeZone(T.date))  = "2014-01-12" 

I want to know if there is a function for ConvertToESTTimeZone, or how I can achieve that?

I tried the following but it doesnt work (my default timezone is CST):

TO_DATE(from_utc_timestamp(T.Date) = "2014-01-12" 
TO_DATE( from_utc_timestamp(to_utc_timestamp (unix_timestamp (T.date), 'CST'),'EST'))

Thanks in advance.

Update:

Strange behavior. When I do this:

select "2014-01-12T15:53:00.000Z", TO_DATE(FROM_UTC_TIMESTAMP(UNIX_TIMESTAMP("2014-01-12T15:53:00.000Z", "yyyy-MM-dd'T'hh:mm:ss.SSS'Z'"), 'EST')) 
from TABLE_NAME T1
limit 3

I get

    _c0                          _c1
0   2014-01-12T15:53:00.000Z    1970-01-16
1   2014-01-12T15:53:00.000Z    1970-01-16
2   2014-01-12T15:53:00.000Z    1970-01-16
Gadam
  • 2,674
  • 8
  • 37
  • 56

2 Answers2

17

Your system timezone CST doesn't matter for converting UTC to EST in Hive. You should be able to get the correct results with:

TO_DATE(FROM_UTC_TIMESTAMP(UNIX_TIMESTAMP(T.date, "yyyy-MM-dd'T'hh:mm:ss.SSS'Z'") * 1000, 'EST'))

Note that because UNIX_TIMESTAMP returns seconds, you will lose the millisecond component of your timestamp.

Jeremy Beard
  • 2,727
  • 1
  • 20
  • 25
  • Thanks Jeremy, but it did not work. If it helps, an example for T.date is '2014-01-12T15:53:00.000Z'. When I put your suggestion in the where clause, it returns 0 records. If I put it in the select part, it gives NULL for that column. :| – Gadam Feb 16 '15 at 21:07
  • It looks like your date strings are not in the default format that Hive expects. Instead, you can pass a second argument to UNIX_TIMESTAMP that includes the date format of your field, as defined in http://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html. Also see here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions – Jeremy Beard Feb 16 '15 at 21:11
  • 1
    Aha yes, Hive assumes milliseconds when converting INT to TIMESTAMP, so we need to add a `* 1000` in there. I've updated my answer. – Jeremy Beard Feb 16 '15 at 23:14
  • That solved my issue. I am selecting this as the answer. But can you update your answer with a little more detail; is this an issue with Hive? When I am providing the milliseconds as part of my date, why do I have to again multiply by 1000 to convert the output to millis.. thanks. – Gadam Feb 17 '15 at 21:34
  • I've mentioned at the bottom of my answer that UNIX_TIMESTAMP in Hive returns seconds. Unfortunately this is just how Hive's built-in functions work, as disappointing as that is. – Jeremy Beard Feb 17 '15 at 21:37
  • There seems to be an error here. ETC is just another way of saying UTC. I think you mean EST. Running your code as is returns the input unchanged; replacing the 'ETC' in FROM_UTC_TIMESTAMP with 'EST' returns correct results. – bsg Dec 25 '15 at 03:15
1

This converts to CST with the daylight savings hour shift:

to_date(FROM_UTC_TIMESTAMP(UNIX_TIMESTAMP(eff_timestamp, "yyyy-MM-dd'T'hh:mm:ss.SSS'Z'") * 1000, 'CST6CDT')) 
Aaron
  • 2,367
  • 3
  • 25
  • 33
Carl
  • 81
  • 4