1

I'm still learning SQL and I found a couple of solutions on SQL Server or Postgreы, but it doesn't seen to work on HUE DATEDIFF, only allows me to calculate difference between days seconds, minutes are not available. Help is very welcome.

I was able to split the timestamp with substring_index, but then I can't find the right approach to compare and subtract start_time to end_time in order to obtain the accurate account of seconds. I can't find time functions so I'm assuming I should calculate it based on timestamp. obtained as

from_unixtime(unix_timestamp(start_time, "yyyy-MM-dd'T'HH:mm:ss.SSSSSS"), 'yyyy-MM-dd HH:mm:ss')


substring_index(start_time, 'T', -1)s_tm,
substring_index(end_time, 'T', -1)e_tm


start_date 2018-06-19 13:59:41  
end_date   2018-06-19 14:01:17

desired output

01:36

leftjoin
  • 36,950
  • 8
  • 57
  • 116

1 Answers1

0

Solution for Hive.

Difference in seconds:

select UNIX_TIMESTAMP('2018-06-19T14:01:17.000000',"yyyy-MM-dd'T'HH:mm:ss.SSSSSS")-
   UNIX_TIMESTAMP('2018-06-19T13:59:41.000000',"yyyy-MM-dd'T'HH:mm:ss.SSSSSS") as seconds_diff

Result:

96

Now calculate difference in HH:mm:ss:

select concat_ws(':',lpad(floor(seconds_diff/3600),2,'0'),        --HH
                     lpad(floor(seconds_diff%3600/60),2,'0'),     --mm
                     lpad(floor(seconds_diff%3600%60),2,'0')      --ss
       )

from
(
select --calculate seconds difference
       UNIX_TIMESTAMP('2018-06-19T14:01:17.000000',"yyyy-MM-dd'T'HH:mm:ss.SSSSSS")-
       UNIX_TIMESTAMP('2018-06-19T13:59:41.000000',"yyyy-MM-dd'T'HH:mm:ss.SSSSSS") as seconds_diff
) s

Result:

OK
00:01:36
Time taken: 1.071 seconds, Fetched: 1 row(s)

See also this answer about format convertion: https://stackoverflow.com/a/23520257/2700344

leftjoin
  • 36,950
  • 8
  • 57
  • 116
  • Thanks, but in that way I get to all lines the same result and if I try to replace the date to the column_name as below is doesn't work. Not sure if it is something obvious that I'm missing ... from_unixtime(UNIX_TIMESTAMP(end_time)-UNIX_TIMESTAMP(start_time),'mm:ss') – user11925123 Aug 14 '19 at 15:21
  • @user11925123 if it is NULL then most probably the format is not like in your example : 2018-06-19 13:59:41. This is why I gave you the link to the answer about format convertion. All you need is to specify correct format to the UNIX_TIMESTAMP as a second parameter – leftjoin Aug 14 '19 at 15:29
  • Hey, the original format is "yyyy-MM-dd'T'HH:mm:ss.SSSSSS". When I use from_unixtime(unix_timestamp(end_time, "yyyy-MM-dd'T'HH:mm:ss.SSSSSS"), 'mm')-from_unixtime(unix_timestamp(start_time, "yyyy-MM-dd'T'HH:mm:ss.SSSSSS"), 'mm') it works if the end and starting hour is the same, however when the end hour is like 17:16:04 and starting hour is 16:58:05 the subtraction is incorrect. If I try using with HH:MM instead of just MM, then I get null. Any tips? Thanks in advance. – user11925123 Aug 17 '19 at 14:11
  • @user11925123 Yeah, format conversion works weird. Use explicit math. – leftjoin Aug 17 '19 at 14:39
  • @substract seconds, then calculate HH, mm, ss. See fixed answer – leftjoin Aug 17 '19 at 14:43
  • @user11925123 You are wellcome! Please do not forget to accept/vote – leftjoin Aug 17 '19 at 15:37
  • Hello again leftjoin, I´m having issues to obtain aggregated results as the output is string and I think I need an INT. I tried converting it to int, but them I get the hour,minutes and seconds split in different columns again. I could use just minutes, but them again for the avg I would need it all in one. Sorry for asking, but I just can't see my way out of it I've trying for some time. Is there any tip you could give as how to make it easier as to work with the elapsed time as to calculated median, avg , etc? Many thanks in advance! – user11925123 Aug 18 '19 at 13:51
  • @user11925123 Use difference in seconds (see first query), aggregate it as you want and after aggregation convert to HH:mm:ss string – leftjoin Aug 18 '19 at 19:38