0

I have following data :

a.tb_name   a.start_time    a.end_time  a.createdon total_time_min  a.modifiedon    a.byuser
_2a 2018-02-15 13:17:04.795              2/15/2018          smt
_2a                         2018-02-15 13:19:11.11      2/15/2018           smt

start & endtime are in strings ( I can use any datatype) . The output I need is total time in minutes:

a.tb_name   a.start_time    a.end_time               a.createdon    total_time_min  a.modifiedon    a.byuser
_2a 2018-02-15 13:17:04.795                          2/15/2018           smt
_2a                         2018-02-15 13:19:11.11   2/15/2018           smt
_2a 2018-02-15 13:17:04.795 2018-02-15 13:19:11.11   2/15/2018         2 smt       

I tried this but as both the time data is in two diffrent columns I am unable to get the result.

select  minute(cast(end_time as TIMESTAMP) - cast(start_time as TIMESTAMP)) from a where tb_name like "_2a";
DrSD
  • 151
  • 2
  • 12
  • have you tried this? https://stackoverflow.com/questions/33593080/how-to-get-date-difference-in-minutes-using-hive – jose_bacoy Feb 15 '18 at 20:53
  • Row 1 has starttime & row 2 has end time , I need difference between end & start time, hence stuck ! – DrSD Feb 15 '18 at 21:00
  • DrSD; are the records come in pairs? Thanks. – jose_bacoy Feb 15 '18 at 21:33
  • No they dont come in pairs, 1st starttime is recorded n then endtime, but if we could self join and get them in pairs then I guess we can do something – DrSD Feb 15 '18 at 22:21

1 Answers1

0

i used the windowing functions max, min and row_number. Hope it helps. Thanks.

Query:
select tb_name,
start_time,
end_time,
'' as total_time_min,
createdon,
byuser 
from t2
union all
select tb_name,
max_start_time as start_time,
max_end_time as end_time,
cast((minute(cast(max_end_time as TIMESTAMP) - cast(max_start_time as TIMESTAMP))) as string) as total_time_min,
createdon,
byuser 
from (
select tb_name, 
start_time,
end_time,
max(start_time) over (partition by tb_name) as max_start_time,
max(end_time) over (partition by tb_name) as max_end_time,
row_number() over (partition by tb_name) as rnk,
createdon,
byuser 
from t2
) t1 where t1.rnk=1;

Result:
tb_name start_time              end_time            total_time_min  createdon       byuser
_2a     2018-02-15 13:17:04.795                                     20180215        smt
_2a                             2018-02-15 13:19:11.11              20180215        smt
_2a     2018-02-15 13:17:04.795 2018-02-15 13:19:11.11  2           20180215        smt
jose_bacoy
  • 12,227
  • 1
  • 20
  • 38
  • Thanks a ton !! ... This seems to be right, the issue i face is with -cast((minute(cast(max_end_time as TIMESTAMP) - cast(max_start_time as TIMESTAMP))) as string) as total_time_min, I get error: Error while compiling statement: FAILED: SemanticException line 0:undefined:-1 Wrong arguments 'max_start_time': No matching method for class org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPMinus with (timestamp, timestamp) – DrSD Feb 16 '18 at 02:16
  • Can you please share what was the datatype of the starttime & endtime you used? – DrSD Feb 16 '18 at 15:30
  • start & endtime are in strings since it is the datatype that you gave in the example. Thanks. – jose_bacoy Feb 16 '18 at 15:40
  • OK, I have used string as well but I get an error. - am using HUe 4.0 . Error while compiling statement: FAILED: SemanticException line 0:undefined:-1 Wrong arguments 'max_start_time': No matching method for class org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPMinus with (timestamp, timestamp) – DrSD Feb 16 '18 at 16:43
  • done , i used this and i got the correct output !! concat(cast(floor((max_end_time - max_start_time) / 60 / 60) as string), from_unixtime(max_end_time - max_start_time, ':mm:ss')) – DrSD Feb 16 '18 at 18:11