0

I need to split each row of data at an hour level given an hour difference. I start with the following data :

enter image description here

and i would like to output this result :

enter image description here

Tally is the number of times an event occurs in between the two dates, New_date would be the start of each segment and new_tally a division of the original tally following the proportion of time for each segment.

I have tried executing the sqlfiddle in this other question, but i get the following error:

ORA-01436: CONNECT BY loop in user data

Exact code from the fiddle:

    create table log_table
        ( dt1 date, dt2 date
        );

    insert into log_table values(sysdate,sysdate-5/24);

    select greatest(dt2, trunc(dt2+(level-1)/24, 'hh24')), least(dt1, trunc(dt2+(level)/24, 'hh24'))
    from log_table connect by level <= floor((dt1-dt2)*24)+1;

This might have to do with the fact that i am trying this in an Oracle 8i DB. I've also been trying to understand how hierarchical queries work but i find them really complicated.

Could someone give me an example and explain what is happening? Thanks in advance.

Nega.

Community
  • 1
  • 1
Negashion
  • 127
  • 1
  • 9
  • Can you add the code you tried; or a link to the Fiddle you got an error on, which presumably includes your data set-up and the query? – Alex Poole Nov 17 '15 at 16:43
  • This is the fiddle http://sqlfiddle.com/#!4/82625/29 .It is not my code though, it is the one in the question i linked. – Negashion Nov 17 '15 at 17:10
  • I don't get the ORA-01436 from that Fiddle, so I'm a bit confused. I can imagine how you might get it from trying to adapt it, but it would help if you showed the actual code you're running. I can sort of see what you're dong; but why don't you want a new_date at 16:00? And why is the last new_tally 0.91719 instead of 0.575525 - assuming they should add up to 7? – Alex Poole Nov 17 '15 at 17:26
  • Yes,sorry for the misscalculation. It should add up to 7, each entry should have the tally value proportionate to the minutes for that segment. I edited the question with the exact code. – Negashion Nov 17 '15 at 17:46
  • Are you saying you get ORA-01436 from exactly that code in 8i? Might be a red-herring though. Still not sure why you don't have a 16:00 row; without it the gap from 15:00 to 16:23 is going to be the biggest period, is that correct? – Alex Poole Nov 17 '15 at 18:15
  • There only needs to be a row/segment for each hour, be it 1 full hour or just a couple of minutes, so if the interval extends from hour 11 to 16 it would only be 6 entries. This way the new tally can be calculated on an hour basis. – Negashion Nov 18 '15 at 11:08
  • It made more sense when I split the start and end of each period out, rather than merging them into a single `new_date`; that's what was confusing me, as it's sometimes the start of the split period and sometimes the end. (I was also confusing the tally with the number of rows expected for a while!). I got it out as six periods, anyway, and calculated the new split tally. Support for 8i may be an issue though. (Why are you still using such an ancient version?) – Alex Poole Nov 18 '15 at 11:15

1 Answers1

1

I'm not sure how happy 8i will be with this, but you can split the records up into hour (or partial hour) chunks with a hierarchical query like this:

select firstoccurrence, lastoccurrence, tally,
  greatest(firstoccurrence,
    trunc(firstoccurrence, 'HH24') + (level - 1) / 24) as new_start,
  least(lastoccurrence,
    trunc(firstoccurrence, 'HH24') + level / 24) as new_end
from t42
connect by firstoccurrence = prior firstoccurrence
and prior sys_guid() is not null
and trunc(firstoccurrence, 'HH24') + (level - 1) / 24
  <= trunc(lastoccurrence, 'HH24');

... where t42 is a table populated as in your first image. That gives:

FIRSTOCCURRENCE     LASTOCCURRENCE      TALLY NEW_START           NEW_END           
------------------- ------------------- ----- ------------------- -------------------
2014-12-04 11:33:16 2014-12-04 11:33:36     1 2014-12-04 11:33:16 2014-12-04 11:33:36
2014-12-30 11:41:46 2014-12-30 16:23:08     7 2014-12-30 11:41:46 2014-12-30 12:00:00
2014-12-30 11:41:46 2014-12-30 16:23:08     7 2014-12-30 12:00:00 2014-12-30 13:00:00
2014-12-30 11:41:46 2014-12-30 16:23:08     7 2014-12-30 13:00:00 2014-12-30 14:00:00
2014-12-30 11:41:46 2014-12-30 16:23:08     7 2014-12-30 14:00:00 2014-12-30 15:00:00
2014-12-30 11:41:46 2014-12-30 16:23:08     7 2014-12-30 15:00:00 2014-12-30 16:00:00
2014-12-30 11:41:46 2014-12-30 16:23:08     7 2014-12-30 16:00:00 2014-12-30 16:23:08

You can then use that as a CTE, or as I don't think 8i knows about CTEs an inline view, and calculate the difference between each time and it's proportion against the original range, and divide the tally by that:

select firstoccurrence, lastoccurrence, tally, new_start, new_end,
  to_number(to_char(new_start, 'HH24')) as new_hour,
  tally / ((lastoccurrence - firstoccurrence) / (new_end - new_start)) as new_tally
from (
  select firstoccurrence, lastoccurrence, tally,
    greatest(firstoccurrence,
      trunc(firstoccurrence, 'HH24') + (level - 1) / 24) as new_start,
    least(lastoccurrence,
      trunc(firstoccurrence, 'HH24') + level / 24) as new_end
  from t42
  connect by firstoccurrence = prior firstoccurrence
  and prior sys_guid() is not null
  and trunc(firstoccurrence, 'HH24') + (level - 1) / 24
    <= trunc(lastoccurrence, 'HH24')
);

FIRSTOCCURRENCE     LASTOCCURRENCE      TALLY NEW_START           NEW_END             NEW_HOUR   NEW_TALLY
------------------- ------------------- ----- ------------------- ------------------- -------- -----------
2014-12-04 11:33:16 2014-12-04 11:33:36     1 2014-12-04 11:33:16 2014-12-04 11:33:36       11    1.000000
2014-12-30 11:41:46 2014-12-30 16:23:08     7 2014-12-30 11:41:46 2014-12-30 12:00:00       11     .453619
2014-12-30 11:41:46 2014-12-30 16:23:08     7 2014-12-30 12:00:00 2014-12-30 13:00:00       12    1.492714
2014-12-30 11:41:46 2014-12-30 16:23:08     7 2014-12-30 13:00:00 2014-12-30 14:00:00       13    1.492714
2014-12-30 11:41:46 2014-12-30 16:23:08     7 2014-12-30 14:00:00 2014-12-30 15:00:00       14    1.492714
2014-12-30 11:41:46 2014-12-30 16:23:08     7 2014-12-30 15:00:00 2014-12-30 16:00:00       15    1.492714
2014-12-30 11:41:46 2014-12-30 16:23:08     7 2014-12-30 16:00:00 2014-12-30 16:23:08       16     .575524

SQL Fiddle demo; but that's 11gR2 so doesn't mean it'll work in 8i. I don't have such an ancient version to test on unfortunately.

If your table has an ID column then use that for the connect by prior clause - I've had to use firstoccurrence and hope it's unique. You may also need to use dbms_random.value instead of sys_guid() - an non-deterministic function will do though.

Alex Poole
  • 183,384
  • 11
  • 179
  • 318