map/sample timeseries data to another timeserie db2

Question

I am trying to combine the results of two SQL (DB2 on IBM bluemix) queries:

The first query creates a timeserie from startdate to enddate:

with dummy(minute) as (
      select TIMESTAMP('2017-01-01')
      from SYSIBM.SYSDUMMY1 union all
      select minute + 1 MINUTES
      from dummy
      where minute <= TIMESTAMP('2018-01-01')
     )
select to_char(minute, 'DD.MM.YYYY HH24:MI') AS minute
from dummy;

The result is looking like this

The second query selects data from a table which have a timestamp. This data should be joined to the generated timeseries above. The standalone query is like:

SELECT DISTINCT
  to_char(date_trunc('minute', TIMESTAMP), 'DD.MM.YYYY HH24:MI') AS minute,
  VALUE AS running_ct
FROM TEST
WHERE ID = 'abc'
AND NAME = 'sensor'
ORDER BY minute ASC;

The result is looking like this

What I suppose to get is a query with one result with contains of two columns:

first column with the timestamp from startdate to enddate and
the second with values which are sorted by there own timestamps to the
first column (empty timestamps=null).

How could I do that?

score 0 · Answer 1 · answered Dec 21 '17 at 14:30

found a working solution:

    with dummy(temporaer) as (
     select TIMESTAMP('2017-12-01') from SYSIBM.SYSDUMMY1
     union all
     select temporaer + 1 MINUTES from dummy where temporaer <= TIMESTAMP('2018-01-31'))
    select temporaer, avg(VALUE) as text from dummy
    LEFT OUTER JOIN TEST ON temporaer=date_trunc('minute', TIMESTAMP) and ID='abc' and NAME='text'
    group by temporaer
    ORDER BY temporaer ASC;

cheers

score 0 · Accepted Answer · answered Dec 21 '17 at 23:08

0

A better solution, especially if your detail table is large, is to generate a range. This allows the optimizer to use indices to fulfill the bucketing, instead of calling a function on every row (which is expensive).

So something like this:

WITH dummy(temporaer, rangeEnd) AS (SELECT a, a + 1 MINUTE 
                                    FROM (VALUES(TIMESTAMP('2017-12-01'))) D(a)
                                    UNION ALL
                                    SELECT rangeEnd, rangeEnd + 1 MINUTE
                                    FROM dummy
                                    WHERE rangeEnd < TIMESTAMP('2018-01-31'))
SELECT Dummy.temporaer, AVG(Test.value) AS TEXT
FROM Dummy
LEFT OUTER JOIN Test
             ON Test.timestamp >= Dummy.temporaer
                AND Test.timestamp < Dummy.rangeEnd
                AND Test.id = 'abc'
                AND Test.name = 'text'
GROUP BY Dummy.temporaer
ORDER BY Dummy.temporaer ASC;

Note that the end of the range is now exclusive, not inclusive like you had it before: you were including the very first minute of '2018-01-31', which is probably not what you wanted. Of course, excluding just the last day of a month also strikes me as a little strange - you most likely really want < TIMESTAMP('2018-02-01').

answered Dec 21 '17 at 23:08

Clockwork-Muse

12,806
6
31
45

hey. Great job. Thank you! Your solution works fine! Actually your code is 0.1s slower than mine above but I think this could change if the data in table are lager than at the moment (as you said it). – eid Dec 22 '17 at 08:05
yes your right: what I really want is "< TIMESTAMP('2018-02-01')" but this problem is at least not so difficult to manage :) – eid Dec 22 '17 at 08:07
is it also possible to have a mulitple join on thant? like first join 'text', scond join 'text2', 'text3' and so on? – eid Dec 22 '17 at 11:53
@eid - it's possible that the .1s is just due to normal fluctuations in runtime (especially if that was the first time you ran the statement, when the optimizer first has to prepare the statement and doesn't yet have a cached plan). And yes, you can have multiple joins, but you have an aggregate - if each join has multiple rows you're going to multiply the number of rows together (known as a Cartesian product), which would throw off the average. – Clockwork-Muse Dec 22 '17 at 16:53
thank you Clockwork-Muse! Please have also a look to the new topic from me where you can see the whole workflow I am on (perhaps you have a solution in mind?): https://stackoverflow.com/questions/48058500/complex-db2-sql-query-with-time-sampling-group-map-join-and-csv-export – eid Jan 02 '18 at 09:35

map/sample timeseries data to another timeserie db2

2 Answers2