0

I have some data in SQLite files, and want to manipulate it from C#.

The data includes three columns, Time, Temperature and Power. And I want to find the groups of records which have a Temperature higher than 30, and last more than 3 minutes, and calculate the mean of power of each group (continuous time range).

Select continuous ranges from table

How to Select continuous date in sql

I find two answers above, but can't reach the answer of mine.

It can be get the results directly with a SQL string (for SQLiteDataAdapter). or select all data and manipulate it with c# (datatable.Select,Computer, or LINQ).

| No   | Time              | Temp | Power |
| ---- | ----------------- | ---- | ----- |
| 1    | 2019-11-4 0:00:00 | 25   | 1200  |
| 2    | 2019-11-4 0:01:10 | 30   | 1000  |
| 3    | 2019-11-4 0:02:20 | 31   | 680   |
| 4    | 2019-11-4 0:03:30 | 34   | 960   |
| 5    | 2019-11-4 0:04:40 | 29   | 800   |
| 6    | 2019-11-4 0:05:50 | 31   | 600   |
| 7    | 2019-11-4 0:07:00 | 32   | 400   |
| 8    | 2019-11-4 0:08:10 | 33   | 900   |
| 9    | 2019-11-4 0:09:20 | 34   | 1000  |
| 10   | 2019-11-4 0:10:30 | 39   | 200   |
| 11   | 2019-11-4 0:11:40 | 24   | 350   |

Expected results: select the continuous records when Temp more than 30 and continuous time more than 3 minutes as a group, and calculate the mean of Power.

One group in the example table including No from 6 to 10. And the mean is 620.


Thanks to all.The codes used in my works:

        string sql = "WITH CTE AS (" +
                     "SELECT(Temp > 30) AS tempgt30,Time, Power," +
                            "ROW_NUMBER() OVER(ORDER BY No) rn," +
                            "ROW_NUMBER() OVER(PARTITION BY(Temp > 30) ORDER BY No) rn2" +
                     " FROM data" +
                     ")" +
                    " SELECT MIN(rn)AS minrow,MAX(rn) as maxrow," +
                            "strftime('%s', MAX(Time)) - strftime('%s', MIN(Time)) AS duration," +
                            "AVG(Power) AS mean" +
                    " FROM CTE" +
                    " WHERE tempgt30" +
                    " GROUP BY(rn2 - rn)" +
                    " HAVING duration > 180;";

        con.Open();

        if (con.State == ConnectionState.Open)
        {
            using (SQLiteDataAdapter dataAdapter = new SQLiteDataAdapter(sql, con))
            {
                dataAdapter.Fill(datatable);
                con.Close();
            }
        }
Jack Hsueh
  • 115
  • 8

2 Answers2

3

You were looking at the correct questions as this is indeed a gaps and islands problem. In your case the "islands" are the readings where Temp > 30. We can form them and calculate the mean Power and duration of the island (in seconds) using this query:

WITH CTE AS (
SELECT (Temp > 30) AS tempgt30, 
       Time,
       Power,
       ROW_NUMBER() OVER (ORDER BY No) rn,
       ROW_NUMBER() OVER (PARTITION BY (Temp > 30) ORDER BY No) rn2
FROM data
)
SELECT MIN(rn) AS minrow, 
       MAX(rn) as maxrow, 
       MIN(Time) AS start,
       MAX(Time) AS end,
       strftime('%s', MAX(Time)) - strftime('%s', MIN(Time)) AS duration,
       AVG(Power) AS mean
FROM CTE
WHERE tempgt30
GROUP BY (rn2 - rn)
HAVING duration > 180

Output:

minrow  maxrow  start                   end                     duration    mean
6       10      2019-11-04 00:05:50     2019-11-04 00:10:30     280         620

Demo on dbfiddle

Nick
  • 138,499
  • 22
  • 57
  • 95
  • 1
    the problem with this is 3-4 didnt run more than 3 mins – Gabriel Llorico Nov 04 '19 at 04:20
  • `MAX(Time) OVER (PARTITION BY Time AND (Temp > 30) ORDER BY No) maxTime,` is easy enough to add to the `CLE` (along with min).. but I have no idea how to handle DateDiff and comparisons in SqlLite (though I think `SQLiteDataAdapter` is probably an aspect of that` – Brett Caswell Nov 04 '19 at 04:50
  • @GabrielLlorico I have added a check for the duration of the period being longer than 3 minutes. – Nick Nov 04 '19 at 05:04
  • 1
    @BrettCaswell you can use `strftime('%s', date)` to get a unix timestamp – Nick Nov 04 '19 at 05:09
  • thanks.. your solution is quite good here.. is `MIN(Time)` the same as `Time` in the select from `CTE`? I realized after my comment above that `minTime` based on partitioning was not going to work continuously. – Brett Caswell Nov 04 '19 at 05:19
  • @BrettCaswell no - if you were to `SELECT * FROM CTE` it has all the rows from the original table in it. So you need to get `MIN(Time)` while grouping over the island number. – Nick Nov 04 '19 at 05:21
1

I recommand nick's answer.

and below script is sql server version :

you can use temptable + cursor to do it.

logic :

  • select into to create temptable and filter Temp > 30
  • lag function get PreviousNo
  • run corsor to check if PreviousNo = CurrentNo - 1 then put into same group
  • group having time minutes diff > 3 to get data;

CREATE TABLE T
  ([No] int, [Time] datetime, [Temp] int, [Power] int)
;

INSERT INTO T
  ([No], [Time], [Temp], [Power])
VALUES
  (1, '2019-11-04 00:00:00', 25, 1200),
  (2, '2019-11-04 00:01:10', 30, 1000),
  (3, '2019-11-04 00:02:20', 31, 680),
  (4, '2019-11-04 00:03:30', 34, 960),
  (5, '2019-11-04 00:04:40', 29, 800),
  (6, '2019-11-04 00:05:50', 31, 600),
  (7, '2019-11-04 00:07:00', 32, 400),
  (8, '2019-11-04 00:08:10', 33, 900),
  (9, '2019-11-04 00:09:20', 34, 1000),
  (10, '2019-11-04 00:10:30', 39, 200),
  (11, '2019-11-04 00:11:40', 24, 350)
;

GO
11 rows affected
with cte as (
  select 
   LAG(No, 1,0) OVER (ORDER BY No) AS PreviousNo,
   null AS GroupId,
   *
  from T
  where Temp > 30 
)
select * into #T from cte;
GO
7 rows affected
select * from #T
GO
PreviousNo | GroupId | No | Time                | Temp | Power
---------: | ------: | -: | :------------------ | ---: | ----:
         0 |    null |  3 | 04/11/2019 00:02:20 |   31 |   680
         3 |    null |  4 | 04/11/2019 00:03:30 |   34 |   960
         4 |    null |  6 | 04/11/2019 00:05:50 |   31 |   600
         6 |    null |  7 | 04/11/2019 00:07:00 |   32 |   400
         7 |    null |  8 | 04/11/2019 00:08:10 |   33 |   900
         8 |    null |  9 | 04/11/2019 00:09:20 |   34 |  1000
         9 |    null | 10 | 04/11/2019 00:10:30 |   39 |   200
DECLARE @tmp_no int,@groupId int = 1;

DECLARE MY_CURSOR CURSOR 
  LOCAL STATIC READ_ONLY FORWARD_ONLY
FOR 
SELECT No 
FROM #T

OPEN MY_CURSOR
FETCH NEXT FROM MY_CURSOR INTO @tmp_no
WHILE @@FETCH_STATUS = 0
BEGIN 
    if not (( select PreviousNo from #T where No = @tmp_no ) = @tmp_no - 1)  begin
       set @groupId = @groupId + 1;
    end
    update #T set groupId = @groupId where no = @tmp_no;
    FETCH NEXT FROM MY_CURSOR INTO @tmp_no
END
CLOSE MY_CURSOR
DEALLOCATE MY_CURSOR
GO
7 rows affected
select * from #T
GO
PreviousNo | GroupId | No | Time                | Temp | Power
---------: | ------: | -: | :------------------ | ---: | ----:
         0 |       2 |  3 | 04/11/2019 00:02:20 |   31 |   680
         3 |       2 |  4 | 04/11/2019 00:03:30 |   34 |   960
         4 |       3 |  6 | 04/11/2019 00:05:50 |   31 |   600
         6 |       3 |  7 | 04/11/2019 00:07:00 |   32 |   400
         7 |       3 |  8 | 04/11/2019 00:08:10 |   33 |   900
         8 |       3 |  9 | 04/11/2019 00:09:20 |   34 |  1000
         9 |       3 | 10 | 04/11/2019 00:10:30 |   39 |   200
select 
   convert(varchar(3),min(no)) + ' to ' + convert(varchar(3),max(no)) as no_range
   ,avg(power) as avgpower
   ,min(Time) as mintime
   ,max(Time) as maxtime
   , datediff(minute,min(Time),max(Time)) timediff
from #T
group by groupid having datediff(minute,min(Time),max(Time)) > 3
GO
no_range | avgpower | mintime             | maxtime             | timediff
:------- | -------: | :------------------ | :------------------ | -------:
6 to 10  |      620 | 04/11/2019 00:05:50 | 04/11/2019 00:10:30 |        5

db<>fiddle here

Nick
  • 138,499
  • 22
  • 57
  • 95
Wei Lin
  • 3,591
  • 2
  • 20
  • 52