5

Suppose I have this table from the result of a select * from journeys:

timestamp     | inJourney (1 = true and 0 = false)
--------------------------------------------------
time1         | 1
time2         | 1
time3         | 1
time4         | 0
time5         | 0
time6         | 1
time7         | 1
time8         | 1

Expected:

timestamp     | inJourney (1 = true and 0 = false)
--------------------------------------------------
time1         | 1
time4         | 0
time8         | 1

Note: the timestamp is not important because I only want to count the number of journeys.

Any idea what I have to do?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
juan_marti
  • 433
  • 1
  • 3
  • 17

2 Answers2

9

This is a gaps-and-islands problem. Use the difference of row_number():

select injourney, min(timestamp), max(timestamp)
from (select t.*,
             row_number() over (order by timestamp) as seqnum,
             row_number() over (partition by injourney, order by timestamp) as seqnum_i
      from t
     ) t
group by injourney, (seqnum - seqnum_i)
order by min(timestamp);
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Thank you so much. I wasn't aware of this problem (gaps-and-islands problem) but now I know. I've been in this for hours. You deserve a fantastic weekend. @GordonLinoff – juan_marti Apr 12 '19 at 14:55
2

This is a gaps-and-islands problem, you can try to use ROW_NUMBER window function to get the gap from result set then use MIN

You can try this.

Query #1

SELECT MIN(timestamp),inJourney 
FROM (
SELECT *,
    ROW_NUMBER() OVER(ORDER BY timestamp)  - ROW_NUMBER() OVER(PARTITION BY inJourney ORDER BY timestamp) grp
  FROM journeys
) t1
GROUP BY grp,inJourney 
ORDER BY MIN(timestamp);

| min   | injourney |
| ----- | --------- |
| time1 | 1         |
| time4 | 0         |
| time6 | 1         |

View on DB Fiddle

D-Shih
  • 44,943
  • 6
  • 31
  • 51