(PostgreSQL 9.3) I have a table "events" with millions of complex events, stored as they are received from a device. A simplified example:
+-----------+-------+
| Timestamp | Event |
+-----------+-------+
|         1 | A     |
|         2 | A     |
|         2 | B     |
|         3 | B     |
|        10 | A     |
|        11 | A     |
|        11 | 0     |
|        11 | C     |
|        12 | A     |
+-----------+-------+
In this case I have four different kinds of events: A, B, C and 0. I want to turn them into intervals, with a start and a stop timestamp for each event. An event stops when either it is no longer being reported at a given timestamp, OR a "0" event comes in, which clears all active events. The desired output:
+------+----+-------+
| From | To | Event |
+------+----+-------+
|    1 |  3 | A     |
|    2 | 10 | B     |
|   10 | 11 | A     |
|   11 | 11 | C     |
|   12 |    | A     |
+------+----+-------+
In this case, A was raised at 1 and cleared at 3 because it was no longer being reported at that timestamp. B was raised at 2 and cleared at 10 for the same reason. A was raised again at 10 and cleared at 11 by the 0 event (despite also being reported at that timestamp!). C was raised at 11 AND cleared at the same timestamp (some ordering will be needed to handle a 0 at the same timestamp). Lastly, A was raised again at 12 and is still active, so it gets a NULL end timestamp.
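To make the expected semantics unambiguous, the rules above can be sketched procedurally. This is only a small Python reference for the desired output, not the SQL in question and not something meant to scale; the function name `intervals` and the choice to treat a "0" as clearing everything reported at its timestamp (regardless of arrival order within that timestamp) are assumptions read off the example.

```python
def intervals(rows):
    """Collapse (timestamp, event) rows into (start, stop, event) intervals.

    Assumed rules, taken from the example:
      * an event stops at the first timestamp where it is not reported;
      * a "0" at a timestamp stops every event active or raised there;
      * events still active at the end get stop = None (NULL).
    """
    # Group the reported events by timestamp.
    by_ts = {}
    for ts, ev in rows:
        by_ts.setdefault(ts, set()).add(ev)

    active = {}   # event -> timestamp at which it was raised
    out = []
    for ts in sorted(by_ts):
        reported = by_ts[ts]
        cleared = "0" in reported
        names = reported - {"0"}

        # Close events that are no longer being reported.
        for ev in sorted(active):
            if ev not in names:
                out.append((active.pop(ev), ts, ev))

        # Raise events that were not already active.
        for ev in names:
            active.setdefault(ev, ts)

        # A "0" clears everything, including events raised at this timestamp.
        if cleared:
            for ev in sorted(active):
                out.append((active.pop(ev), ts, ev))

    # Whatever is left is still active.
    for ev in sorted(active):
        out.append((active[ev], None, ev))

    return sorted(out, key=lambda r: (r[0], r[2]))


rows = [(1, "A"), (2, "A"), (2, "B"), (3, "B"), (10, "A"),
        (11, "A"), (11, "0"), (11, "C"), (12, "A")]

for start, stop, event in intervals(rows):
    print(start, stop, event)
```

Run on the sample rows, this yields the five intervals in the output table above, with `None` standing in for the NULL end timestamp.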
I do have something that works, but it is CTE-heavy and, as such, doesn't scale well to millions of records. I have been experimenting with LATERAL (with great results) and I am open to any 9.3-specific recommendations. Also, the "event" itself has been greatly simplified for this question; in reality it is a complex group of columns. It's possible that window functions could apply here too.