I'm querying a table that contains state transitions for a state engine. The table is set up so that it has the previous_state
, current_state
, and timestamp
of the transition, grouped by unique id
s.
My goal is to find a sequence of target intervals, defined as timestamp of the initial state transition (eg timestamp when we shift from from 1->2), and timestamp of the target next state transition that matches a specific condition (eg the next timestamp that current_state=3 OR current_state=4).
state_transition_table
+------------+---------------+-----------+----+
| prev_state | current_state | timestamp | id |
+------------+---------------+-----------+----+
| 1 | 2 | 4.5 | 1 |
| 2 | 3 | 5.2 | 1 |
| 3 | 1 | 5.4 | 1 |
| 1 | 2 | 10.3 | 1 |
| 2 | 5 | 10.4 | 1 |
| 5 | 4 | 10.8 | 1 |
| 4 | 1 | 11.0 | 1 |
| 1 | 2 | 12.3 | 1 |
| 2 | 3 | 13.5 | 1 |
| 3 | 1 | 13.6 | 1 |
+------------+---------------+-----------+----+
Within a given id, we want to find all intervals that start with 1->2 (easy enough query), and end with either state 3 or 4. 1->2->anything->3 or 4
An example output table given the input above would have the three states and the timestamps for when we transition between the states:
target output
+------------+---------------+------------+-----------+-----------+
| prev_state | current_state | end_state | curr_time | end_time |
+------------+---------------+------------+-----------+-----------+
| 1 | 2 | 3 | 4.5 | 5.2 |
| 1 | 2 | 4 | 10.3 | 10.8 |
| 1 | 2 | 3 | 12.3 | 13.5 |
+------------+---------------+------------+-----------+-----------+
The best query I could come up with is using window functions in a sub-table, and then creating the new columns from that table. But this solution only finds the next row following the initial transition, and doesnt allow other states to occur between then and when our target state arrives.
WITH state_transitions as (
SELECT
id
previous_state, current_state,
LEAD(current_state) OVER ( PARTITION BY id ORDER BY timestamp) AS end_state,
timestamp as curr_time,
LEAD(timestamp) OVER ( PARTITION BY id ORDER BY timestamp) AS end_time
FROM
state_transition_table
SELECT
previous_state,
current_state,
end_state,
curr_time,
end_time
FROM state_transitions
WHERE previous_state=1 and current_state=2
ORDER BY curr_time
This query would incorrectly give the second output row end_state==5
, which is not what I am looking for.
How can one search a table for the next row that matches my target condition, eg end_state=3 OR end_state=4
?