I am working with jail data and am looking to find the number of people in custody on any given day. The data is formatted in the chart below, where each row represents the location of an inmate. One booking ID is associated with one inmate and the release date is the inmates release date completed out of the jail/prison system. So taking booking ID 123 as an example, the inmate was initially jailed on 1/1/2022, moved to a prison on 2/4, was transferred back to jail on 3/6, and back to prison on 4/4, and then completely released on 6/2.
*There are some cases where the release date is null, indicating they haven't been released yet.
Booking_ID | Type | Date | Release Date |
---|---|---|---|
123 | Jail | 1/1/2022 | 6/2/2022 |
123 | Prison | 2/4/2022 | 6/2/2022 |
123 | Jail | 3/6/2022 | 6/2/2022 |
123 | Prison | 4/4/2022 | 6/2/2022 |
456 | Jail | 1/1/2022 | 6/2/2022 |
456 | Prison | 2/4/2022 | 6/2/2022 |
What I am trying to accomplish is on any given day, calculate the number of people by type on any given day. The output should essentially be a table with every day possible from the earliest date available, the type of jail/prison, and the number of inmates. So on 1/30/2022, there would be 2 people in jail. On 2/20/2022, there would be 2 people in prison. I hope this makes sense. I'm thinking I can do something like this as a start, but besides that I'm stumped:
from UNNEST(
GENERATE_DATE_ARRAY(
(select min(date) from base),
current_date(),
INTERVAL 1 DAY
)
) as dt -- one row per date in the range
The output would look something like this, with the latest output up to the current date:
Date | Type | Count |
---|---|---|
1/30/2022 | Jail | 2 |
2/20/2022 | Prison | 2 |
7/7/2022 | Jail | 0 |
7/7/2022 | Prison | 0 |