I have time-series data with up to millisecond precision. Some timestamps coincide exactly, in which case the database id column determines which row is the latest (a higher id means a later row).
I am trying to use Timescale to get the latest value per second. Here is an example of the data I'm looking at:
time                    | db_id | value
2020-01-01 08:39:23.293 | 4460  | 136.01
2020-01-01 08:39:23.393 | 4461  | 197.95
2020-01-01 08:40:38.973 | 4462  |  57.95
2020-01-01 08:43:01.223 | 4463  | 156
2020-01-01 08:43:26.577 | 4464  | 253.43
2020-01-01 08:43:26.577 | 4465  |  53.68
2020-01-01 08:43:26.577 | 4466  | 160.00
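For reference, the table is a hypertable along these lines (a simplified sketch; the exact column types aren't important to the question):

CREATE TABLE timeseries (
    time  TIMESTAMPTZ NOT NULL,
    db_id BIGINT      NOT NULL,  -- increases monotonically; breaks ties on time
    value NUMERIC     NOT NULL
);
SELECT create_hypertable('timeseries', 'time');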
When obtaining the latest value per second, my results should look like this:
time                | value
2020-01-01 08:39:23 | 197.95
2020-01-01 08:39:24 | 197.95
...
2020-01-01 08:40:37 | 197.95
2020-01-01 08:40:38 |  57.95
2020-01-01 08:40:39 |  57.95
...
2020-01-01 08:43:25 |  57.95
2020-01-01 08:43:26 | 160.00
2020-01-01 08:43:27 | 160.00
...
I've successfully obtained the latest value per second using Timescale's time_bucket and last:

SELECT time_bucket('1 second', time) AS per_second,
       last(value, db_id) AS value
FROM timeseries
GROUP BY per_second
ORDER BY per_second;

but it leaves holes in the time column:
time                | value
2020-01-01 08:39:23 | 197.95
2020-01-01 08:40:38 |  57.95
2020-01-01 08:43:26 | 160.00
The workaround I thought up is to create a table of per-second timestamps with null values, migrate the data from the bucketed result into it, and then replace the nulls with the last occurring value, but that seems like a lot of intermediary steps.
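Roughly, that idea as a single query would look something like the following (a sketch only: generate_series builds the per-second grid, and the correlated subquery carries the last value forward, which I suspect gets slow over large ranges):

WITH per_second AS (
    -- latest value per second; ties on time resolved by db_id
    SELECT time_bucket('1 second', time) AS sec,
           last(value, db_id) AS value
    FROM timeseries
    GROUP BY sec
),
grid AS (
    -- one row per second across the full span of the data
    SELECT generate_series(
               (SELECT min(sec) FROM per_second),
               (SELECT max(sec) FROM per_second),
               interval '1 second') AS sec
)
SELECT g.sec AS time,
       (SELECT p.value
        FROM per_second p
        WHERE p.sec <= g.sec
        ORDER BY p.sec DESC
        LIMIT 1) AS value
FROM grid g
ORDER BY g.sec;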
Is there a better approach to this problem of finding the "latest value" per second, minute, hour, etc.? I originally tried solving it in Python since it seemed like a simple problem, but it took up far too much computing time.