I have a set of CPU logging data recorded at uneven intervals. For each row, I would like to count the number of rows that occurred within the one second leading up to that row's timestamp, including the row itself. Example data is shown in the first two columns below, with the expected output in the third column. For the first n rows, which occur less than a second from the start of the log, the output is NA.
timestamp (POSIXct) data output
2018-09-19 00:53:48.014469 123 NA
2018-09-19 00:53:48.031590 123 NA
2018-09-19 00:53:48.052569 123 NA
...
... 56 other rows not shown
...
2018-09-19 00:53:49.015465 123 60 --> first row that is >=1 sec from the start of the file
2018-09-19 00:53:49.017463 123 61 --> 61 rows within 1 sec from this time including this row
2018-09-19 00:53:49.018862 123 62 --> 62 rows within 1 sec from this time
2018-09-19 00:53:49.024468 123 62
2018-09-19 00:53:49.031869 123 61
2018-09-19 00:53:49.081869 123 50 --> 50 rows within 1 sec from this time
At the moment I am using a straightforward for loop, but the run time is excessive for a reasonable amount of data. I've tried combinations of floor, cumulative counts, findInterval, summarize, etc., but I could not see an approach that works given the uneven intervals. Any ideas on a speedier implementation?
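For reference, here is a minimal sketch of the kind of loop I mean. The names `df`, `count_last_second` and `window` are made up for illustration, the example data is randomly generated rather than my real log, and I am assuming the window is half-open, i.e. it covers timestamps in (t - 1, t].

```r
# Hypothetical example data: 200 log rows at uneven intervals.
set.seed(1)
start <- as.POSIXct("2018-09-19 00:53:48", tz = "UTC")
df <- data.frame(
  timestamp = start + cumsum(runif(200, min = 0.001, max = 0.05)),
  data = 123
)

# Baseline loop: for each row, count the rows whose timestamp falls in the
# one-second window ending at that row (the row itself included).
# Assumption: the window is (t - window, t].
count_last_second <- function(ts, window = 1) {
  t <- as.numeric(ts)                      # seconds since the epoch
  out <- rep(NA_integer_, length(t))
  for (i in seq_along(t)) {
    # Rows less than `window` seconds from the start of the log stay NA.
    if (t[i] - t[1] >= window) {
      out[i] <- sum(t > t[i] - window & t <= t[i])
    }
  }
  out
}

df$output <- count_last_second(df$timestamp)
```

Each iteration scans the whole timestamp vector, so the cost grows with the square of the number of rows, which is why it becomes slow on a full log.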