I am having some problems dealing with a database... I would appreciate a little help.
I want to count how many times an observation (e.g. registration car number) appears within a range of time (e.g. 90 days) in a database. For this, I have a numeric variable of the car number and the date in this format dd/mm/yyyy (the extent of the time series is clearly above 90 days). My intuition is about creating a new variable (mutate) that counts observations filtering by the variable date, but I don’t know how to introduce the time restriction. Any idea?
Let me show an example:
database
id date_dd/mm/yyyy
1 01/01/2021
1 01/02/2021
1 02/02/2021
1 03/02/2021
1 30/12/2021
2 05/07/2021
2 03/03/2021
2 04/12/2021
2 07/07/2021
12 01/05/2021
8 06/07/2021
My main goal is to delete the observations that are repeated more than three times in 90 days. In conclusion, the resulting database would be:
id date_dd/mm/yyyy
1 01/01/2021
1 01/02/2021
1 02/02/2021
1 30/12/2021
2 05/07/2021
2 03/03/2021
2 04/12/2021
2 07/07/2021
12 01/05/2021
8 06/07/2021
As I have explained above my idea has been to create a new variable that counts the number of times each id number is repeated in 90 days and then filter the data by deleting the observations that scores above 3 in the new variable but I don't have any idea of how to create this new variable with this time restriction.
Thanks in advance