I have a data.table indicating the position of an animal, which looks something like this:
Date TimeStamp Transponder Units
1: 2021-08-15 2021-08-15 14:11:13 DA2C614E M2
2: 2021-08-15 2021-08-15 14:11:14 DA2C614E M2
3: 2021-08-15 2021-08-15 14:11:14 DA2C614E M2
4: 2021-08-15 2021-08-15 14:11:15 DA2C614E M2
5: 2021-08-15 2021-08-15 14:11:15 DA2C614E M2
6: 2021-08-15 2021-08-15 14:11:16 DA2C614E M2
7: 2021-08-15 2021-08-15 14:12:40 DA2C614E HM2
8: 2021-08-15 2021-08-15 14:12:40 DA2C614E HM2
9: 2021-08-15 2021-08-15 14:12:49 DA2C614E H2
10: 2021-08-15 2021-08-15 14:18:02 DA2C614E H1
11: 2021-08-15 2021-08-15 14:18:04 DA2C614E H1
12: 2021-08-15 2021-08-15 14:19:19 DA2C614E H1
13: 2021-08-15 2021-08-15 14:25:29 DA2C614E HM2
The Transponder indicates the individual and the Units the position. I would like to create a sparser version of this data set by deleting duplicated rows, but only those that occur in succession, since successive duplicates simply mean the animal has not moved. My desired end result would be:
Date TimeStamp Transponder Units
1: 2021-08-15 2021-08-15 14:11:13 DA2C614E M2
2: 2021-08-15 2021-08-15 14:12:40 DA2C614E HM2
3: 2021-08-15 2021-08-15 14:12:49 DA2C614E H2
4: 2021-08-15 2021-08-15 14:18:02 DA2C614E H1
5: 2021-08-15 2021-08-15 14:25:29 DA2C614E HM2
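For reproducibility, the sample above can be rebuilt roughly like this (the column types are my assumption here; in the real data Date and TimeStamp may be stored differently):

library(data.table)

# Rebuild the example shown above (assuming Date as IDate, TimeStamp as POSIXct)
sample <- data.table(
  Date = as.IDate("2021-08-15"),
  TimeStamp = as.POSIXct(paste("2021-08-15",
    c("14:11:13", "14:11:14", "14:11:14", "14:11:15", "14:11:15", "14:11:16",
      "14:12:40", "14:12:40", "14:12:49", "14:18:02", "14:18:04", "14:19:19",
      "14:25:29")), tz = "UTC"),
  Transponder = "DA2C614E",
  Units = c("M2", "M2", "M2", "M2", "M2", "M2", "HM2", "HM2",
            "H2", "H1", "H1", "H1", "HM2")
)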
I have tried working with duplicated(), but the problem is that it removes all duplicates, not just the consecutive ones, so for example the later repeated HM2 entry is lost as well. (I ran duplicated() on a subset of Date, Transponder and Units):
> sample[!duplicated(sample[, c(1,3,4)]),]
Date TimeStamp Transponder Units
1: 2021-08-15 2021-08-15 14:11:13 DA2C614E M2
2: 2021-08-15 2021-08-15 14:12:40 DA2C614E HM2
3: 2021-08-15 2021-08-15 14:12:49 DA2C614E H2
4: 2021-08-15 2021-08-15 14:18:02 DA2C614E H1
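The direction I was vaguely considering instead is to compare each row's Units with the previous row using shift(), something like the sketch below, but I am not sure whether this is correct or idiomatic data.table (the per-Transponder grouping is my assumption, in case several animals end up in the same table):

# Keep a row only if its Units differs from the previous row's Units
sample[Units != shift(Units, fill = ""), ]

# Or, if several transponders are present, compare within each animal
sample[, .SD[Units != shift(Units, fill = "")], by = Transponder]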
Any ideas on how to solve this "elegantly", i.e. without having to loop over the rows?