
I am trying to handle a database like the one below which has camera trap data. There is no lag between photos being taken so ones that occur in quick succession are likely to be the same individual.

I want to remove repeat records of a species when they occur within 10 minutes of each other (i.e. when less than 10 minutes elapses between one photo and the next). Is there a way to do this in R? Thank you!

  Site.Name Sampling.Unit.Name Photo.Date Photo.Time   Genus Species Number.of.Animals
1 Ranomafana        CT-RNF-1-01 06/10/2010   00:01:00                                  
2 Ranomafana        CT-RNF-1-01 11/10/2010   00:28:00 Eliurus  tanala                 1
3 Ranomafana        CT-RNF-1-01 12/10/2010   04:39:22   Fossa fossana                 1
4 Ranomafana        CT-RNF-1-01 12/10/2010   04:39:27   Fossa fossana                 1
5 Ranomafana        CT-RNF-1-01 12/10/2010   16:47:41 Nesomys   rufus                 1
6 Ranomafana        CT-RNF-1-01 12/10/2010   16:47:46 Nesomys   rufus                 1
  • Welcome to StackOverflow. Perhaps if you made a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) that demonstrates your question / problem, people would find it easier to answer. In your example, use `dput()` to paste your data into the question. – Andrie Aug 02 '12 at 20:49
  • What database are you using? "r" is not a database. If the data is in "r", then the database tag should be removed. – Gordon Linoff Aug 02 '12 at 20:52
  • Apologies, I am new to posting on this forum so still feeling my way with the formatting... – user1572298 Aug 02 '12 at 20:56

1 Answer


Here is your data.frame with just the dates and times:

dat <- data.frame(
  Photo.Date = c("6/10/2010", "11/10/2010", "12/10/2010",
                 "12/10/2010", "12/10/2010", "12/10/2010"),
  Photo.Time = c("00:01:00", "00:28:00", "04:39:22",
                 "04:39:27", "16:47:41", "16:47:46"))

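Then use strptime ("string parse time") to convert the pasted date and time strings into a POSIXlt date-time vector. The format string is day/month/year, matching the data:

date_vec <- strptime(paste(dat$Photo.Date, dat$Photo.Time), format = "%d/%m/%Y %H:%M:%S")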

The next step is to determine the difference in time between each observation. To do this you need to compare observation 1 and 2, 2 and 3, 3 and 4...

first_date <- date_vec[1:(length(date_vec)-1)]
second_date <- date_vec[2:length(date_vec)]
second_gap <- difftime(second_date, first_date, units="mins")
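As a hedged alternative to building the two shifted vectors by hand, the same gaps can be computed with `diff()` on a POSIXct vector. This is only a sketch: the name `stamps` is made up for illustration, and `tz = "UTC"` is an assumption to keep daylight-saving changes out of the arithmetic (the camera's real time zone is not stated in the question).

```r
# Same six timestamps as in the example data (day/month/year).
# tz = "UTC" is an assumption; substitute the cameras' actual time zone.
stamps <- as.POSIXct(
  c("06/10/2010 00:01:00", "11/10/2010 00:28:00", "12/10/2010 04:39:22",
    "12/10/2010 04:39:27", "12/10/2010 16:47:41", "12/10/2010 16:47:46"),
  format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

# diff() on date-times returns a difftime with automatically chosen units,
# so ask for minutes explicitly when converting to plain numbers.
gap_mins <- as.numeric(diff(stamps), units = "mins")
```

`gap_mins` has one element fewer than `stamps`, just like `second_gap`, so the first observation still has to be handled separately.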

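Flag the photos that come at least 10 minutes after the previous one; those are the ones to keep, and anything closer is treated as a duplicate. The gap vector is one element shorter than the data, so prepend TRUE to always keep the first photo.

dup_index <- second_gap >= 10        # TRUE = keep (gap of 10 minutes or more)
dup_index <- c(TRUE, dup_index)      # the first photo has no predecessor
dat[dup_index, ]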

Which returns

  Photo.Date Photo.Time
1  6/10/2010   00:01:00
2 11/10/2010   00:28:00
3 12/10/2010   04:39:22
5 12/10/2010   16:47:41
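The result above compares consecutive rows without looking at the species, whereas the question asks about repeats of the same species. A minimal base-R sketch of a per-species version follows. It assumes that photos should be compared only within the same sampling unit and the same Genus/Species, that `tz = "UTC"` is acceptable, and that blank-species rows can be treated as a group of their own; the column `when` and the variables `group` and `kept` are names invented for this example.

```r
# Example rows from the question, including the species columns
dat <- data.frame(
  Sampling.Unit.Name = "CT-RNF-1-01",
  Photo.Date = c("06/10/2010", "11/10/2010", "12/10/2010",
                 "12/10/2010", "12/10/2010", "12/10/2010"),
  Photo.Time = c("00:01:00", "00:28:00", "04:39:22",
                 "04:39:27", "16:47:41", "16:47:46"),
  Genus   = c("", "Eliurus", "Fossa", "Fossa", "Nesomys", "Nesomys"),
  Species = c("", "tanala", "fossana", "fossana", "rufus", "rufus"),
  stringsAsFactors = FALSE)

# Parse the timestamps (tz = "UTC" is an assumption, see above)
dat$when <- as.POSIXct(paste(dat$Photo.Date, dat$Photo.Time),
                       format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

# Sort so that photos of one species at one camera are in time order
dat <- dat[order(dat$Sampling.Unit.Name, dat$Genus, dat$Species, dat$when), ]

# Minutes since the previous photo in the same group;
# the first photo of each group gets Inf so it is always kept
group <- paste(dat$Sampling.Unit.Name, dat$Genus, dat$Species)
gap_mins <- ave(as.numeric(dat$when), group,
                FUN = function(x) c(Inf, diff(x) / 60))

# Keep only photos at least 10 minutes after the previous one
kept <- dat[gap_mins >= 10, ]
```

On these six rows `kept` contains rows 1, 2, 3 and 5, the same as above. The two approaches differ once photos of different species are interleaved within 10 minutes of each other.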

HTH

Jase_