
I'm relatively new to R and need to calculate the distance between GPS points. I have done this with distanceTrack from the argosfilter package, but there are gaps in my data.

There are meant to be recordings every 10 minutes but due to issues in the field there are gaps up to 5 days long. So I need a way of telling R not to calculate the distance if the time between the points is greater than 10 minutes.

The code I have at the moment is very simple as it calculates the distance between the sequence of locations:

lat<-lizard$Lat

lon<-lizard$Lon

distanceTrack(lat,lon)

I thought an if function would work but I have hardly used them and don't know how to write them to use with time. So would this be a suitable solution or are there better ways to do this? Any ideas on how to solve this would be greatly appreciated!

smci
Ben Westwood
    Since you're relatively new to R and StackOverflow, you should start by reading this: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Peter Diakumis Mar 28 '15 at 05:16

1 Answer


One way to do this is to filter the rows from the data that don't match what you want. If I'm correct about the distanceTrack function, it takes N lat / lon points and reduces them to N-1 distances between those points. You then only need to eliminate the points that aren't ten minutes apart, which you can do pretty easily by filtering the dataframe.

Not knowing what exactly your data looks like, I've created some example data below that hopefully resembles it:

## Create a data.frame of 10-minute intervals stored as POSIXct
## You can convert most times to POSIXct using as.POSIXct if they are not already that format
lizard <- data.frame(time=seq(from=as.POSIXct('2015-01-01 00:00:00'), to=as.POSIXct('2015-01-02 00:00:00'), by=10*60))

## Randomly eliminate rows with probability of 15% that a given row is eliminated to create gaps
lizard$keep <- runif(nrow(lizard))
lizard <- lizard[lizard$keep <= .85, c('time'), drop=FALSE] ## drop arg used to keep it a dataframe

## Random lat / lon data:
lizard$Lat = runif(nrow(lizard)) ## runif is random uniform
lizard$Lon = runif(nrow(lizard))

Now I run the distance calculation. We need to run the distance calculation before eliminating the rows, because even if there is a time gap between rows i and j (i.e., j$time - i$time > 10 minutes), we still need the values from row j to calculate the distance traveled between rows j and k, which may themselves be 10 minutes apart:

library(argosfilter) ## provides distanceTrack
## We initialize to NA; the distance variable for row i will represent the distance between row i-1 and i;
## row 1 will not have a meaningful value
lizard$distance <- NA
lizard$distance[2:nrow(lizard)] <- distanceTrack(lizard$Lat, lizard$Lon)

And finally we can use a boolean to filter the rows by comparing rows i and i-1 for each row 2:N:

lizard$isContiguous <- TRUE ## initialize a flag: is this row 10 minutes after the previous one?
## difftime with explicit units keeps the result in minutes; note the parentheses in 1:(nrow(lizard) - 1)
lizard$isContiguous[2:nrow(lizard)] <- (as.numeric(difftime(lizard$time[2:nrow(lizard)], lizard$time[1:(nrow(lizard) - 1)], units="mins")) == 10)
lizard <- lizard[lizard$isContiguous, ] ## filter

The distances left in that dataframe are only the ones for which the time interval was 10 minutes.
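
As a variation on the filter above, here is a minimal base-R sketch that keeps every row but sets the distance to NA wherever the gap to the previous fix is not roughly 10 minutes. The `haversine_km` helper, the `tolerance_mins` value, and the toy coordinates are all assumptions made for illustration; `haversine_km` only stands in for `distanceTrack` so the example runs without argosfilter installed.

```r
## Hypothetical helper: great-circle distance in km between consecutive points
## (a stand-in for distanceTrack, assuming a spherical Earth of radius 6371 km)
haversine_km <- function(lat, lon) {
  to_rad <- pi / 180
  n <- length(lat)
  lat1 <- lat[-n] * to_rad
  lat2 <- lat[-1] * to_rad
  dlat <- lat2 - lat1
  dlon <- (lon[-1] - lon[-n]) * to_rad
  a <- sin(dlat / 2)^2 + cos(lat1) * cos(lat2) * sin(dlon / 2)^2
  2 * 6371 * asin(pmin(1, sqrt(a)))
}

## Toy track: fixes at 0, 10, 20, 50 and 60 minutes (a 30-minute gap before row 4)
track <- data.frame(
  time = as.POSIXct("2015-01-01 00:00:00", tz = "UTC") + c(0, 10, 20, 50, 60) * 60,
  Lat  = c(-34.00, -34.01, -34.02, -34.05, -34.06),
  Lon  = c(138.00, 138.01, 138.02, 138.05, 138.06)
)

## Gap to the previous fix, always in minutes; row 1 has no previous fix
gap_mins <- c(NA, as.numeric(difftime(track$time[-1],
                                      track$time[-nrow(track)],
                                      units = "mins")))

tolerance_mins <- 0.5  ## assumed allowance for small timing jitter

## Distance from row i-1 to row i, blanked out wherever the gap is not ~10 minutes
track$distance <- c(NA, haversine_km(track$Lat, track$Lon))
track$distance[is.na(gap_mins) | abs(gap_mins - 10) > tolerance_mins] <- NA

## Total distance over contiguous 10-minute steps only
total_km <- sum(track$distance, na.rm = TRUE)
```

Keeping the rows and marking the unwanted distances as NA leaves the original track intact, which may be handy if you later want to change the allowed gap without recomputing everything.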

For more information on filtering (or more precisely, extracting or replacing), check out the R documentation for `[` (run `?Extract`), which says:

For [-indexing only: i, j, ... can be logical vectors, indicating elements/slices to select

Carson Moore
  • Hi, this works up until the filter where I get the error message: In Ops.factor(lizard$DateTime[2:nrow(lizard)], lizard$DateTime[1:nrow(lizard) - : ‘-’ not meaningful for factors. Do you know how to fix this? – Ben Westwood Mar 29 '15 at 02:06
  • Without knowing more about your data, my guess is your `DateTime` column is stored as factors (https://stat.ethz.ch/R-manual/R-devel/library/base/html/factor.html). Try casting them to POSIXct with `as.POSIXct`: `lizard$DateTime <- as.POSIXct(lizard$DateTime)` You may also need to look at the `as.POSIXct` documentation if your `DateTime` column is in a different format than YYYY-MM-DD hh:mm:ss – Carson Moore Mar 29 '15 at 15:04
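
The conversion suggested in the last comment can be sketched as follows. The sample `DateTime` values, the `format` string, and the `tz` are assumptions for illustration only; adjust them to match how the timestamps are actually written in your file.

```r
## Toy column stored as a factor (made-up values in an assumed day/month/year format)
lizard <- data.frame(
  DateTime = factor(c("28/03/2015 10:00", "28/03/2015 10:10", "28/03/2015 10:40"))
)

## Go through character so it is explicit that the labels are what gets parsed
lizard$DateTime <- as.POSIXct(as.character(lizard$DateTime),
                              format = "%d/%m/%Y %H:%M", tz = "UTC")

## Subtraction is now meaningful; units are fixed to minutes
gaps <- as.numeric(difftime(lizard$DateTime[-1],
                            lizard$DateTime[-nrow(lizard)],
                            units = "mins"))
```

If `as.POSIXct` returns NA for every row, the `format` string most likely does not match the text in the column.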