
I want to extract the first and last row of each group in a data frame in R. I have a long data set (~300,000 observations) with a couple of thousand groups, and for each group I want the first and last observation. (In this case, I am extracting the first and last latitude/longitude for a couple of thousand survey transects.)

I came up with a for-loop solution that seems to work by subsetting the data one group at a time, but I wanted to see whether there is a cleaner way to approach this problem:

library(tidyverse)

# example survey data along the CA coastline
example.data = data.frame(group = c(rep('A',20),rep('B',20),rep('C',20)),
                   latitude = seq(32,38, length.out = 60),  # 60 evenly spaced values from 32 to 38
                   longtitude = seq(-119,-122,length.out = 60))  # 60 evenly spaced values from -119 to -122

head(example.data)

This looks like:

group latitude longtitude
    A 32.00000  -119.0000
    A 32.10169  -119.0508
    A 32.20339  -119.1017
    A 32.30508  -119.1525
    A 32.40678  -119.2034

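Here is my solution using a for-loop:

#find groups (i.e. transects)
# unique() works for both character and factor columns; levels() returns
# NULL for a character column, which is the data.frame() default since R 4.0
letter.levels = as.character(unique(example.data$group))

first_last = c()

for(i in seq_along(letter.levels)){
  d = filter(example.data, group == letter.levels[i])
  first = d[1,]
  last = d[nrow(d),]

  # prepending each pair puts the last group first in the result
  first_last = rbind(first,last,first_last)
}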

#view results
first_last
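
For comparison, the same subsetting can be sketched in base R without an explicit loop. This is only an illustration under the assumption that "first" and "last" mean first and last in the current row order of the data frame; the names `is_first`, `is_last`, and `first_last_base` are made up for this example.

```r
# Same example data as above (column name kept as spelled in the question)
example.data <- data.frame(
  group      = rep(c("A", "B", "C"), each = 20),
  latitude   = seq(32, 38, length.out = 60),
  longtitude = seq(-119, -122, length.out = 60)
)

grp <- example.data$group

# duplicated() flags repeats of a value, so its negation marks the first
# occurrence of each group; fromLast = TRUE scans from the end instead,
# which marks the last occurrence
is_first <- !duplicated(grp)
is_last  <- !duplicated(grp, fromLast = TRUE)

# Keep rows that are either the first or the last of their group
first_last_base <- example.data[is_first | is_last, ]
first_last_base
```

Unlike the loop, this keeps the rows in their original order (A, then B, then C), and a group with a single row appears once rather than twice.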

The final results I'm looking for would be this (Start/stop locations for each survey transect):

group latitude longtitude
    C  36.0678  -121.0339
    C  38.0000  -122.0000
    B  34.0339  -120.0169
    B  35.9661  -120.9831
    A  32.0000  -119.0000
    A  33.9322  -119.9661

Could there be a cleaner dplyr version of this that I missed? If nothing else, I can always fall back on this for-loop version.
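
To make the kind of answer being asked for concrete, here is a minimal sketch of what a dplyr version might look like, using `group_by()` with `slice()` and `n()`. It assumes that the existing row order within each group is the order of interest, and it has not been tried on the full ~300,000-row data set.

```r
library(dplyr)

# Same example data as above (column name kept as spelled in the question)
example.data <- data.frame(
  group      = rep(c("A", "B", "C"), each = 20),
  latitude   = seq(32, 38, length.out = 60),
  longtitude = seq(-119, -122, length.out = 60)
)

first_last <- example.data %>%
  group_by(group) %>%
  # row 1 and row n() of each group; unique() avoids requesting the same
  # row twice when a group has only one observation
  slice(unique(c(1, n()))) %>%
  ungroup()

first_last
```

The rows come back grouped in the order A, B, C rather than C, B, A as in the loop output, but the start/stop coordinates for each transect are the same.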

I searched for help and found a somewhat related question and another (but different) for-loop suggestion.

NM_
Kodiakflds
