
I have a water quality dataset in R with Date, Result, Parameter, and Station columns. I am trying to extract the first and last dates on which a sample was taken per station, stored as two new columns at the end of my data frame. I also have columns for month, day, and year.

Here is the structure:

'data.frame':   50954 obs. of  8 variables:
$ Date     : chr  "6/9/2016" "6/9/2016" "6/8/2016" "6/8/2016" ...
$ Result   : num  400 160 2200 260 660 550 2100 270 750 82 ...
$ Units    : chr  "M" "M" "M" "M" ...
$ Parameter: chr  "Fecal coliforms" ...
$ Station  : chr  "RIO GRANDE DE MANATI AT HWY 2 NR MANATI, PR" "RIO GRANDE DE MANATI AT HWY 2 NR MANATI, PR" "RIO CAONILLAS NR JAYUYA, PR" "RIO CAONILLAS NR JAYUYA, PR" ...
$ month    : num  6 6 6 6 6 6 6 6 6 6 ...
$ year     : num  2016 2016 2016 2016 2016 ...
$ day      : num  9 9 8 8 8 8 7 7 7 7 ...

I've been doing this to extract summary statistics by station:

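# split the data frame into a list with one data frame per parameter
P303.split <- split(P303, P303$Parameter)
Copper <- P303.split$Copper
# summary() of Result for each station, bound into one row per station
CopperSumStats <- data.frame(do.call("rbind", by(Copper[, "Result"], Copper[, "Station"], summary)))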

So now I just need the start and end dates. Thanks in advance!

kslayerr
    You should try to make your example [reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Could you provide some data, possibly using `dput`? – bouncyball Aug 15 '16 at 15:45

1 Answer


I think you could use dplyr to perform the calculations you need:

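library(dplyr)  # load package
# dates look like "6/9/2016" (month/day/year), so parse them with %m/%d/%Y
df1$Date <- as.Date(df1$Date, format = "%m/%d/%Y")
# add each station's first and last sampling date as new columns
df2 <- df1 %>%
  group_by(Station) %>%
  mutate(FirstDate = min(Date), LastDate = max(Date))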

This solution assumes your data is in a data.frame named df1.
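
For anyone who prefers not to load a package, here is a minimal base R sketch of the same idea using `ave()`. This swaps in a different technique from the dplyr answer. The toy `df1` below is made up for illustration, and it assumes the Date strings are month/day/year, as the month and day columns in the question suggest.

```r
# Toy data in the same shape as the question (values are made up)
df1 <- data.frame(
  Date = c("6/9/2016", "6/30/2016", "1/15/2015", "12/1/2015", "3/4/2014"),
  Station = c("A", "A", "B", "B", "B"),
  Result = c(400, 160, 2200, 260, 660),
  stringsAsFactors = FALSE
)

# Parse month/day/year strings into Date objects
df1$Date <- as.Date(df1$Date, format = "%m/%d/%Y")

# ave() returns one value per row, computed within each Station group.
# Work on the numeric day count and convert back so the Date class is explicit.
day_num <- as.numeric(df1$Date)
df1$FirstDate <- as.Date(ave(day_num, df1$Station, FUN = min), origin = "1970-01-01")
df1$LastDate  <- as.Date(ave(day_num, df1$Station, FUN = max), origin = "1970-01-01")

# One row per station, if a summary table is more convenient
station_range <- unique(df1[, c("Station", "FirstDate", "LastDate")])
print(station_range)
```

As with the `mutate()` version, every row keeps its station's first and last date, and `station_range` collapses that to one row per station.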

bouncyball
  • Thanks for your help! I seem to be getting NAs in my `FirstDate` and `LastDate` columns though – kslayerr Aug 15 '16 at 16:42
  • @Kelsey did you reformat your date column? Are there NA values for Date? If so, you will need to specify `na.rm = TRUE` in the `min` and `max` functions. – bouncyball Aug 15 '16 at 16:53
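
A quick way to check both possibilities raised in these comments is to count the values that fail to parse, then compare `min()` with and without `na.rm`. This is only a sketch: the vector `raw` is made-up data, not taken from the question.

```r
# Made-up dates: one has a day greater than 12, one is missing
raw <- c("6/9/2016", "6/30/2016", NA)

# Parsing with the wrong field order silently yields NA for "6/30/2016"
wrong <- as.Date(raw, format = "%d/%m/%Y")
right <- as.Date(raw, format = "%m/%d/%Y")

# Count values that failed to parse (NA after parsing but not before)
sum(is.na(wrong) & !is.na(raw))
sum(is.na(right) & !is.na(raw))

# min() and max() return NA if any input is NA unless na.rm = TRUE
min(right)
min(right, na.rm = TRUE)
max(right, na.rm = TRUE)
```

If the first count is above zero for your data, the format string does not match the date strings. If only genuinely missing dates remain, `na.rm = TRUE` is the relevant fix.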