I have daily time series data spanning 20 to 70 years. One column holds the dates in dd/mm/yyyy format and another holds the daily flow values. How can I extract the maximum flow for each year in R?
Welcome to SO. Without a piece of code and/or the desired output, it is very hard to guess what you want to achieve and to help with a solution (that's why the downvotes). Read [this](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) post about how to make a reproducible example. – SabDeM Jul 14 '15 at 09:04
1 Answer
Try one of the aggregating functions.
aggregate(flow ~ cbind(year = substr(year, 7, 10)), df1, FUN = max)
# year flow
#1 2001 23
#2 2002 26
Or
library(data.table)
setDT(df1)[, list(flow = max(flow)), .(Year = substr(year, 7, 10))]
# Year flow
#1: 2001 23
#2: 2002 26
Another option is to convert the column to 'Date' class and then extract the year part.
library(data.table)
library(lubridate)
setDT(df1)[, list(flow = max(flow)), .(Year = year(dmy(year)))]
data
set.seed(24)
df1 <- data.frame(year= c('26/05/2001', '27/05/2001', '02/01/2002',
'03/01/2002'), flow= sample(20:30,4, replace=FALSE), stringsAsFactors=FALSE)
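
If installing extra packages is a problem, the same yearly maximum can be sketched in base R alone. This is a minimal sketch, assuming the date column is a character vector in dd/mm/yyyy form; the `Year` column, the `annual_max` name, and the fixed flow values are illustrative and not taken from the original data.

```r
# Illustrative data with fixed flow values (assumption: dates are dd/mm/yyyy strings)
df1 <- data.frame(year = c("26/05/2001", "27/05/2001", "02/01/2002", "03/01/2002"),
                  flow = c(23, 21, 26, 25),
                  stringsAsFactors = FALSE)

# Parse the strings as Date, then keep only the four-digit year as a new column
df1$Year <- format(as.Date(df1$year, format = "%d/%m/%Y"), "%Y")

# Maximum flow within each year
annual_max <- aggregate(flow ~ Year, data = df1, FUN = max)
annual_max
```

Parsing with `as.Date` rather than cutting the string by position means a malformed date shows up as `NA` instead of silently producing a wrong year.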

akrun
I tried all of them, but none work; I keep getting errors. Let me spell out the problem clearly. Call the data set "df1". It contains two columns: the first, "year", holds the dates as dd/mm/yyyy, and the second is "flow". Could you kindly rewrite the script using these names? You could also test that it works. Thanks – iguniwari Ekeu-wei Jul 14 '15 at 09:33
It worked at last. I had to do some reformatting. Thanks a lot – iguniwari Ekeu-wei Jul 14 '15 at 09:46
Hello, I used the code `library(lubridate)` `setDT(df1)[, list(flow=max(flow)), .(Year= year(dmy(year)))]`, and it returned the year and the maximum flow. However, I want the result to show the full day/month/year date of each maximum, not the year only – iguniwari Ekeu-wei Jul 21 '15 at 11:04
@iguniwariEkeu-wei Try `setDT(df1)[, year1:=year(dmy(year))][, .SD[which.max(flow)], year1]`. It should return the year together with the full day/month/year date of the maximum – akrun Jul 21 '15 at 11:37
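
For anyone who wants the date of each yearly maximum without data.table, here is a rough base R sketch of the same idea: split the rows by year and keep the row with the largest flow in each group. It assumes the dates are dd/mm/yyyy strings; the `Year` column, the `peak_rows` name, and the flow values are made up for illustration.

```r
# Illustrative data with fixed flow values (assumption: dates are dd/mm/yyyy strings)
df1 <- data.frame(year = c("26/05/2001", "27/05/2001", "02/01/2002", "03/01/2002"),
                  flow = c(23, 21, 26, 25),
                  stringsAsFactors = FALSE)

# Four-digit year derived from the parsed date
df1$Year <- format(as.Date(df1$year, format = "%d/%m/%Y"), "%Y")

# Split by year, take the row with the largest flow in each piece, and recombine
peak_rows <- do.call(rbind, lapply(split(df1, df1$Year),
                                   function(d) d[which.max(d$flow), ]))
rownames(peak_rows) <- NULL
peak_rows
```

Note that `which.max` keeps only the first row when several days in a year share the same maximum flow.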