I have the following data frame in R, called df:

id<-c(1,1,1,1,2,3,3,3,3)
day<-c(1,2,4,5,2,2,3,6,8)
payment<-c(5,10,3,30,23,40,20,10,50)
df<-data.frame(id,day,payment)

 id  day  payment
  1   1       5
  1   2      10
  1   4       3
  1   5      30
  2   2      23
  3   2      40
  3   3      20
  3   6      10
  3   8      50

What I'm trying to do is create a new variable called SoFarMax, which represents the maximum payment that the associated id has made up to and including that day:

 id  day  payment  SoFarMax
  1   1       5         5
  1   2      10        10
  1   4       3        10
  1   5      30        30
  2   2      23        23
  3   2      40        40
  3   3      20        40
  3   6      10        40
  3   8      50        50

Would appreciate your help with this.

AliCivil
  • See `?cummax`. – Frank Feb 15 '17 at 04:03
  • Thanks @Frank, I wasn't aware of it! – AliCivil Feb 15 '17 at 04:13
  • With base R, if you are interested: `ave(df$payment, df$id, FUN = cummax)` – Ronak Shah Feb 15 '17 at 04:24
  • @RonakShah Could you elaborate? There's also http://stackoverflow.com/questions/34069496/finding-running-maximum-by-group-in-r though akrun duped it against the current one. – Frank Feb 15 '17 at 05:08
  • Ok, thanks @Ronak . Here's another: http://stackoverflow.com/q/34069496/ I edited the target, will undupe, someone else can pick a more appropriate dupe, maybe from my links here. – Frank Feb 15 '17 at 05:12
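
As a small illustration of the `ave()` suggestion in the comments, here is a sketch that rebuilds the question's data frame and adds the running maximum per id. It assumes the rows are already sorted by day within each id, as they are in the question.

```r
# Data from the question
df <- data.frame(
  id      = c(1, 1, 1, 1, 2, 3, 3, 3, 3),
  day     = c(1, 2, 4, 5, 2, 2, 3, 6, 8),
  payment = c(5, 10, 3, 30, 23, 40, 20, 10, 50)
)

# ave() splits payment by id, applies cummax() to each piece,
# and returns the results in the original row order
df$SoFarMax <- ave(df$payment, df$id, FUN = cummax)

df$SoFarMax
# 5 10 10 30 23 40 40 40 50
```

Because `ave()` returns a vector aligned with the input rows, the result can be assigned straight back as a new column.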

4 Answers

SoFarMax <- unlist(tapply(df[,3], df[,1], cummax))

If the rows are not already sorted in your data frame, order them by id and then by day first, so that each id's payments are contiguous and in chronological order:

df_order <- df[order(df[,1], df[,2]),]
SoFarMax <- unlist(tapply(df_order[,3], df_order[,1], cummax))
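
To see why the row order matters, here is a hedged sketch using a deliberately shuffled copy of the question's data. The name `df_shuffled` is made up for this example. Sorting by id and then day makes the groups contiguous, so the unlisted `tapply()` result lines up with the rows of the sorted data frame.

```r
# Same rows as in the question, but deliberately out of order
# (df_shuffled is a hypothetical name used only for this illustration)
df_shuffled <- data.frame(
  id      = c(3, 1, 3, 1, 2, 1, 3, 1, 3),
  day     = c(8, 4, 2, 1, 2, 5, 6, 2, 3),
  payment = c(50, 3, 40, 5, 23, 30, 10, 10, 20)
)

# Sort by id first, then by day within each id
df_order <- df_shuffled[order(df_shuffled$id, df_shuffled$day), ]

# tapply() returns one cummax vector per id; unlist() concatenates them
# in id order, which now matches the row order of df_order
df_order$SoFarMax <- unlist(
  tapply(df_order$payment, df_order$id, cummax),
  use.names = FALSE
)

df_order$SoFarMax
# 5 10 10 30 23 40 40 40 50
```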
and-bri

The key apparently was cummax thanks to @Frank. Here is what I managed to come up with:

library(data.table)
# convert df to a data.table in place
setDT(df)
# running maximum of payment within each id (rows assumed sorted by day)
df[, MaxSoFar := cummax(payment), by = id]
AliCivil

Using dplyr:

library(dplyr)

df %>%
  group_by(id) %>%
  mutate(SoFarMax = cummax(payment))
Karthik Arumugham

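I think you need to group the data by id (use dplyr or subset for that) and then apply cummax(x) within each group. Alternatively, you can write the logic yourself: within each id, carry the running maximum forward by comparing each payment with the maximum so far, not just with the previous payment:

df$sofarmax <- ave(df$payment, df$id, FUN = function(x) Reduce(max, x, accumulate = TRUE))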
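
For anyone who wants the by-hand logic spelled out, here is a minimal sketch as an explicit loop. The helper `running_max_by_group` is a hypothetical name, not a function from any package, and it assumes that rows of the same id are adjacent and already sorted by day.

```r
# Hypothetical helper: running maximum of `value` that restarts
# whenever `group` changes (rows of a group must be adjacent)
running_max_by_group <- function(value, group) {
  out <- value
  for (i in seq_along(value)[-1]) {
    if (group[i] == group[i - 1]) {
      # same group as the previous row: compare with the maximum so far
      out[i] <- max(value[i], out[i - 1])
    }
    # otherwise a new group starts and out[i] stays equal to value[i]
  }
  out
}

id      <- c(1, 1, 1, 1, 2, 3, 3, 3, 3)
payment <- c(5, 10, 3, 30, 23, 40, 20, 10, 50)

sofarmax <- running_max_by_group(payment, id)
sofarmax
# 5 10 10 30 23 40 40 40 50
```

In practice cummax does the same job in one call per group; the loop is only meant to show what is being computed.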
Sunil Garg