0

I want to use the ave solution from this question: subtract value from previous row by group with two groups.

#Reproduceable example db:
FU <- c(5,10,20,2,7,20,6,14,8,20,15,17)
Studynr <- c(1,1,2,2,3,3,4,4,4,5,6,6)
Fugroup <- c(1,0,0,1,0,1,1,0,2,0,1,0)
db <- data.frame(Studynr,Fugroup,FU) 

# code to calculate the difference consequetively
db$FUdiff <- ave(db$FU, db$Studynr, FUN=function(x) c(NA,diff(x)))

This is the head of the table this code provides:

FU Studynr Fugroup FUdiff
5 1 1 NA
10 1 0 5
20 2 0 NA
2 2 1 -18
7 3 0 NA
20 3 1 13

But what I want is (switching 5 and NA, because FU group is ordered as 1 to 0)

FU Studynr Fugroup FUdiff
5 1 1 5
10 1 0 NA
20 2 0 NA
2 2 1 -18
7 3 0 NA
20 3 1 13

I think the ave code is not working for me as the data per group isn't consecutively sorted in the database. Therefore, instead of using one grouping variable (db$Studynr), I would like to use a second grouping variable (db$Fugroup). Any ideas? Thanks!

(adjusted post)

Kim
  • 80
  • 10
  • Please `dput` your example data and show your desired result. – Peace Wang May 04 '21 at 11:26
  • "Unfortunately this code is not working for me as the data per group isn't consecutively sorted in the database." What's your real inconsecutive data here? – Peace Wang May 04 '21 at 11:48
  • How do you get diff as 0.61 and 1.64 ? What is the calculation? – Ronak Shah May 04 '21 at 11:56
  • Sorry if it's confusing, I want to difference to be calculated consecutively so: The first date (15:32) has value 23.11, the second (15:51) 21.01, so that 2.10 difference. The third date (16:02) has 22.65, so that's 1.64 difference from the second date. The difference in my own data is that I have a categorical variable (instead of time as in the example). I am currently making better example data! – Kim May 04 '21 at 12:04
  • Just changed the post, so it only shows example data (instead of the data from a previous post) – Kim May 04 '21 at 12:37

1 Answers1

1

You can order the data first and apply the ave code :

db <- db[with(db, order(Studynr, Fugroup)), ]
db$FUdiff <- ave(db$FU, db$Studynr, FUN=function(x) c(NA,diff(x)))

You can implement the same logic in dplyr and data.table :

#dplyr
library(dplyr)

db %>%
  arrange(Studynr, Fugroup) %>%
  group_by(Studynr) %>%
  mutate(FUdiff = c(NA, diff(FU))) %>%
  ungroup -> db

#data.table
library(data.table)
setDT(db)[order(Studynr, Fugroup), FUdiff := c(NA, diff(FU)), Studynr]
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213