
I am trying to use the "ddply" function to calculate the difference between every two consecutive rows, which represent two distinct years. The data set looks like this:

  year ID value
1 2005  A    10
2 2015  A    20
3 2005  B    25
4 2015  B     5
5 2005  C    10
6 2015  C    15

I am using the function as follows

ddply(df, "ID",  function(x) (x[2,] - x[1,]))

However, it seems I am making an error in my function, as the difference is calculated for all variables, including the non-numeric ones. I get the following result:

  year ID value
1   10 NA    10
2   10 NA   -20
3   10 NA     5

I know the solution might be quite straightforward. I would like to get the following summarised result:

 ID  change
 A    10
 B    -5 
 C     5

Does anyone know how to achieve this using "ddply" or any other function?

slyn

1 Answer

Multiple options here, but I'm not sure about your desired output. It seems to contain an error: shouldn't B be -20?

sample data

library( data.table)
data <- fread("year ID value
2005  A    10
2015  A    20
2005  B    25
2015  B     5
2005  C    10
2015  C    15", header = TRUE, stringsAsFactors = FALSE)

dt <- data
df <- as.data.frame( data )

data.table

library(data.table)
dt[, list( delta = value[year == 2015] - value[year == 2005] ), by = .(ID)][]
#    ID delta
# 1:  A    10
# 2:  B   -20
# 3:  C     5

dplyr

library( dplyr )
df %>% group_by( ID ) %>% summarise( delta = value[year == 2015] - value[year == 2005])
# A tibble: 3 x 2
#   ID    delta
#   <chr> <int>
# 1 A        10
# 2 B       -20
# 3 C         5

In both methods, you can replace value[year == 2015] - value[year == 2005] with value[2] - value[1], but only if you are SURE your data is already in the correct order (2005 before 2015 within each ID)!
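
Since the question asked about "ddply" specifically, here is a minimal plyr sketch of the same idea. It assumes every ID has exactly one row for 2005 and one for 2015; the column name `change` and the variable name `result` are just illustrative choices. I call `plyr::summarise` explicitly so it does not clash with dplyr's `summarise` if both packages are loaded.

```r
library(plyr)

# Same sample data as above, built as a plain data.frame
df <- data.frame(
  year  = c(2005, 2015, 2005, 2015, 2005, 2015),
  ID    = c("A", "A", "B", "B", "C", "C"),
  value = c(10, 20, 25, 5, 10, 15),
  stringsAsFactors = FALSE
)

# For each ID, subtract the 2005 value from the 2015 value.
# Indexing by year means the row order within an ID does not matter.
result <- ddply(df, "ID", plyr::summarise,
                change = value[year == 2015] - value[year == 2005])

# result is a data.frame with one row per ID and the columns ID and change
result
```

The original attempt, `x[2,] - x[1,]`, subtracts whole rows, so it also tries to subtract the year and ID columns. Restricting the calculation to `value` inside `summarise` avoids that.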

Wimpel
  • Thanks Wimpel for the quick response. Yes, it was meant to be -20 (just an illustration). I still have a question on this: do you know how I can get the final output as a data frame that also includes the IDs? At the moment I just get the delta. – slyn Jan 15 '19 at 13:23
  • @Snyawira just pass it to a variable using `newvar <- .....` – Wimpel Jan 15 '19 at 13:44