Suppose we have the following data set:

library(data.table)

t <- data.table(id    = c(1, 3, 1, 3, 4, 1, 3, 4),
                year  = c(2017, 2017, 2018, 2018, 2018, 2019, 2019, 2019),
                value = c(1, 1, 3, 2, 4, 5, 6, 9))

I would like to calculate, for each id and without reshaping the data into wide format, the year-over-year difference in value and store it in a new column. It can be safely assumed that every year is present, but not every id is present in every year.

The result should look like a column of: NA, NA, 2, 1, NA, 2, 4, 5

(the value in the current year minus the value in the previous year)

How would I go about performing this calculation? The standard assignment of variables in data.table does not seem to allow for it:

t[, diff := ???]
pogibas
Snowflake

2 Answers

Use the shift function from the data.table package.

# Works with the OP's data (t), given rows in year order within each id
t[, difference := value - shift(value), by = id]

PS:

  • Don't use t as an object (it's a base R function)
  • Don't use diff as a column name (it's a base R function)
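
A minimal, self-contained sketch of this approach on the question's data. The table is named `dt` here (a name chosen for this example) to avoid masking `t()`, and the rows are assumed to be in ascending year order within each id, as they are in the sample:

```r
library(data.table)

dt <- data.table(id    = c(1, 3, 1, 3, 4, 1, 3, 4),
                 year  = c(2017, 2017, 2018, 2018, 2018, 2019, 2019, 2019),
                 value = c(1, 1, 3, 2, 4, 5, 6, 9))

# shift() lags value by one row within each id;
# the first row of each id has no predecessor and gets NA
dt[, difference := value - shift(value), by = id]

dt$difference
# NA NA  2  1 NA  2  4  5
```

If the rows might not be sorted, `dt[order(year), difference := value - shift(value), by = id]` should apply the same calculation in year order while leaving the row order of the table unchanged.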
pogibas

Or use diff to take the difference of adjacent elements by 'id':

t[, difference := c(NA, diff(value)), by = id]
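
One caveat applies to both the `shift` and the `diff` approach: they compare each row with the previous row of the same id, which is the previous year only if that id has no gap. The sample data has no such gap, but the question allows for one. The following is a sketch of a gap-aware variant using an update join; the data, the table name `dt`, and the helper names `prev` and `prev_value` are hypothetical and chosen for illustration only:

```r
library(data.table)

# Hypothetical data: id 1 has no row for 2018
dt <- data.table(id    = c(1, 1, 3, 3, 3),
                 year  = c(2017, 2019, 2017, 2018, 2019),
                 value = c(1, 5, 1, 2, 6))

# Lookup table: each value re-labelled with the year it is "previous" to
prev <- dt[, .(id, year = year + 1, prev_value = value)]

# Update join on id and year; rows without a match in prev stay NA
dt[prev, on = .(id, year), difference := value - i.prev_value]

dt$difference
# NA NA NA  1  4
```

Here id 1 gets NA in 2019 because it has no 2018 row, whereas the row-based approaches would return 5 - 1 = 4 for that row. Which behaviour is wanted depends on how gaps should be treated.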
akrun