I have a table as following:
id name amount year
001 A 10 2010
001 A 10 2011
001 A 12 2012
-----------------------
002 A 3 2012
002 A 4 2013
-----------------------
003 B 20 2011
003 B 20 2012
(Note two entities have the same name A
but they are different, id
is the unique identifier.)
I want to compute the increment in amount
over the previous year, i.e. the result should look like:
id name increment year
001 A 0 2010
001 A 0 2011
001 A 2 2012
----------------------------
002 A 0 2012
002 A 1 2013
----------------------------
003 B 0 2011
003 B 0 2012
Note that the increment of the first year is considered "0".
In MSSQL, it can implemented by:
SELECT id,
name,
amount - LAG(amount, 1, amount) OVER (PARTITION BY id ORDER BY YEAR) as increment,
year
FROM table
I am trying to accomplish the task in R with data.table. I found an succinct example here:
DT[, increment := amount - shift(amount, 1), by=id]
. But error was prompted: could not find function "shift"
.
The versions are:
- R: 3.2.0_1
- data.table: 1.9.4
The questions are:
- I found the
shift
function implemented on data.table's Github, why I failed to invoke the function? - I think that
by
in data.table is equivalent toPARTITION BY
in SQL, then what is the counterpart ofORDER BY
in R? Do I have to set the key of data.table before carrying out any aggregation so the data.table is ordered?