R how to calculate 'Percentage of Change' between two rows?

Question

I like to compute 'Percentage of change between two rows' while Item changes in rows.

For first row of items, I am ok to place some text or blank (example ‘skip’ or blank is ok) because of no comparison of previous row for same item

df <- data.frame(period=rep(1:4,2), 
             farm=c(rep('A',4),rep('B',4)), 
             cumVol=c(1,5,15,31,10,12,16,24), item=c(rep('Z',3),rep('Y',3),rep('X',2)),
             other = 1:8);

My desired out put of 'Percentage_Change' column (additional column to dataset) would be like below. (Formula: 100/Previous Item row value * current row Item value...Example for 2nd row..100/1 * 5 = 500)

Row Percentage_Change
1   Skip
2   500
3   300
4   Skip
5   32.25806452
6   120
7   Skip
8   150

Thanks for all your help.

5th Value is wrong. It suppose to be 32.25. Seems like it calculating only above 100%. — Murali, Mar 20 '16 at 07:00
5th value is ....100/31 (4th row value) * 10 (5th row value) = 32.25 — Murali, Mar 20 '16 at 07:02

score 5 · Answer 1 · edited May 23 '17 at 12:00

The same can be done with dplyr. This is completely analogous to akrun's answer, but uses other syntax.

library(dplyr)
mutate(df, Row = 1:n()) %>%
  group_by(item) %>%
  mutate(Percentage_Change = cumVol/lag(cumVol) * 100) %>%
  ungroup %>%
  select(Row, Percentage_Change)
## Source: local data frame [8 x 2]
## 
##     Row Percentage_Change
##   (int)             (dbl)
## 1     1                NA
## 2     2         500.00000
## 3     3         300.00000
## 4     4                NA
## 5     5          32.25806
## 6     6         120.00000
## 7     7                NA
## 8     8         150.00000

akrun · Accepted Answer · 2016-03-20T07:04:39.080

2

We can use shift from data.table. Convert the 'data.frame' to 'data.table' (setDt(df)), grouped by 'item', we get the "Row" from .I and create the "Percentage_Change" by dividing the "cumVol" by the lag of "cumVol" (got by shift) and multiplying by 100. If needed the grouping column can be removed by assigning (:=) it to NULL.

library(data.table)
setDT(df)[, list(Row = .I, Percentage_Change=round(cumVol*
   (100/shift(cumVol)),2)), .(item)][, item := list(NULL)][]
#   Row Percentage_Change
#1:   1               NA
#2:   2           500.00
#3:   3           300.00
#4:   4               NA
#5:   5            32.26
#6:   6           120.00
#7:   7               NA
#8:   8           150.00

NOTE: This gives NA for elements where there are no comparison, instead of the Skip.

edited Mar 20 '16 at 07:04

answered Mar 20 '16 at 06:53

akrun

874,273
37
540
662

@Akrun..can you help me in 5th value please? I did overlooked that value. – Murali Mar 20 '16 at 07:04
1

@Murali It is changed. Please check – akrun Mar 20 '16 at 07:04
1

I just tested the code on my big dataset and worded perfect for less than 100 values. Great job and I love you :-) – Murali Mar 20 '16 at 07:11
How could you apply this to categories -1 and 1 for `% change <= .8` and `% change >= 1.2` and also to look back at values for the past hour instead of just the last row – zsad512 Sep 07 '17 at 00:55
@zsad512 Please post as a new question as your comment is not clear to me – akrun Sep 07 '17 at 04:52
@akrun https://stackoverflow.com/questions/46084728/breaking-a-continuous-variable-into-categories-using-dplyr-and-or-cut/46086107?noredirect=1#comment79135217_46086107 – zsad512 Sep 07 '17 at 13:20

R how to calculate 'Percentage of Change' between two rows?

2 Answers2

Linked