Referring to previous row in calculation

Question

I'm new to R and can't get to grips with how to refer to the previous value of a variable inside its own calculation; in this case the previous "b", which I tried to write as b[-1].

b <- ( ( 1 / 14 ) * MyData$High + (( 13 / 14 )*b[-1]))

Obviously I need a NA somewhere in there for the first calculation, but I just couldn't figure this out on my own.

Adding example of what the sought after result should be (A=MyData$High):

  A  b
1 5  NA
2 10 0.7142...
3 15 3.0393...
4 20 4.6079...
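
For reference, a small sketch of why b[-1] does not behave like "previous b" in R. The vector values here are made up for illustration.

```r
b <- c(10, 20, 30, 40)

# A negative index removes elements; it does not shift the vector.
b[-1]                       # 20 30 40 (one element shorter)

# One base-R way to line each element up with its predecessor.
prev <- c(NA, head(b, -1))  # NA 10 20 30
prev
```

A shifted copy like prev only helps when the previous values are already known. Here each b depends on the b computed just before it, so the calculation has to be carried forward step by step.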

Welcome to SO, this is a pretty good first question +1, however it is not necessary to put polite thanks in the question here, though it seems nice it tends to distract from the question and it is better to show appreciation by upvoting or accepting answers. — Vality, Jan 06 '16 at 01:19
You may want to look up and try lag() from dplyr. Without knowing the data and your desired output, it is very hard to help with an answer. — Gopala, Jan 06 '16 at 01:19
Please provide a minimal reproducible example. This will make your question a "pretty good" first question. — , Jan 06 '16 at 01:26
@Vality - I don't find sign-offs like these distracting (particularly at the end of the post; it's a different story if there's clutter at the beginning that sends the key post content "below the fold"). I personally don't think there's much value in edits that simply remove polite concluding pleasantries, but SO has [mixed emotions about them](http://meta.stackexchange.com/q/2950/262180), and I'm probably guilty of such edits myself. — jbaums, Jan 06 '16 at 01:49
@jbaums In truth, I usually would not bother with such an edit, but I found this question in the first post queue so thought it sensible to mention this to the OP. I just assumed they were considered inappropriate, as experienced users removed them from several of my own early posts. — Vality, Jan 06 '16 at 01:49
@JennyD - I don't see how your formula for `b` leads to the values shown in your example output. Can you show how you arrive at 0.7142 for row 2, and 3.0393 for row 3? — jbaums, Jan 06 '16 at 03:52
Looking at the solutions, I guess it's a typo. You should correct that for clarity to future generations. — jbaums, Jan 06 '16 at 05:00

G. Grothendieck · Accepted Answer · 2016-01-06T11:20:19.953

1) for loop. Normally one would just use a simple loop for this:

MyData <- data.frame(A = c(5, 10, 15, 20))


MyData$b <- 0
n <- nrow(MyData)
if (n > 1) for(i in 2:n) MyData$b[i] <- ( MyData$A[i] + 13 * MyData$b[i-1] )/ 14
MyData$b[1] <- NA

giving:

> MyData
   A         b
1  5        NA
2 10 0.7142857
3 15 1.7346939
4 20 3.0393586
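
The loop above can also be wrapped in a small helper that works on a plain vector. This is only a sketch: the name ema_loop and its weight argument are hypothetical and not part of the original answer.

```r
# Hypothetical helper: b[i] = weight * A[i] + (1 - weight) * b[i - 1],
# seeded with b[1] = 0 and then reported as NA, as in the loop above.
ema_loop <- function(A, weight = 1 / 14) {
  n <- length(A)
  b <- numeric(n)                # preallocate, all zeros
  for (i in seq_len(n)[-1]) {    # 2..n; empty when n < 2
    b[i] <- weight * A[i] + (1 - weight) * b[i - 1]
  }
  if (n > 0) b[1] <- NA
  b
}

MyData <- data.frame(A = c(5, 10, 15, 20))
MyData$b <- ema_loop(MyData$A)
MyData
```

Iterating over seq_len(n)[-1] instead of 2:n avoids the need for a separate n > 1 check, because 2:1 would otherwise count backwards.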

2) Reduce. It would also be possible to use Reduce. One first defines a function f that carries out the body of the loop and then has Reduce invoke it repeatedly, like this:

f <- function(b, A) (A + 13 * b) / 14
MyData$b <- Reduce(f, MyData$A[-1], 0, accumulate = TRUE)
MyData$b[1] <- NA

giving the same result.

This gives the appearance of being vectorized but in fact if you look at the source of Reduce it does a for loop itself.
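
To see what the accumulating form of Reduce returns, here is a minimal sketch on a made-up input, using a running sum as a stand-in for f:

```r
# With accumulate = TRUE, Reduce returns every intermediate value,
# beginning with the initial value itself.
steps <- Reduce(function(acc, x) acc + x, c(1, 2, 3), 0, accumulate = TRUE)
steps  # 0 1 3 6
```

The result is one element longer than the input because the initial value is included. That is why the answer passes MyData$A[-1] rather than the whole column: the initial 0 takes the place of the first row.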

3) filter. The problem has the form of a recursive filter with coefficient 13/14 operating on A/14 (with A[1] replaced by 0), so we can write the following. Since filter returns a time series, we use c(...) to convert it back to an ordinary vector. This approach actually is vectorized, as the filter operation is performed in C.

MyData$b <- c(stats::filter(replace(MyData$A, 1, 0) / 14, 13 / 14, method = "recursive"))
MyData$b[1] <- NA

again giving the same result.
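
To make the recursive filter concrete, here is a tiny sketch on a made-up input. With a single coefficient f, each output is y[i] = x[i] + f * y[i - 1], with the value before the first element taken as zero.

```r
x <- c(1, 1, 1)

# y[1] = 1, y[2] = 1 + 0.5 * 1, y[3] = 1 + 0.5 * 1.5
y <- as.numeric(stats::filter(x, 0.5, method = "recursive"))
y  # 1.00 1.50 1.75
```

If dplyr is attached, a bare filter() call resolves to dplyr::filter instead, so spelling out stats::filter avoids surprises.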

Note: All solutions assume that MyData has at least 1 row.

added 2nd and 3rd approaches — G. Grothendieck, Jan 06 '16 at 05:17

NGaffney · Answer 2 · 2016-01-06T04:26:05.287

There are a couple of ways you could do this.

The first method is a simple loop

df <- data.frame(A = seq(5, 25, 5))
df$b <- 0

for(i in 2:nrow(df)){
  df$b[i] <- (1/14)*df$A[i]+(13/14)*df$b[i-1]
}

df
   A         b
1  5 0.0000000
2 10 0.7142857
3 15 1.7346939
4 20 3.0393586
5 25 4.6079758

This doesn't give the exact values given in the expected answer, but it's close enough that I've assumed you made a transcription mistake. Note that we have to assume that we can take the NA in df$b[1] as being zero or we get NA all the way down.

If you have heaps of data or need to do this many times, the speed could be improved by implementing the loop in C++ and calling it from R (for example via Rcpp).

The second method uses the R function sapply

The form you present the problem in

$b_i = \frac{1}{14}A_i + \frac{13}{14}b_{i-1}$

is recursive, which makes it hard to vectorise directly. However, we can do some maths and find that, taking $b_0 = 0$, it is equivalent to

$b_i=\frac{1}{14}\sum_{j=1}^{i}{\left( \frac{13}{14}\right)^{(i-j)}A_j}$
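
As a sketch of the maths step, unrolling the recurrence for the first few terms (assuming the starting value $b_0 = 0$) gives

$b_1 = \frac{1}{14}A_1$

$b_2 = \frac{1}{14}A_2 + \frac{13}{14}b_1 = \frac{1}{14}\left(A_2 + \frac{13}{14}A_1\right)$

$b_3 = \frac{1}{14}A_3 + \frac{13}{14}b_2 = \frac{1}{14}\left(A_3 + \frac{13}{14}A_2 + \left(\frac{13}{14}\right)^{2}A_1\right)$

Each step multiplies all earlier terms by another factor of $\frac{13}{14}$, so $A_j$ ends up with weight $\left(\frac{13}{14}\right)^{(i-j)}$ in $b_i$.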

We can then write a function that calculates b_i and use sapply to apply it to each index:

# b_n = (1/14) * sum over j = 1..n of (13/14)^(n - j) * A[j]
calc_b <- function(n, A) {
  j <- seq_len(n)
  (1 / 14) * sum((13 / 14)^(n - j) * A[j])
}

df2 <- data.frame(A = seq(10,25,5))
df2$b <- sapply(seq_along(df2$A), calc_b, df2$A)
df2
   A         b
1 10 0.7142857
2 15 1.7346939
3 20 3.0393586
4 25 4.6079758

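Note: We need to drop the first row (where A = 5) for the results to line up with the loop, because the closed form treats the value before its first element as zero, just as the loop treats df$b[1] as zero.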
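
As a quick sanity check, the closed form can be compared against the recursive form on the same input. This is a sketch under the same assumption as above, that the value before the first element is zero; the variable names are made up for illustration.

```r
A <- seq(10, 25, 5)

# Closed form: weights (13/14)^(i - j) for j = 1..i.
closed <- sapply(seq_along(A), function(i) {
  j <- seq_len(i)
  sum((13 / 14)^(i - j) * A[j]) / 14
})

# Recursive form, seeded with zero before the first element.
looped <- numeric(length(A))
prev <- 0
for (i in seq_along(A)) {
  looped[i] <- (A[i] + 13 * prev) / 14
  prev <- looped[i]
}

all.equal(closed, looped)  # TRUE
```

Note that the sapply version recomputes the whole sum for every index, so its work grows quadratically with the number of rows, while the loop does a fixed amount of work per row.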
