
I would like to create a new column esg.ordered$flowpct <- esg.ordered$flow[i]/lag(esg.ordered$size[i]) in my data frame esg.ordered, but only for rows where the value (name) in column fundid is the same as in the previous row. Otherwise, column flowpct should contain "NA" in the respective rows. Here is my code:

for (i in esg.ordered) {
  if(esg.ordered$fundid[i]==lag(esg.ordered$fundid[i],n=1)){
    esg.ordered$flowpct <- esg.ordered$flow[i]/lag(esg.ordered$size[i])
  }else{
    esg.ordered$flowpct <- "NA"
  }
}

Unfortunately, I get an error and a warning:

  1. Error in if (esg.ordered$fundid[i] == lag(esg.ordered$fundid[i], n = 1)) { : missing value where TRUE/FALSE needed

  2. Warning: In if (esg.ordered$fundid[i] == lag(esg.ordered$fundid[i], n = 1)) { : the condition has length > 1 and only the first element will be used

Can you help me fix these?

Here is the data:

fundid size flow
FS00008KNP 78236537 7038075.43
FS00008KNP 73048868 -5691940.56
FS00008KNP 74688822 -193188.79
FS00008KNP 95330799 11991514.11
FS00008L0W 44170465 -15706588.66
FS00008L0W 33278560 -12749545.90
FS00008L0W 26084262 -6879079.19
FS00008L0W 23857701 -3227825.03
CodingGirl
  • Can you provide us with a [Minimal Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example)? – DeBARtha Sep 15 '21 at 10:10
  • Welcome to SO. You also appear to be attempting to store both character and numeric values in `flowpct`. That's likely to cause problems later on. Some sample data would help us to help you. [This post](https://stackoverflow.com/help/minimal-reproducible-example) may help you create a minimum working example. Finally, a good rule of thumb when using R is "if I'm using a loop, there's probably a better way to do it". This, I suspect, is a case in point. – Limey Sep 15 '21 at 10:12

1 Answer

  1. "NA" is not the same as NA (which might be appropriate there).

  2. for (i in esg.ordered) is wrong: it is iterating over each column in your frame named esg.ordered, so i is a full vector. I think you mean for (i in seq_len(nrow(esg.ordered))).

  3. The error missing value where TRUE/FALSE needed is easily searched and should return (among other links) Error in if/while (condition) {: missing Value where TRUE/FALSE needed. It occurs because the condition in if evaluates to NA.

  4. You appear to be operating on a whole vector at a time. This is a literal translation of what you are trying to do, but without the for loop:

    esg.ordered$flowpct <- ifelse(
      c(TRUE, esg.ordered$fundid[-1] == esg.ordered$fundid[-nrow(esg.ordered)]),
      esg.ordered$flow / c(NA, esg.ordered$size[-nrow(esg.ordered)]),
      NA)
    esg.ordered
    #       fundid     size        flow      flowpct
    # 1 FS00008KNP 78236537   7038075.4           NA
    # 2 FS00008KNP 73048868  -5691940.6 -0.072752972
    # 3 FS00008KNP 74688822   -193188.8 -0.002644651
    # 4 FS00008KNP 95330799  11991514.1  0.160552996
    # 5 FS00008L0W 44170465 -15706588.7           NA
    # 6 FS00008L0W 33278560 -12749545.9 -0.288644140
    # 7 FS00008L0W 26084262  -6879079.2 -0.206712045
    # 8 FS00008L0W 23857701  -3227825.0 -0.123746075
    

    However, this relies wholly on fundid being ordered correctly. I think a safer way to go is this, using ave to get the previous size within the current fundid, and then dividing:

    esg.ordered$flowpct <- with(esg.ordered,
      flow / ave(size, fundid, FUN = function(z) c(NA, z[-length(z)])))
    

    Same results as above, but much safer.
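For completeness, points 1 to 3 can also be applied to the loop itself. What follows is only a minimal sketch, not a recommendation over the vectorised versions: it rebuilds the sample data from the question, assumes fundid contains no missing values, and compares each row with the previous one by index (`i - 1`) instead of calling lag on a single element.

```r
# Sample data copied from the question.
esg.ordered <- data.frame(
  fundid = rep(c("FS00008KNP", "FS00008L0W"), each = 4),
  size   = c(78236537, 73048868, 74688822, 95330799,
             44170465, 33278560, 26084262, 23857701),
  flow   = c(7038075.43, -5691940.56, -193188.79, 11991514.11,
             -15706588.66, -12749545.90, -6879079.19, -3227825.03)
)

# Pre-fill with a real NA so the column stays numeric;
# assigning the string "NA" would turn it into a character column.
esg.ordered$flowpct <- NA_real_

# Iterate over row indices, not over columns. Row 1 has no previous row,
# so it is skipped and keeps its NA.
for (i in seq_len(nrow(esg.ordered))[-1]) {
  # Assumes fundid has no NA; otherwise this condition would itself be NA.
  if (esg.ordered$fundid[i] == esg.ordered$fundid[i - 1]) {
    esg.ordered$flowpct[i] <- esg.ordered$flow[i] / esg.ordered$size[i - 1]
  }
}

is.na("NA")           # FALSE: "NA" is an ordinary two-letter string
class(c(0.16, "NA"))  # "character": mixing coerces the numbers to strings
```

Like the ifelse version, this loop only looks at the immediately preceding row, so it still depends on the rows of each fundid being contiguous and in order.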

r2evans
  • Thank you for helping me :) trying my best to understand your suggestions and I will give you my data in a sec – CodingGirl Sep 15 '21 at 12:08
  • See the edit. This is still a dupe of the two links provided (the error and the warning), I suggest you click on the dupe-links and read up on several of the excellent answers/comments about the problems. As for the calculation itself, I offer two suggestions, I urge you to use the second (`ave`) as it is much more robust to ordering of the data (i.e., `fundid` does not need to be contiguous/ordered). – r2evans Sep 15 '21 at 13:33
  • Thank you so much again, I have tried your suggestion and it works! You really saved the day!! :) – CodingGirl Sep 15 '21 at 13:39