I have been calculating percentage change in carbon stocks in plots in counties over multiple years.

|COUNTY|PLOT|INVYR|PCT_CHNG|
|------|----|-----|--------|
|1     |1   |2010 |5       |
|1     |1   |2013 |7       |
|1     |2   |2012 |-4      |
|1     |2   |2017 |5       |
|1     |3   |2010 |9       |

The dataset above isn't the actual dataset but has all its basic qualities (I am new to Stack Exchange!). The central problem is that when I calculate percentage change across counties and plots, I need a way of resetting the calculation whenever the PLOT changes: the first PCT_CHNG reading for a PLOT should always be NA, because it is the first one in a "new" sequence of PLOTs. Calculating it looked like this:

x <- as.data.frame(Overstory.C) %>%
  group_by(POOL) %>%
  arrange(COUNTYCD, PLOT, INVYR, .by_group = TRUE) %>%
  mutate(pct_chng = (C_kg_m2 / lag(C_kg_m2) - 1)*100)

In an attempt to remedy this problem I have tried:

x$pct_chng <- x$pct_chng[x$PLOT != lag(x$PLOT)] <- NA

But this just sets every PCT_CHNG to NA, rather than only the rows where the PLOT ID changes.

And:

for (i in 1:seq_along(x$PLT_CN)) { ### PLT_CN = a unique identifying code in the dataset ###
  if (x$PLOT[i] != lag(x$PLOT[i])){
    x$pct_chng[i] <- NA
  } else {
    x$pct_chng == x$pct_chng
  }
}

But this just produces the error:

Warning: numerical expression has 99946 elements: only the first used
Error in if (x$PLOT[i] != lag(x$PLOT[i])) { : missing value where TRUE/FALSE needed

Now I am at a loss as to what to do. Any help would be wonderful! Let me know if I need to explain further.

  • Please use `dput(db)` (https://stackoverflow.com/a/49995752/11570343) to share your data, and show your expected output. – cdcarrion Sep 17 '22 at 17:11

1 Answer

Hard to know for sure without your data, but if it is the case that your main problem is wanting to be able to calculate percent change, resetting whenever the PLOT changes, I would use the following code:

library(dplyr)

df %>%
  group_by(POOL) %>%
  mutate(pct_change = if_else(
    PLOT == lag(PLOT),
    (C_kg_m2 - lag(C_kg_m2)) / lag(C_kg_m2) * 100,  # same PLOT as previous row
    NA_real_                                        # PLOT changed
  ))

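This way, the first observation of each new PLOT value is NA; otherwise the percent change is calculated from the previous row. Note that this relies on the rows already being sorted by PLOT and INVYR within each POOL.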
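A possible alternative, offered only as a sketch: if `COUNTYCD` and `PLOT` together identify a plot within a `POOL` (an assumption, since the real data was not posted), adding them to `group_by()` makes `lag()` restart in every plot, so the first reading of each plot comes out as NA without an explicit comparison. The data frame `carbon` and its values below are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data using the question's column names (values are invented)
carbon <- data.frame(
  POOL     = "Overstory",
  COUNTYCD = c(1, 1, 1, 1, 1),
  PLOT     = c(1, 1, 2, 2, 3),
  INVYR    = c(2010, 2013, 2012, 2017, 2010),
  C_kg_m2  = c(10, 11, 20, 19, 5)
)

result <- carbon %>%
  group_by(POOL, COUNTYCD, PLOT) %>%    # one group per plot within a county
  arrange(INVYR, .by_group = TRUE) %>%  # chronological order inside each plot
  mutate(pct_chng = (C_kg_m2 / lag(C_kg_m2) - 1) * 100) %>%
  ungroup()

print(result)
```

Because `lag()` is evaluated per group, it never reaches back into the previous plot. This also covers the case where the same `PLOT` number appears in two different counties.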

victorj