I have been calculating the percentage change in carbon stocks for plots within counties over multiple years.
|COUNTY|PLOT|INVYR|PCT_CHNG|
|------|----|-----|--------|
|1 |1 |2010 |5 |
|1 |1 |2013 |7 |
|1 |2 |2012 |-4 |
|1 |2 |2017 |5 |
|1 |3 |2010 |9 |
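For anyone who wants to experiment, the toy table above can be rebuilt as a data frame along these lines. This is only a minimal sketch: the object name `toy` is a placeholder, and the values are the illustrative ones from the table, not real data.

```r
# Rebuild the illustrative table shown above.
# `toy` is a placeholder name; the values are made up for the example.
toy <- data.frame(
  COUNTY   = c(1, 1, 1, 1, 1),
  PLOT     = c(1, 1, 2, 2, 3),
  INVYR    = c(2010, 2013, 2012, 2017, 2010),
  PCT_CHNG = c(5, 7, -4, 5, 9)
)
toy
```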
The dataset above isn't the actual dataset, but it has all of its basic qualities (I am new to Stack Exchange!). The central problem is that when I calculate percentage change across counties and plots, I need a way to reset the calculation whenever the PLOT changes. The first PCT_CHNG value for a PLOT should always be NA, because it is the first reading in a "new" sequence for that PLOT. My calculation looked like this:
x <- as.data.frame(Overstory.C) %>%
  group_by(POOL) %>%
  arrange(COUNTYCD, PLOT, INVYR, .by_group = TRUE) %>%
  mutate(pct_chng = (C_kg_m2 / lag(C_kg_m2) - 1) * 100)
In an attempt to remedy this problem I have tried:
x$pct_chng <- x$pct_chng[x$PLOT != lag(x$PLOT)] <- NA
But this just sets every PCT_CHNG value to NA, rather than only the rows where the PLOT ID changes.
And:
for (i in 1:seq_along(x$PLT_CN)) { ### PLT_CN = a unique identifying code in the dataset ###
  if (x$PLOT[i] != lag(x$PLOT[i])){
    x$pct_chng[i] <- NA
  } else {
    x$pct_chng == x$pct_chng
  }
}
But this just produces a warning followed by an error:
Warning: numerical expression has 99946 elements: only the first used
Error in if (x$PLOT[i] != lag(x$PLOT[i])) { : missing value where TRUE/FALSE needed
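To make the target concrete, this is the shape of the result I am hoping for, written out by hand on a toy example. The object name `wanted` and the `C_kg_m2` values are invented for illustration; only the column names come from my real data.

```r
# Hand-written illustration of the desired output (not computed by my code).
# `wanted` is a placeholder name and the C_kg_m2 values are invented.
wanted <- data.frame(
  COUNTYCD = c(1, 1, 1, 1, 1),
  PLOT     = c(1, 1, 2, 2, 3),
  INVYR    = c(2010, 2013, 2012, 2017, 2010),
  C_kg_m2  = c(10, 11, 20, 19, 5)
)

# The first row of each PLOT should be NA; every other row should be the
# change relative to the previous row of the same PLOT:
#   plot 1: (11 / 10 - 1) * 100 =  10
#   plot 2: (19 / 20 - 1) * 100 =  -5
wanted$pct_chng <- c(NA, 10, NA, -5, NA)
wanted
```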
Now I am at a loss as to what to do. Any help would be wonderful! Let me know if I need to explain further.