-2

I wanted to ask for your advice and guidance for coding in R. Specifically, my goals are:

  1. Make the column Previous-month-value stability (see example below): The question is 'for how long was the previous month value stable consecutively in months?'

  2. Make the column "Stability" (see example below): I'm trying to show 'after how many consecutive months was there a change in the IV'. For instance, in Month 10 for Group 1, the change (from 0.2 to 0.4) occurred after 2 consecutive months (Month 8, 9) of stable IV.

  3. Make the column "Change in IV" (see example below): I want to show the amount of change only for consecutive months. For instance, Month 6 to 8 for Group 1 would be 'n/a' because there is no Month 7 for this group.

Currently, I have the first three columns ("Group", "Month", "IV"):

structure(list(Group = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2), Month = c(3, 
4, 5, 6, 8, 9, 10, 5, 6, 7, 9), IV = c(0.1, 0.1, 0.5, 0.2, 0.2, 
0.2, 0.4, 0.3, 0.4, 0.4, 0.4)), class = "data.frame", row.names = c(NA, 
-11L), codepage = 65001L)

The end results would look like:

structure(list(Group = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2), Month = c(3, 
4, 5, 6, 8, 9, 10, 5, 6, 7, 9), IV = c(0.1, 0.1, 0.5, 0.2, 0.2, 
0.2, 0.4, 0.3, 0.4, 0.4, 0.4), Previous_month_stability = c(NA, 
1, 2, 1, NA, 1, 2, NA, 1, 1, NA), Change_IV = c(NA, 0, 0.4, -0.3, 
NA, 0, 0.2, NA, 0.1, 0, NA), Stability2 = c(NA, 0, 2, 1, NA, 
0, 2, NA, 1, 0, NA)), class = "data.frame", row.names = c(NA, 
-11L), codepage = 65001L)

In a table, it would look like:

╔═══════╦═══════╦══════╦══════════════════════════╦══════════════╦════════════╗
║ Group ║ Month ║ IV   ║ Previous_month_stability ║ Change in IV ║ Stability2 ║
╠═══════╬═══════╬══════╬══════════════════════════╬══════════════╬════════════╣
║ 1     ║ 3     ║ 0.10 ║ n/a                      ║ n/a          ║ n/a        ║
╠═══════╬═══════╬══════╬══════════════════════════╬══════════════╬════════════╣
║ 1     ║ 4     ║ 0.10 ║ 1                        ║ 0            ║ 0          ║
╠═══════╬═══════╬══════╬══════════════════════════╬══════════════╬════════════╣
║ 1     ║ 5     ║ 0.50 ║ 2                        ║ 0.40         ║ 2          ║
╠═══════╬═══════╬══════╬══════════════════════════╬══════════════╬════════════╣
║ 1     ║ 6     ║ 0.20 ║ 1                        ║ -0.30        ║ 1          ║
╠═══════╬═══════╬══════╬══════════════════════════╬══════════════╬════════════╣
║ 1     ║ 8     ║ 0.20 ║ n/a                      ║ n/a          ║ n/a        ║
╠═══════╬═══════╬══════╬══════════════════════════╬══════════════╬════════════╣
║ 1     ║ 9     ║ 0.20 ║ 1                        ║ 0            ║ 0          ║
╠═══════╬═══════╬══════╬══════════════════════════╬══════════════╬════════════╣
║ 1     ║ 10    ║ 0.40 ║ 2                        ║ 0.2          ║ 2          ║
╠═══════╬═══════╬══════╬══════════════════════════╬══════════════╬════════════╣
║ 2     ║ 5     ║ 0.30 ║ n/a                      ║ n/a          ║ n/a        ║
╠═══════╬═══════╬══════╬══════════════════════════╬══════════════╬════════════╣
║ 2     ║ 6     ║ 0.40 ║ 1                        ║ 0.10         ║ 1          ║
╠═══════╬═══════╬══════╬══════════════════════════╬══════════════╬════════════╣
║ 2     ║ 7     ║ 0.40 ║ 1                        ║ 0            ║ 0          ║
╠═══════╬═══════╬══════╬══════════════════════════╬══════════════╬════════════╣
║ 2     ║ 9     ║ 0.40 ║ n/a                      ║ n/a          ║ n/a        ║
╚═══════╩═══════╩══════╩══════════════════════════╩══════════════╩════════════╝
user14250906
  • 197
  • 8
  • To make sure others can really help you, please read into [how to make a reproducible example in R](https://stackoverflow.com/q/5963269/1335174) - for example using `dput`. Your table looks fancy but is really not useful to copy into an R environment and test a solution. – alex_jwb90 Sep 10 '20 at 00:00
  • Thank you so much, @alex_jwb90 for your comment. I updated the question accordingly! – user14250906 Sep 10 '20 at 02:13

1 Answers1

1

To create both columns, "Change" and "Stability in IV", try the following, using dplyr:

library(dplyr)
library(zoo)

df2 <- df %>%
  mutate(
    Group = na.locf(Group)
  ) %>%
  arrange(Group, Month) %>%
  group_by(Group) %>%
  mutate(
    tmp_continuous_block = cumsum(coalesce(Month - lag(Month) > 1, F))
  ) %>%
  group_by(Group, tmp_continuous_block) %>%
  mutate(
    Change = IV - lag(IV),
    `Previous Month Stability` = case_when(
      is.na(Change) ~ NA_integer_,
      T ~ lag(sequence(rle(IV)$lengths))
    ),
    `Stability in IV` = ifelse(Change == 0, 0L, `Previous Month Stability`)
  ) %>%
  ungroup() %>%
  select(-starts_with("tmp_"))
alex_jwb90
  • 1,663
  • 1
  • 11
  • 20
  • Sorry for some hickups. This works now on my end, after making use of the `rle` function in a similar way as suggested [by Mark over here](https://stackoverflow.com/a/18669187/1335174) – alex_jwb90 Sep 10 '20 at 00:02
  • Thank you so much for your guidance. I added another column in the example above, "Previous_month_stability," which indicates the duration of the previous-month-IV being unchanged. I was wondering how the following part could be modified! ***Stability in IV` = case_when(is.na(Change) ~ NA_integer_, Change == 0 ~ 0L, T ~ lag(sequence(rle(IV)$lengths)*** – user14250906 Sep 11 '20 at 05:04
  • The way I understand this column, it is the same as `Stability in IV`, but it is not zeroed for the months while they are still stable, right? Then just leave the `Change == 0 ~ 0L,` out. If you create it in the mutate before `Stability in IV`, you can create that column as ` ```Stability in IV`` = ifelse(Change == 0, 0L, `Previous Month Stability`) ` – alex_jwb90 Sep 11 '20 at 07:46
  • Um, I messed up the backticks above. You'll get it, I think :) – alex_jwb90 Sep 11 '20 at 07:53
  • Thank you so much! I learn a lot from you. For group_by(tmp_continuous_block), I added Group--group_by(Group, tmp_continuous_block), which solved the issue that I came across where the 'Previous Month Stability" was not computed for the second group! – user14250906 Sep 17 '20 at 02:06