I'm quite new to R, so please be gentle.
I have a dataset that contains multiple columns that are repeated measures for several years. I have and ID-variable that identify the specific individual per row that these measures relate to.
I also have a variable that include a year per row.
So my data looks something like this
ID | Y2005_N0X | Y2006_N0X | Y2007_N0X | Y2008_N0X | Y2005_N06 | Y2006_N06 | Y2007_N06 | Y2008_N0X | YEAR |
---|---|---|---|---|---|---|---|---|---|
1 | 5 | 6 | 7 | 8 | 9 | 94 | 29 | 69 | 2005 |
2 | 6 | 7 | 8 | 9 | 9 | 39 | 59 | 39 | 2007 |
and so forth...
I want to create a new column that takes the value from 2005 for ID 1, and the value from 2007 from ID 2, and names this new column prioryear_N0X.
I have several of these columns so I want to create a command that generate multiple new columns based on those conditions, so that I get a new column called prioryear_N1X, prioryear_N06, and so forth. It is also crucial that the ID column are intact.
After this I also want to create new columns called thisyear_N0X and so forth which takes the value from YEAR+1 (for example 2004+1=2005) and match to the present variables called Y2005_N0X and so forth.
ID | Y2005_N0X | Y2006_N0X | Y2007_N0X | Y2008_N0X | Y2005_N06 | Y2006_N06 | Y2007_N06 | Y2008_N0X | YEAR | prioryear_N0X | prioryear_N06 | thisyear_N0X | thisyear_N06 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 5 | 6 | 7 | 8 | 9 | 94 | 29 | 69 | 2005 | 5 | 9 | 6 | 94 |
2 | 6 | 7 | 8 | 9 | 9 | 39 | 59 | 39 | 2007 | 8 | 59 | 9 | 39 |
Suggestions?
How do I use the command if I have variables that also starts with R?
I have tried the following, but receive an error.
df %>%
pivot_longer(cols = -c(ID, YEAR), names_to = c("header_year", ".value"), names_pattern = "Y(\\d+)_(\\w+)", names_transform = list(header_year = as.integer)) %>%
group_by(ID) %>%
mutate(across(starts_with("N", "R"), function(x) { x[YEAR == header_year] }, .names = "prioryear_{.col}"),
across(starts_with("N", "R"), function(x) { x[YEAR + 1 == header_year] }, .names = "thisyear_{.col}")) %>%
pivot_wider(id_cols = c(ID, YEAR, starts_with("prioryear"), starts_with("thisyear")), names_from = header_year, values_from = starts_with("N", "R"))