Create new variables based on list, then populate based on whether row contains variable name

Question

I have some data:

df = data.frame(matrix(rnorm(20), nrow=10))
         X1          X2
1   1.17596402  0.06138821
2  -1.76439330  1.03674803
3  -0.39069424  0.61616793
4   0.68375346  0.27435354
5   0.27426476 -1.71226109
6  -0.06153577  1.14514453
7  -0.37067621 -0.61243104
8   1.11107852  0.47788971
9  -1.73036658  0.31545148
10 -1.83155718 -0.14433432

I want to add a new variable to it for each element of a character vector, whose contents can change:

list = c("a","b","c")

The result should be:

           X1          X2  a  b  c
1   1.17596402  0.06138821 NA NA NA
2  -1.76439330  1.03674803 NA NA NA
3  -0.39069424  0.61616793 NA NA NA
4   0.68375346  0.27435354 NA NA NA
5   0.27426476 -1.71226109 NA NA NA
6  -0.06153577  1.14514453 NA NA NA
7  -0.37067621 -0.61243104 NA NA NA
8   1.11107852  0.47788971 NA NA NA
9  -1.73036658  0.31545148 NA NA NA
10 -1.83155718 -0.14433432 NA NA NA

I can do this using suggestions below:

df[list] <- NA

But now, I want to search every row for the variable name as a value and flag if it contains that value. For example:

   X1 X2 a b c
1   a  b 1 1 0
2   a  c 1 0 1

So the code would search for "a" in all columns and flag if any column contains "a". How do I do this?

*But now, I want to search every row for the variable name as a value and flag if it contains that value.* This is a completely different question. You should ask a separate question with a link to this one. I will vote to leave closed. – Rui Barradas, Dec 02 '18 at 09:46

Sven Hohenstein · Answer 1 · 2018-12-01T15:23:13.067

You can use

df[list] <- NA

The result:

            X1          X2  a  b  c
1  -2.07205164 -0.93585363 NA NA NA
2   1.11014587  0.23468072 NA NA NA
3  -1.17909665  0.04741478 NA NA NA
4   0.23955056  1.02029880 NA NA NA
5  -0.79212220 -1.13485661 NA NA NA
6  -0.57571547  0.33069641 NA NA NA
7  -0.70063920 -0.17251563 NA NA NA
8   1.90625189  0.30277177 NA NA NA
9   0.09029121 -0.72104778 NA NA NA
10 -1.36324313 -1.48041873 NA NA NA

If you want to add only the variables that are not present in df, you can use:

df[list[!list %in% names(df)]] <- NA
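
To illustrate the difference, here is a minimal sketch using a small hypothetical data frame that already has a column named a. The names vars and new_vars are assumptions introduced for this example (vars stands in for list, to avoid masking base::list):

```r
# Hypothetical data: column `a` already exists and holds real values
df <- data.frame(X1 = 1:3, a = c(10, 20, 30))
vars <- c("a", "b", "c")

# Keep only the names that are not yet columns of df
new_vars <- vars[!vars %in% names(df)]

# Add the missing columns filled with NA; the existing `a` is left untouched
df[new_vars] <- NA
```

With plain df[vars] <- NA, the existing values in a would be overwritten with NA as well.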

edited Dec 01 '18 at 15:23

answered Dec 01 '18 at 15:17


Thank you. I've updated the question to include your solution and to articulate my second problem more clearly: each of the new variables must take the value 1 or 0 depending on whether the variable name appears in the row. The new example should clarify this. – macworthy Dec 02 '18 at 03:26
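
The thread does not include an answer to this second problem. One possible approach, offered only as a sketch, is to compare the original columns against each name and flag rows with at least one match. The data below reproduces the two-row example from the question; vars and search_cols are names assumed for this illustration:

```r
# Hypothetical data matching the two-row example in the question
df <- data.frame(X1 = c("a", "a"), X2 = c("b", "c"), stringsAsFactors = FALSE)
vars <- c("a", "b", "c")  # called `vars` here to avoid masking base::list

# Fix the columns to search before any flag columns are added
search_cols <- names(df)

# For each name, set 1 where any searched column equals it, otherwise 0
df[vars] <- lapply(vars, function(v) {
  as.integer(rowSums(df[search_cols] == v, na.rm = TRUE) > 0)
})
```

This tests for exact equality with the name. If the cells can contain longer strings and a partial match is wanted, the comparison would need to be replaced with something like grepl.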
