
I have the following data frame and vector:

ddf = data.frame(a=rep(1,10), b=rep(2,10))
xx = c("c", "d", "e", "f")

How can I add new empty columns to the data frame, named after the items in xx?

I tried the following, but it does not work:

ddf = cbind(ddf, data.frame(xx))
Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 10, 4
 

The following also does not work:

for(i in 1:length(xx)){
    ddf$(xx[i]) = ""  
}

Error: unexpected '(' in:
"for(i in 1:length(xx)){
ddf$("
 }
Error: unexpected '}' in "}"
rnso

2 Answers


This will get you there:

ddf[xx] <- NA

#   a b  c  d  e  f
#1  1 2 NA NA NA NA
#2  1 2 NA NA NA NA
#3  1 2 NA NA NA NA
#...

You can't use something like ddf$xx directly, because $ takes the name literally: it assigns to a column called xx rather than evaluating the variable xx. Use the [ and [<- functions (the square brackets) whenever the column names come from a character string or vector, like ddf["columnname"] or ddf[c("col1","col2")], or from a stored vector, like your ddf[xx].
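To make the difference concrete, here is a small sketch using the question's ddf and xx. The names literal and looped are only illustrative copies, not part of the original code. It also shows one way the loop from the question could be repaired with [[<-, which evaluates its argument just as [<- does:

```r
ddf <- data.frame(a = rep(1, 10), b = rep(2, 10))
xx <- c("c", "d", "e", "f")

# `$` does not evaluate xx, so this adds one column literally named "xx"
literal <- ddf
literal$xx <- NA
names(literal)
# [1] "a"  "b"  "xx"

# `[[<-` evaluates its argument, so each element of xx becomes a column name
looped <- ddf
for (nm in xx) {
  looped[[nm]] <- NA
}
names(looped)
# [1] "a" "b" "c" "d" "e" "f"
```

The vectorised ddf[xx] <- NA is still the shorter option; the loop is only shown because it mirrors the original attempt.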

It selects columns because data frames are essentially lists:

is.list(ddf)
#[1] TRUE

as.list(ddf)
#$a
# [1] 1 1 1 1 1 1 1 1 1 1
# 
#$b
# [1] 2 2 2 2 2 2 2 2 2 2

...with each column corresponding to a list entry. So unless you use a comma to specify a row, like ddf["name",], or a column, like ddf[,"name"], you get the column by default.
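As a rough illustration of the three indexing forms, again using the question's ddf (the variable names on the left are made up for this sketch):

```r
ddf <- data.frame(a = rep(1, 10), b = rep(2, 10))

# No comma: list-style indexing, returns a data frame with the chosen column
one_col <- ddf["a"]
class(one_col)
# [1] "data.frame"

# Comma, column position: matrix-style indexing, a single column drops to a vector
col_vec <- ddf[, "a"]
is.numeric(col_vec)
# [1] TRUE

# Comma, row position: returns the selected row as a one-row data frame
first_row <- ddf[1, ]
dim(first_row)
# [1] 1 2

# `[[` extracts the list entry itself, which matches the dropped column
identical(ddf[["a"]], col_vec)
# [1] TRUE
```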


If you are working with a 0-row dataset, you cannot use a value like NA as the replacement. Instead, replace with list(character(0)), where character(0) can be swapped for numeric(0), integer(0), logical(0), etc., depending on the class you want for your new columns.

ddf <- data.frame(a=character())
xx <- c("c", "d", "e", "f")
ddf[xx] <- list(character(0))
ddf
#[1] a c d e f
#<0 rows> (or 0-length row.names)
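The same pattern should carry over to other column classes. The sketch below assumes a 0-row data frame called empty and made-up column names (n1, n2, flag), and simply repeats the assignment once per class:

```r
empty <- data.frame(a = character())

# Two numeric columns, then one logical column, all with zero rows
empty[c("n1", "n2")] <- list(numeric(0))
empty["flag"] <- list(logical(0))

names(empty)
nrow(empty)
class(empty$n1)
class(empty$flag)
```

Choosing the class up front this way avoids relying on later coercion when rows are eventually added.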
thelatemail
  • It doesn't work if the dataframe has zero observations, eg ddf=data.frame(a=character()), ddf[xx] <- NA, Error in value[[jvseq[[jjj]]]] : subscript out of bounds. – Luke Jun 25 '20 at 15:02
  • @Luke - interesting, though it doesn't seem right to try to add a length 1 value to a 0 row dataset. This should work: `ddf[xx] <- list(character())` – thelatemail Jun 25 '20 at 18:31
  • Interesting to notice that the assignment with list() also works if the dataframe does have observations. So I'd say that is the general way; in some cases you can use the assignment with NA. – Luke Jun 26 '20 at 08:21
  • In general, if you want to fill the new columns with NA, call list(character()); if you want to fill them with empty strings, call list(character(nrow(df))); if you want to initialize them with some value, call list(rep('hello',nrow(df))). – Luke Jun 26 '20 at 08:30
  • @Luke - assigning with a `list(NA)` instead of just `NA` should always be ok, since `data.frame`s are `list`s anyway. You shouldn't have to repeat the initial value `nrow(ddf)` times as R should recycle and take care of that part. I'm genuinely surprised that `list(character(0))` fills with `NA`s in a similar way though. I guess you learn something new every day. – thelatemail Jun 26 '20 at 08:40

This seems to succeed:

> cbind(ddf, setNames(lapply(xx, function(x) NA), xx))
   a b  c  d  e  f
1  1 2 NA NA NA NA
2  1 2 NA NA NA NA
3  1 2 NA NA NA NA
4  1 2 NA NA NA NA
5  1 2 NA NA NA NA
6  1 2 NA NA NA NA
7  1 2 NA NA NA NA
8  1 2 NA NA NA NA
9  1 2 NA NA NA NA
10 1 2 NA NA NA NA
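Broken into steps, the one-liner above can be read roughly as follows. The intermediate names new_cols and result are only for this sketch:

```r
ddf <- data.frame(a = rep(1, 10), b = rep(2, 10))
xx <- c("c", "d", "e", "f")

# Build a list with one NA per new column, then name the entries after xx
new_cols <- setNames(lapply(xx, function(x) NA), xx)
names(new_cols)
# [1] "c" "d" "e" "f"

# cbind() on a data frame recycles each length-1 entry to all 10 rows
result <- cbind(ddf, new_cols)
dim(result)
# [1] 10  6
```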
IRTFM
  • Well, that is slick, now, ain't it? – IRTFM Aug 18 '14 at 06:13
  • @thelatemail: ddf[xx] works but how? I tried ddf$xx and ddf$(xx) but they do not work. Also why is xx in ddf[xx] taken as a column rather than rows? Please give as an answer since in comments it will be crowded. I will accept it. – rnso Aug 18 '14 at 06:16