I'm creating a new data.frame by doing the opposite of "flattening" an input data.frame (in other words going from "wide" to "narrow", creating more rows).

I'll be looping over the columns of the input data.frame and appending to the output data.frame. I know it's more efficient to create the full output data.frame up front and fill it within the loop, but my question is: why is it possible to create a 0 x 4 data.frame but apparently not possible to name those 4 columns? Thanks.

 dff <- data.frame()
 dim( dff ) <- c(0,4)
 colnames(dff) <- c("first","second","third","fourth")

 Error in `colnames<-`(`*tmp*`, value = c("first", "second", "third", "fourth" : 
   'names' attribute [4] must be the same length as the vector [0]
user2105469
  • I don't think there is a `dim<-` method for `data.frame`s, only `dim`. Your `dim( dff ) <- c(0,4)` didn't do anything (did you even check `dff` after that line?), hence `colnames<-` didn't work either – David Arenburg Aug 10 '14 at 20:12
  • Use `names` instead of `colnames`, or better use `setNames`. – Thomas Aug 10 '14 at 20:12
  • @Thomas, none of that will work in his case unless he defines `dff` differently – David Arenburg Aug 10 '14 at 20:14
  • That's correct. The call to `dim` did nothing. `names` and `dimnames`, with and without indexing, generated errors. http://stackoverflow.com/questions/10689055/create-an-empty-data-frame seems to be the right way of initializing the data.frame. – user2105469 Aug 10 '14 at 20:19

1 Answer

Here are four possibilities (I'm sure there are also others):

> data.frame(first=numeric(), second=numeric(), third=numeric(), fourth=numeric())
[1] first  second third  fourth
<0 rows> (or 0-length row.names)

> data.frame(first=1,second=1,third=1,fourth=1)[0,]
[1] first  second third  fourth
<0 rows> (or 0-length row.names)

> as.data.frame(matrix(nrow=0,ncol=4,dimnames=list(c(),c("first","second","third","fourth"))))
[1] first  second third  fourth
<0 rows> (or 0-length row.names)

> setNames(as.data.frame(matrix(nrow=0,ncol=4)), c("first","second","third","fourth"))
[1] first  second third  fourth
<0 rows> (or 0-length row.names)

Note that for the first solution, you can specify whatever column classes you want (e.g., replacing numeric() with character(), etc.).
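
As a minimal sketch of that point, the first approach can mix column classes. The column names and types below are hypothetical, chosen only for illustration:

```r
# Hypothetical column names and types, for illustration only
dff <- data.frame(id    = integer(),
                  label = character(),
                  value = numeric(),
                  flag  = logical(),
                  stringsAsFactors = FALSE)  # keep label as character in R < 4.0

dim(dff)            # 0 rows, 4 columns
sapply(dff, class)  # class of each (empty) column
```

Passing stringsAsFactors = FALSE only matters on older R versions, where character columns would otherwise become factors.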

Also, you can't specify the dim attribute of a data.frame because data.frames do not have a dim attribute. Rather, a data.frame is a list of columns with names, row.names, and class attributes, and dim() is computed from the row names and the number of columns. The str function can be helpful for understanding what these objects are.
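
A quick way to check this yourself is to inspect the attributes directly. This is only a sketch using the column names from the question:

```r
dff <- data.frame(first = numeric(), second = numeric(),
                  third = numeric(), fourth = numeric())

is.list(dff)                          # a data.frame is a list of columns
names(attributes(dff))                # names, class, row.names; no "dim" entry
"dim" %in% names(attributes(dff))     # FALSE
dim(dff)                              # derived from row names and column count
str(dff)                              # compact view of the structure
```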

Thomas
  • It's also important to note that creating an empty data.frame and adding rows after the fact is just a bad idea in general. Looping and appending is a bad strategy when it sounds like the problem is really about reshaping. Better to use the `reshape()` function or the `reshape2` package. – MrFlick Aug 10 '14 at 21:15
  • @MrFlick Yes, I leave this post without comment about the logic of doing any of these things. – Thomas Aug 10 '14 at 22:04
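
For reference, here is a rough sketch of the wide-to-long reshaping suggested in the comment above, using base R's `reshape()`. The input data and the `variable`/`value` column names are made up for illustration and are not from the question:

```r
# Hypothetical wide input: one row per id, one column per measurement
wide <- data.frame(id = 1:2, a = c(10, 20), b = c(30, 40))

# Stack columns a and b into a single "value" column,
# recording the source column name in "variable"
long <- reshape(wide,
                direction = "long",
                varying   = c("a", "b"),
                v.names   = "value",
                timevar   = "variable",
                times     = c("a", "b"),
                idvar     = "id")

long  # one row per (id, variable) pair
```

This builds the full output in one call, so no empty data.frame or row-by-row appending is needed.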