Create a data.frame with m columns and 2 rows

Question

I would like to create a data.frame in R with m columns (m is a variable, for example 30) and 2 rows, with every value initially set to 0. It seems as though data.frame populates values by row rather than by column. Any suggestions on how I can do this? Thanks :)

score 69 · Answer 1 · answered May 23 '11 at 22:34

Does m really need to be a data.frame() or will a matrix() suffice?

m <- matrix(0, ncol = 30, nrow = 2)

You can wrap a data.frame() around that if you need to:

m <- data.frame(m)

or all in one line: m <- data.frame(matrix(0, ncol = 30, nrow = 2))
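Since the question treats the number of columns as a variable, here is a minimal sketch of the same approach with `m` held in a variable rather than hard-coded. The name `df` and the value 30 are only illustrative assumptions.

```r
m <- 30  # number of columns; assumed value taken from the question's example

# Build a 2 x m matrix of zeros, then wrap it in a data.frame
df <- data.frame(matrix(0, nrow = 2, ncol = m))

dim(df)        # rows first, then columns
all(df == 0)   # checks that every cell is zero
head(names(df), 3)  # data.frame() names the columns X1, X2, X3, ...
```

The zero is recycled to fill the whole matrix, so only `nrow` and `ncol` need to change to get a different shape.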

Chase


And that was exactly what _I_ was going to write. +1 ;) My only addition would be to explicitly point out that `data.frame()` typically specifies data by column via its tag=value arguments. – joran May 23 '11 at 22:39
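To illustrate the comment's point that `data.frame()` takes its data column by column, here is a small sketch. The column names (`a`, `b`, `col1`, ...) are made up for the example and are not from the original posts.

```r
# Each tag = value argument becomes one column of length 2
small <- data.frame(a = c(0, 0), b = c(0, 0))

# The same idea for m columns: build a named list of zero vectors,
# one list element per column, then convert the list to a data.frame
m <- 30
cols <- rep(list(numeric(2)), m)           # numeric(2) is c(0, 0)
names(cols) <- paste0("col", seq_len(m))   # hypothetical column names
by_column <- as.data.frame(cols)

dim(small)
dim(by_column)
```

This is why the matrix route is convenient: it avoids spelling out one argument per column when m is large.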

Greg · Answer 2 · 2011-05-24T01:11:05.890

40

For completeness:

Along the lines of Chase's answer, I usually use as.data.frame to coerce the matrix to a data.frame:

m <- as.data.frame(matrix(0, ncol = 30, nrow = 2))
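One visible difference between the two wrappers is the default column names: `as.data.frame()` on an unnamed matrix yields V1, V2, ..., whereas `data.frame()` yields X1, X2, .... A short sketch, in which the replacement names (`score_1`, ...) are purely hypothetical:

```r
m <- 30
df <- as.data.frame(matrix(0, nrow = 2, ncol = m))

default_names <- names(df)   # V1, V2, ..., V30
head(default_names, 3)

# Rename the columns afterwards if the defaults are not useful
names(df) <- paste0("score_", seq_len(m))
head(names(df), 3)
```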

EDIT: speed test data.frame vs. as.data.frame

system.time(replicate(10000, data.frame(matrix(0, ncol = 30, nrow = 2))))
   user  system elapsed 
  8.005   0.108   8.165 

system.time(replicate(10000, as.data.frame(matrix(0, ncol = 30, nrow = 2))))
   user  system elapsed 
  3.759   0.048   3.802

Yes, as.data.frame() appears to be faster: roughly twice as fast in this run (about 3.8 s versus 8.2 s elapsed for 10,000 replications).

edited May 24 '11 at 01:11

answered May 24 '11 at 00:27

Greg


What's different about your answer? Is it faster? – Eduardo Leoni May 24 '11 at 01:01
@Eduardo it appears to be a little faster - see above. – Greg May 24 '11 at 01:13

score 0 · Answer 3 · answered Jan 06 '22 at 05:52

If you want to create a data.frame with a certain number of rows, and you already know the column names, you can define the make_df() function like so:

make_df <- function(nrow) {
  if (missing(nrow)) {
    nrow <- 0
  }
  
  temp_df <- data.frame(
    word = character(),
    meaning_grouping = integer(),
    part_of_speech = character(),
    phonetic = character(),
    audio = character(),
    origin = character(),
    definition = character(),
    examples = character(),
    synonyms = character(),
    antonyms = character(),
    stringsAsFactors = FALSE
  )
  
  if (nrow > 0) {
    temp_df[1:nrow,] <- NA
  }
  
  temp_df
}

It lets you set both the column names and the column classes (e.g. character, integer, numeric).

To create a data.frame with 5 rows, for example:

make_df(nrow=5)
  word meaning_grouping part_of_speech phonetic audio origin definition examples synonyms antonyms
1 <NA>               NA           <NA>     <NA>  <NA>   <NA>       <NA>     <NA>     <NA>     <NA>
2 <NA>               NA           <NA>     <NA>  <NA>   <NA>       <NA>     <NA>     <NA>     <NA>
3 <NA>               NA           <NA>     <NA>  <NA>   <NA>       <NA>     <NA>     <NA>     <NA>
4 <NA>               NA           <NA>     <NA>  <NA>   <NA>       <NA>     <NA>     <NA>     <NA>
5 <NA>               NA           <NA>     <NA>  <NA>   <NA>       <NA>     <NA>     <NA>     <NA>

Or to create a data.frame with 0 rows:

make_df()
 [1] word             meaning_grouping part_of_speech   phonetic         audio            origin           definition       examples        
 [9] synonyms         antonyms        
<0 rows> (or 0-length row.names)
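As a variation on the same idea, the typed columns can also be created at their final length in one step by repeating a typed NA, which makes each column's class explicit. This is only a sketch: the function name `make_typed_df` is hypothetical, and it uses just three of the columns above to stay short.

```r
# Hypothetical helper: builds nrow rows of typed NA values directly
make_typed_df <- function(nrow = 0) {
  data.frame(
    word = rep(NA_character_, nrow),            # character column
    meaning_grouping = rep(NA_integer_, nrow),  # integer column
    definition = rep(NA_character_, nrow),      # character column
    stringsAsFactors = FALSE
  )
}

df5 <- make_typed_df(5)
sapply(df5, class)   # class of each column

df0 <- make_typed_df()   # default of 0 gives an empty data.frame
nrow(df0)
```

Using a default argument (`nrow = 0`) instead of `missing()` keeps the zero-row case in the function signature.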
