I am working with R. I found this post on creating empty data frames in R: "Create an empty data.frame".

I tried to do something similar:

df <- data.frame(Date=as.Date(character()),
                 country=factor(), 
                 total=numeric(), 
                 stringsAsFactors=FALSE) 

Yet, when I try to populate it:

df$total = 7

I get the following error:

Error in `$<-.data.frame`(`*tmp*`, total, value = 7) : 
  replacement has 1 row, data has 0

I also tried assigning several values to a single row:

df[1, "total"] <- rnorm(100,100,100)

Error in `[<-.data.frame`(`*tmp*`, 1, "total", value = c(-79.4584309347689,  : 
  replacement has 100 rows, data has 1

Does anyone know how to fix this error?

Thanks

stats_noob

1 Answer

One option is to specify the row index:

df[1, "total"] <- 7

Output:

str(df)
#'data.frame':  1 obs. of  3 variables:
# $ Date   : Date, format: NA
# $ country: Factor w/ 0 levels: NA
# $ total  : num 7

The issue is that when we select a single column of a 0-row dataset and assign to it, the rows are not expanded automatically, so a length-1 value does not fit. By specifying the row index, the row is created and the other columns are automatically filled with NA.
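To make the difference concrete, here is a minimal sketch that reuses the data frame from the question. The `tryCatch` wrapper and the `msg` variable are only there for illustration, so that the script keeps running after the failed assignment.

```r
# Empty, typed data frame as in the question
df <- data.frame(Date = as.Date(character()),
                 country = factor(),
                 total = numeric(),
                 stringsAsFactors = FALSE)

# Whole-column assignment on 0 rows fails; capture the message instead of stopping
msg <- tryCatch({
  df$total <- 7
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Row-indexed assignment creates row 1 and pads the other columns with NA
df[1, "total"] <- 7
print(nrow(df))
print(is.na(df$Date))
print(is.na(df$country))
```

After the second assignment the data frame has one row, with `total` set and the `Date` and `country` cells left as NA.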

Regarding the second (updated) question: a standard data.frame column is a vector, and the length of the value being assigned should match the number of rows we index. If we want to expand to 100 rows, change the index accordingly:

df[1:100, "total"] <- rnorm(100, 100, 100) # length is 100 here
dim(df)
#[1] 100   3
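One way to avoid hard-coding the 100 is to derive the row index from the values themselves. The sketch below assumes the same empty data frame as in the question. The names `values`, `rows` and `mismatch` are made up for this example.

```r
df <- data.frame(Date = as.Date(character()),
                 country = factor(),
                 total = numeric(),
                 stringsAsFactors = FALSE)

values <- rnorm(100, 100, 100)   # 100 draws with mean 100 and sd 100

# Putting 100 values into one row is rejected, as in the question
mismatch <- tryCatch({
  df[1, "total"] <- values
  "no error"
}, error = function(e) "error")
print(mismatch)

# Index as many rows as there are values
rows <- seq_along(values)        # 1, 2, ..., 100
df[rows, "total"] <- values
print(dim(df))
```

Because `rows` is built from `values`, the two always have the same length, whatever the sample size.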

Or if we need to cram everything in a single row, then wrap the rnorm in a list

df[1, "total"] <- list(rnorm(100, 100, 100))
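Note that total was created as a numeric column, and an atomic column holds one value per row. If the goal is to keep all 100 values in a single cell, a list column is one option. The following is only a sketch under that assumption: it uses a fresh one-row data frame (`one_row` and its `id` column are hypothetical) rather than the typed empty df from the question.

```r
# Hypothetical one-row data frame; 'id' is only there so that a row exists
one_row <- data.frame(id = 1)

# A list column can hold a whole vector in a single cell
one_row$total <- list(rnorm(100, 100, 100))

print(nrow(one_row))               # number of rows
print(length(one_row$total[[1]]))  # number of values stored in the cell
```

The stored vector is retrieved with `one_row$total[[1]]`.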

In short, the left-hand side (the indexed rows) should be of the same length as the right-hand side (the values). Another case is when we are assigning from a different dataset:

df[seq_along(aa$bb), "total"] <- aa$bb 

This can also be done without initialization, i.e.

df <- data.frame(total = aa$bb)
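The two routes can be compared side by side. Since the contents of aa are not shown in the thread, the sketch below uses a small made-up stand-in for it; the names `df1` and `df2` are likewise only for illustration.

```r
# Hypothetical stand-in for the existing data frame 'aa' with a column 'bb'
aa <- data.frame(bb = c(12.5, 7, 30.25))

# Route 1: initialize an empty typed data frame, then fill 'total' by row index
df1 <- data.frame(Date = as.Date(character()),
                  country = factor(),
                  total = numeric(),
                  stringsAsFactors = FALSE)
df1[seq_along(aa$bb), "total"] <- aa$bb

# Route 2: build the data frame directly from the column
df2 <- data.frame(total = aa$bb)

print(dim(df1))
print(dim(df2))
print(all.equal(df1$total, df2$total))
```

Both routes give the same total column. The first keeps the Date and country columns (filled with NA), while the second contains only total.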
akrun
  • But why is this necessary? I tried this as well: df[1, "total"] <- rnorm(100,100,100) Error in `[<-.data.frame`(`*tmp*`, 1, "total", value = c(-79.4584309347689, : replacement has 100 rows, data has 1 ... do you know why this is happening? Thank you – stats_noob Jan 10 '21 at 20:42
  • @Noob You are adding 100 elements on a single row. It can be a list though, i.e. `list(rnorm(100, 100, 100))`. Please note that a standard data.frame column is a `vector` and it assumes the vector to be the same length as the row index. If you need to add 100 rows, use `df[1:100, "total"] <- rnorm(100, 100, 100)` – akrun Jan 10 '21 at 20:43
  • Thank you for all your help. Last question: suppose I already have another data frame that exists (called "aa" with a column "bb"). Will this work? df[1, "total"] <- aa$bb – stats_noob Jan 10 '21 at 20:46
  • @Noob Depending on the length of 'bb', you may need `df[seq_along(aa$bb), "total"] <- aa$bb` – akrun Jan 10 '21 at 20:47